Google Cloud rolls out Cloud Dataproc on Kubernetes

11 Sep 2019

Google Cloud is trialling alpha availability of a new platform for data scientists and engineers through Kubernetes.

Cloud Dataproc on Kubernetes combines open source, machine learning and cloud to help modernise big data resource management.

The alpha will start with workloads on Apache Spark, with more environments to come.

According to Google Cloud product managers Christopher Crosbie and James Malone, Google Cloud Dataproc can provide open source data analytics processing for those who need to process data and train models faster and at scale.

“However, as enterprise infrastructure becomes increasingly hybrid in nature, machines can sit idle, single workload clusters continue to sprawl, and open source software and libraries continue to become outdated and incompatible with your stack,” they explain.

“It’s critical that Cloud Dataproc continues to empower data professionals to focus more on workloads than infrastructure by combining the best of cloud and open source.”

The platform will include key benefits such as faster workloads, unified resource management, job isolation, collaboration, and expertise sharing.

Unified resource management will allow data scientists to work with a central view that spans both Kubernetes and YARN cluster management systems.

“Kubernetes has flipped the big data and machine learning open source software (OSS) world on its head, since it gives data scientists and data engineers a way to unify resource management, isolate jobs, and build resilient infrastructures across any environment.”

More resilient infrastructure – a self-healing GKE environment can support the smooth operation of mission-critical ETL and machine learning jobs on Spark.

“Data scientists and data engineers don’t have to worry about sizing and building clusters, manipulating Docker files, or messing around with Kubernetes networking configurations. It just works. With leading support from the team that built Kubernetes, enterprises have access to the skills they need to close any Kubernetes skills gap on their team.”

Less time and fewer resources spent on infrastructure, more on workloads – new applications and models can be developed faster and at scale.

Isolate jobs to accelerate analytics life cycles – users can package up entire jobs in standalone containers to allow for testing, upgrading and patching without breaking the underlying cluster.

Collaboration and expertise sharing to close the Kubernetes skills gap – new capabilities, bugs and security issues can be discussed and resolved by the open source community.

“This is the first step in a larger journey to a container-first world. While Apache Spark is the first open source processing engine we will bring to Cloud Dataproc on Kubernetes, it won’t be the last,” comment Crosbie and Malone.

They add that Google Cloud’s data and analytics strategy has always involved open source as a core pillar.

“This alpha announcement of bringing enterprise-grade support, management, and security to Apache Spark jobs on Kubernetes is the first of many as we aim to simplify infrastructure complexities for data scientists and data engineers around the world.”
