Google developers have announced the launch of a new project, Kubeflow. The project simplifies work with machine learning by providing the tools needed to scale and tune ML systems in a Kubernetes environment. This article covers:
- about Kubeflow components;
- how to get started with the solution;
- about the prospects of the project.
/ photo by Michael Hicks CC
In 2017, two things happened. First, Kubernetes established itself as the standard for managing clusters of containers. This is confirmed by the 2017 Portworx survey of 490 IT professionals from various industries: Kubernetes is used as a container orchestration tool more often than Docker Swarm, Amazon ECS, or Azure Container Service. Second, machine learning, according to Gartner, was at the peak of its popularity.
These two factors prompted Google to create Kubeflow, an open-source project that simplifies running machine learning workloads on Kubernetes and inherits all the advantages of this orchestration tool: the ability to deploy on a variety of infrastructure (from a laptop to a production cluster), management of loosely coupled microservices, and scaling on demand.
Kubeflow components
The project code is stored in the GitHub repository. There you will find the following components:
- JupyterHub is a server for creating and managing an interactive Jupyter Notebook environment. Using JupyterHub, you can share notebook files that allow you to store code, images, comments, formulas and diagrams together.
- TensorFlow Custom Resource Definition (CRD), which can be configured to use CPUs or GPUs and adjusted to the size of the cluster.
- A container for TensorFlow Serving, a flexible system for deploying machine learning models in a production environment. The component integrates with TensorFlow models out of the box, but is also suitable for other models and data.
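To make the custom resource idea more concrete, here is a minimal Python sketch that assembles a TensorFlow training job manifest as a plain dictionary. It is not taken from the Kubeflow repository: the `apiVersion`, `kind`, and field names (`replicaSpecs`, `tfReplicaType`) are assumptions modeled on early TFJob examples, while the helper `build_tf_job`, the image name, and the `nvidia.com/gpu` resource key are purely illustrative. Check all of them against the CRD schema in your installed version.

```python
import json


def build_tf_job(name, image, workers=1, gpus_per_worker=0):
    """Build a TFJob-style manifest as a plain dict (hypothetical helper)."""
    container = {"name": "tensorflow", "image": image}
    if gpus_per_worker > 0:
        # Request GPUs through Kubernetes resource limits. Without this
        # entry the container makes no GPU request at all.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus_per_worker}}

    worker_spec = {
        "replicas": workers,        # adjust to the size of the cluster
        "tfReplicaType": "WORKER",  # assumed field name
        "template": {
            "spec": {"containers": [container], "restartPolicy": "OnFailure"}
        },
    }
    return {
        "apiVersion": "tensorflow.org/v1alpha1",  # assumed API group and version
        "kind": "TfJob",                          # assumed kind name
        "metadata": {"name": name},
        "spec": {"replicaSpecs": [worker_spec]},
    }


if __name__ == "__main__":
    # Placeholder image name; substitute your own training image.
    job = build_tf_job("example-job", "example.com/my-training-image:latest",
                       workers=2, gpus_per_worker=1)
    print(json.dumps(job, indent=2))
```

The same function covers both the CPU and GPU cases: leaving `gpus_per_worker` at zero omits the resource request, and `workers` is the single knob for scaling the job.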
Philip Winder, a software developer at Container Solutions, notes that Kubeflow is a hybrid of JupyterHub and TensorFlow. In it, TensorFlow serves as a general-purpose graph computation engine that lets programmers abstract away from the hardware and use the same code on CPUs and GPUs. That is why the same model can be deployed both on a laptop and in a cloud cluster.
Getting started with Kubeflow
For a quick start you will need:
- ksonnet version 0.8.0 or later;
- Kubernetes version 1.8 (our corporate blog has a guide to setting it up).
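Before going further, it can be handy to check these prerequisites in a script. The sketch below compares version strings numerically; the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes you have already obtained the installed versions yourself (for example from the output of `ks version` and `kubectl version`). Treating 1.8 as a minimum for Kubernetes is also an assumption.

```python
def parse_version(text):
    """Turn 'v1.8.4' or '0.8.0' into a tuple of ints such as (1, 8, 4)."""
    cleaned = text.strip().lstrip("v")
    # Drop build or pre-release suffixes such as '-gke.1' or '+abc'.
    for sep in ("-", "+"):
        cleaned = cleaned.split(sep, 1)[0]
    return tuple(int(part) for part in cleaned.split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.8' equals '1.8.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    # Example inputs only; replace them with the versions on your machine.
    print("ksonnet ok:", meets_minimum("0.9.1", "0.8.0"))
    print("kubernetes ok:", meets_minimum("v1.8.4", "1.8"))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "0.10.0" would sort before "0.8.0".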
To get started with Kubeflow, you need to run the following commands:
These commands set up JupyterHub and a custom resource for running TensorFlow training jobs. In addition, the ksonnet packages provide prototypes for configuring TensorFlow jobs and deploying TensorFlow models.
Detailed instructions for using Kubeflow can be found in the official manual. You can also read the instructions from the developers here, and try out Kubeflow in the browser right now here.
By the way, Michael Hausenblas, a developer from Red Hat and co-author of the book Kubernetes Cookbook, created a website to help those who work with machine learning on Kubernetes. There you can find an overview of the main tools and tutorials, including ones for Kubeflow.
What's next
The Kubeflow project has already been supported by many industry leaders: CaiCloud, Red Hat, Canonical, Weaveworks, Container Solutions and others.
Developers David Aronchick and Jeremy Lewi, who work on Kubeflow at Google, say that this is just the beginning. In the future, the team plans to attract more partners, popularize the idea, and improve the project. You can follow Kubeflow on the Slack channel, through the email newsletter, and on Twitter.