Kubeflow: a new project for working with machine learning in Kubernetes

Google developers have announced a new project, Kubeflow. It simplifies machine learning work by providing the tools needed to scale and tune ML systems on Kubernetes. This article covers:

the Kubeflow components;
how to get started with the solution;
the prospects of the project.

/ photo by Michael Hicks CC

In 2017, two things happened. First, Kubernetes established itself as the standard for container orchestration. A 2017 Portworx survey of 490 IT professionals from various industries confirms this: Kubernetes is used as a container orchestration tool more often than Docker Swarm, Amazon ECS, or Azure Container Service. Second, machine learning, according to Gartner, was at the peak of its popularity.
These two factors prompted Google to create Kubeflow, an open-source project that simplifies machine learning on Kubernetes and inherits the advantages of this orchestration tool: deployment on a variety of infrastructure (from a laptop to a production cluster), management of loosely coupled microservices, and scaling on demand.

Kubeflow components

The project code is stored in the GitHub repository. It contains the following components:

JupyterHub, a server for creating and managing interactive Jupyter notebooks. Notebooks keep code, images, comments, formulas and diagrams together, and JupyterHub lets you share them.
TensorFlow Custom Resource (CRD), which can be configured to use CPUs or GPUs and adjusted to the size of the cluster.
A TensorFlow Serving container. TensorFlow Serving is a flexible system for serving machine learning models in a production environment. It integrates with TensorFlow models out of the box, but is also suitable for other models and data.

Philip Winder, a software developer at Container Solutions, notes that Kubeflow is a hybrid of JupyterHub and TensorFlow. Here TensorFlow serves as a general-purpose graph computation engine that lets programmers abstract away from the hardware and use the same code on CPUs and GPUs. That is why the same model can be deployed both on a laptop and in a cloud cluster.

Getting started with Kubeflow

For a quick start you will need:

ksonnet version 0.8.0 or later;
Kubernetes version 1.8 (our corporate blog has a guide to setting it up).
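
As a rough sketch of checking these prerequisites, the shell helper below compares dotted version strings. The function names version_ge and check_min are hypothetical and not part of ksonnet or kubectl, the comparison relies on GNU sort -V, and treating 1.8 as a minimum for Kubernetes is an assumption. The version strings are passed in by hand, because the output format of the tools' own version commands differs between releases.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME FOUND REQUIRED: prints a one-line verdict for one tool
# and returns non-zero when the found version is below the required one.
check_min() {
  if version_ge "$2" "$3"; then
    echo "$1 $2: ok (>= $3)"
  else
    echo "$1 $2: too old (need >= $3)"
    return 1
  fi
}
```

For example, check_min kubernetes 1.10 1.8 reports the version as acceptable. A plain string comparison would get this wrong, since "1.10" sorts before "1.8" alphabetically.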

To get started with Kubeflow, run the following commands:

    # Initialize a ksonnet app
    APP_NAME=my-kubeflow
    ks init ${APP_NAME}
    cd ${APP_NAME}

    # Install the Kubeflow components
    ks registry add kubeflow github.com/google/kubeflow/tree/master/kubeflow
    ks pkg install kubeflow/core
    ks pkg install kubeflow/tf-serving
    ks pkg install kubeflow/tf-job

    # Deploy Kubeflow into a namespace of your choice ("default" is assumed here)
    NAMESPACE=default
    ks generate core kubeflow-core --name=kubeflow-core --namespace=${NAMESPACE}
    ks apply default -c kubeflow-core

These commands deploy JupyterHub and the custom resource for running TensorFlow training jobs. In addition, the ksonnet packages provide prototypes for configuring TensorFlow jobs and deploying TensorFlow models.

Detailed instructions for using Kubeflow can be found in the official user guide. The developers have also published their own instructions, and Kubeflow can be tried out directly in the browser.

By the way, Michael Hausenblas, a developer at Red Hat and co-author of the book Kubernetes Cookbook, has created a website for those who work with machine learning on Kubernetes. It offers an overview of the main tools and tutorials, including ones for Kubeflow.

What's next

The Kubeflow project is already supported by many industry players: CaiCloud, Red Hat, Canonical, Weaveworks, Container Solutions and others.

Developers David Aronchick and Jeremy Lewi, who work on Kubeflow at Google, say that this is just the beginning. In the future, the team plans to attract more partners, popularize the idea and improve the project. You can follow Kubeflow on its Slack channel, through the email newsletter, and on Twitter.

Source: https://habr.com/ru/post/347042/
