We recently built a distributed cron job scheduling system on top of Kubernetes, an exciting new platform for container cluster management. Kubernetes is now a leading platform in this space and offers many interesting capabilities. One of its main advantages is that engineers do not need to know which machines their applications run on.
Distributed systems are genuinely complex, and managing their services is one of the biggest challenges operations teams face. Introducing new software into production and learning to operate it reliably is a task worth taking seriously. To understand why learning to operate Kubernetes matters (and why it is hard!), we suggest reading this fantastic postmortem of a one-hour outage caused by a bug in Kubernetes.
This article explains why we decided to build on Kubernetes. We will describe how we integrated Kubernetes into our existing infrastructure, our approach to building (and improving) confidence in the reliability of the Kubernetes cluster, and the abstractions we built on top of Kubernetes.
Kubernetes is a distributed system for scheduling programs to run in a cluster. You can tell Kubernetes to run five copies of a program, and it will dynamically schedule them across the worker nodes. Containers are scheduled automatically, which improves resource utilization and saves money. Powerful deployment primitives let you roll out new code gradually, and Security Contexts and Network Policies let you run different projects securely in the same cluster. A minimal sketch of such a request is shown below.
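For illustration, here is a minimal sketch of a Deployment manifest asking Kubernetes for five copies of a program; the name and image are hypothetical, and the apps/v1 API version assumes a reasonably recent cluster:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                 # hypothetical name
spec:
  replicas: 5                       # run five copies; Kubernetes decides where
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: example.com/example-app:1.0   # hypothetical image

Kubernetes continuously reconciles the cluster toward this desired state, rescheduling pods onto healthy nodes as needed.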
Kubernetes has many scheduling capabilities built in: it can schedule long-running HTTP services, daemonsets that run on every node of the cluster, cron jobs that run every hour, and more. If you want to learn more, Kelsey Hightower has given several excellent talks, such as Kubernetes for sysadmins and healthz. There is also a great Slack community.
Every infrastructure project (hopefully!) starts with a business need, and our goal was to improve the reliability and security of our existing distributed cron job system. Our requirements were:
If you are considering Kubernetes, keep one thing in mind: do not use Kubernetes just because other companies do. Building a reliable cluster takes an enormous amount of time, and the business case for using it is not always obvious. Invest your time wisely.
When it comes to operating services, the word "reliable" is not meaningful on its own. To talk about reliability, you first need to set an SLO (service level objective).
We had three goals:
Our basic approach to building our first Kubernetes cluster was to build it from scratch, without tools like kubeadm or kops, using Kubernetes The Hard Way as a reference. We provisioned the cluster with Puppet. Building from scratch was a good choice for two reasons: we could deeply integrate Kubernetes into our architecture, and we gained a deep understanding of how its internal components work.
Building from scratch allowed us to integrate Kubernetes into our existing infrastructure.
We wanted seamless integration with our existing systems for logging, certificate management, secrets, network security, monitoring, AWS instance management, deployment, database proxies, internal DNS, configuration management, and more. Integrating all of these systems sometimes required a little creativity, but overall it was easier than trying to force kubeadm/kops into doing what we wanted.
We already trust these existing systems and know how to manage them, so we wanted to keep using them in the new Kubernetes cluster. For example, reliable certificate management is a very hard problem, and we already had a way to issue and manage certificates. With proper integration, we were able to avoid creating a new certificate authority just for Kubernetes.
We came to understand exactly how each parameter affected our Kubernetes setup. For example, there are more than a dozen certificate and certificate-authority parameters used for authentication. Understanding how these parameters work made it much easier to debug our setup when we ran into authentication problems.
When we started working with Kubernetes, no one on the team had used it before (apart from a few toy projects). How do you get from "none of us have ever used Kubernetes" to "we are confident running Kubernetes in production"?
We asked several engineers at other companies about their experience with Kubernetes. They all used it in different ways and in different environments (to run HTTP services, on bare metal, on Google Kubernetes Engine, and so on).
When looking at a large, complex system like Kubernetes, it is important to think carefully about your own use cases, run your own experiments, build confidence in your own environment, and make your own decisions. For example, you should not read this article and conclude: "Well, Stripe uses Kubernetes successfully, so it will work for us too!"
Here is what we learned from people at companies running Kubernetes clusters:
Our plans depended heavily on one Kubernetes component, the cronjob controller. At the time, this component was in alpha, which made us uneasy. We had tested it in a test cluster, but how could we tell whether it would hold up in production?
Fortunately, all of the cronjob controller's core functionality is only 400 lines of Go. Reading the source code quickly showed that every 10 seconds the controller calls syncAll:

go wait.Until(jm.syncAll, 10*time.Second, stopCh)

The syncAll function fetches all cron jobs from the Kubernetes API, iterates through the list, determines which jobs should run next, and then starts those jobs.

Before creating the cluster, we did some load testing. We were not worried about how many nodes Kubernetes could manage (we planned on about 20), but we did want to make sure it could schedule as many cron jobs as we needed (about 50 per minute).
We ran a test on a three-node cluster where we created 1,000 cron jobs, each scheduled to run once a minute. Each of these jobs simply ran bash -c 'echo hello world'. We chose simple jobs because we wanted to test the cluster's scheduling and orchestration capacity, not its total compute capacity. A sketch of one such job appears below.
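Here is a hedged sketch of what each of these test jobs could look like as a Kubernetes CronJob manifest; the name and image are hypothetical, and the batch API group and version depend on the cluster version (CronJob was still an alpha resource at the time):

apiVersion: batch/v1beta1            # alpha/beta at the time; batch/v1 on modern clusters
kind: CronJob
metadata:
  name: load-test-job-0001           # hypothetical; we created 1,000 of these
spec:
  schedule: "* * * * *"              # run once a minute
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: hello
              image: ubuntu          # any image with bash would do
              command: ["bash", "-c", "echo hello world"]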
Our test cluster could not handle 1,000 cron jobs per minute. We observed that each node would start at most one pod per second, while the cluster could run 200 cron jobs per minute without problems. Since we only needed about 50 cron jobs per minute, we decided these limits were not blockers and that we could deal with them later if necessary. Onward!
One of the most important decisions when setting up Kubernetes is how to run etcd. etcd is the heart of your Kubernetes cluster: it stores all of the state of your cluster. Everything in Kubernetes other than etcd is stateless. If etcd is not running, you cannot make any changes to your cluster (though existing services will keep running!).
etcd really does play the role of the "heart" of the Kubernetes cluster: the API server is a stateless layer in front of etcd, and every other cluster component talks to etcd through the API server.
When operating etcd, there are two important points to keep in mind:
We set a goal of migrating our cron jobs to Kubernetes with zero interruptions. The secret to successful production migrations is not avoiding mistakes (that is impossible) but designing the migration so that mistakes have minimal consequences.
Fortunately, we had a wide variety of cron jobs to migrate to the new cluster, including some low-priority jobs that could tolerate a little downtime.
Before starting the migration, we built an easy-to-use tool that let us move jobs back and forth between the old and new systems in less than five minutes whenever needed. This simple tool greatly reduced the cost of errors: if we moved a job that turned out to have a dependency we had not planned for, no harm done! We could just move the job back, fix the problem, and try again later.
Here is the general migration strategy we used:
At the start of the project we established a rule: if Kubernetes does something strange or unexpected, we investigate, find the reason, and fix it.
Investigating each issue takes a lot of time, but it is essential. If we simply dismissed strange behavior, we would certainly run into problems in production.
Using this approach, we discovered (and were able to fix!) several bugs in Kubernetes.
Here is one example of an issue we found during this research: when we stopped nodes for longer than the pod eviction timeout (--pod-eviction-timeout) and created network partitions, surprising behavior happened. It is always better to detect these surprises in testing than at 3 a.m. in production.

At Stripe we have a practice of running "game day" exercises, and we still do them. The idea is to come up with situations that you expect to eventually occur in production (for example, the Kubernetes API server crashing) and then deliberately reproduce those situations in production (during the working day, with warning) to make sure you can handle them.
Running these exercises against the cluster regularly revealed gaps in our monitoring or configuration. We were glad to discover (and fix!) these problems early rather than stumble on them six months later.
Here are some of the game day exercises we ran:
Let's take a quick look at how we made the Kubernetes-based system easy to use.
Our original goal was to build a cron job system that our team could easily operate. Once we became confident in Kubernetes, we needed to make it easy for fellow engineers to configure and add new cron jobs. We developed a simple YAML configuration format so that users would not need to understand Kubernetes internals to use the system. Here is the format:
name: job-name-here
kubernetes:
  schedule: '15 */2 * * *'
  command:
    - ruby
    - "/path/to/script.rb"
  resources:
    requests:
      cpu: 0.1
      memory: 128M
    limits:
      memory: 1024M
We did not do anything fancy here: we wrote a simple program that takes this format and transforms it into a Kubernetes cronjob configuration, which we then apply with kubectl. A sketch of what the generated manifest could look like follows.
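For illustration, here is a hedged sketch of the kind of Kubernetes cronjob manifest such a translation program could generate from the config above; the restart policy and the image are our assumptions for the sketch, not necessarily what the actual tool emits, and the batch API version depends on the cluster version:

apiVersion: batch/v1beta1              # depends on cluster version
kind: CronJob
metadata:
  name: job-name-here
spec:
  schedule: '15 */2 * * *'
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never         # an assumption; OnFailure is also common
          containers:
            - name: job-name-here
              image: example.com/cron-image:latest   # hypothetical image
              command:
                - ruby
                - "/path/to/script.rb"
              resources:
                requests:
                  cpu: "0.1"
                  memory: 128M
                limits:
                  memory: 1024M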
We also wrote tests to ensure that job names are not too long (Kubernetes cron job names cannot exceed 52 characters) and that all names are unique. We currently do not use cgroups to enforce memory limits on most of our cron jobs, but we plan to roll that out in the future.
Our simple configuration format was easy to use, and since we automatically generated both Chronos and Kubernetes job definitions from the same source definition, moving a job between the two systems was very easy. This was a key part of making our gradual migration work smoothly.
Monitoring the internal state of the Kubernetes cluster turned out to be easier than expected. We use the kube-state-metrics package and a small Go program, veneur-prometheus, to scrape the Prometheus metrics that kube-state-metrics emits and publish them as statsd metrics to our monitoring system.
For example, here is a chart of the number of pending pods in our cluster over the last hour. Pending means that a pod is waiting to be assigned a worker node to run on. You can see a spike at 11 a.m., because many of our cron jobs run at the 0th minute of the hour.
We also have a monitor that checks that no pod stays stuck in the Pending state: we verify that every pod starts running on a node within 5 minutes, and otherwise we receive an alert. A sketch of such a check is shown below.
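Our actual monitor lives in our statsd-based monitoring system, but as a hedged illustration, a similar check could be written as a Prometheus alerting rule over the kube_pod_status_phase metric that kube-state-metrics exposes; the group name, alert name, and severity label here are hypothetical:

groups:
  - name: cron-cluster                 # hypothetical group name
    rules:
      - alert: PodStuckPending         # hypothetical alert name
        # kube_pod_status_phase is exported by kube-state-metrics and is 1
        # while a pod is in the given phase
        expr: kube_pod_status_phase{phase="Pending"} == 1
        for: 5m                        # mirrors the "running within 5 minutes" check
        labels:
          severity: warning            # hypothetical
        annotations:
          summary: "Pod {{ $labels.pod }} has been Pending for more than 5 minutes"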
Setting up the Kubernetes cluster and migrating all of our cron jobs to it took five months, with three engineers working full time.
We invested in learning Kubernetes because we expect to use it more widely at Stripe.
Here are some principles that apply when working with Kubernetes (or any other complex distributed system):
Original: Learning to operate Kubernetes reliably.
Source: https://habr.com/ru/post/347014/