On October 1–2, DevOpsConf Russia 2018 took place at Infospace in Moscow. For those not in the know: DevOpsConf is a professional conference on the integration of development, testing, and operations processes.
Our company also took part in this conference. We were one of its partners, represented the company at our booth, and also held a small meetup. By the way, this was our first participation in this kind of activity: our first conference, our first meetup, our first experience.
What was it about? The meetup was devoted to "Backups in Kubernetes".
Most likely, having heard this title, many will say: "Why back anything up in Kubernetes? It doesn't need backups, it's stateless."
Let's start with a little background: why it became necessary to raise this topic at all, and who needs it.
In 2016, we got acquainted with Kubernetes and began to actively use it in our projects. These are mostly projects with a microservice architecture, which in turn entails the use of a large amount of diverse software.
With the very first project where we used Kubernetes, we faced the question of how to back up the stateful services that, for one reason or another, end up in k8s.
We began to study and search for existing practices for solving this problem, and to ask our colleagues and friends how this process is organized and built in their teams.
After these conversations, we realized that everyone does it differently, with different tools and a lot of workarounds, and that no single approach was followed even within a single project.
Why is this so important? Since our company maintains projects built on k8s, we simply had to develop a structured methodology for solving this problem.
Imagine you are working with one specific project in Kubernetes. It contains a few stateful services whose data needs to be backed up. In principle, you can get by with a couple of workarounds and forget about it. But what if you already have two projects on k8s, and the second one uses completely different services? What if there are five projects? Ten? More than twenty?
Of course, maintaining ad-hoc workarounds at that scale is difficult and inconvenient. We need a unified approach that can be used across many Kubernetes projects, while letting the team of engineers easily, literally within minutes, make the necessary changes to how backups of these projects work.
In this article, we will describe exactly what tool and practice we use to solve this problem within our company.
For backups, we use our own open-source tool, nxs-backup. We won't go into the details of what it can do here; more information about it is available at the link.
Let's now turn to the actual implementation of backups in k8s: how and what exactly we did.
Let's look at the example of backing up our own Redmine. We will back up its MySQL database and user project files.
On ordinary servers and bare-metal clusters, almost all backup tools are usually run via plain cron. In k8s, we use a CronJob for this purpose: we create one CronJob per service that we back up. All these CronJobs live in the same namespace as the service itself.
Let's start with the MySQL database. To back up MySQL, as with almost any other service, we need four elements: a ConfigMap with the main nxs-backup configuration, a ConfigMap with the backup configuration for the specific service, a Secret with the credentials, and the CronJob itself.
Let's go through them in order.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nxs-backup-conf
data:
  nxs-backup.conf: |-
    main:
      server_name: Nixys k8s cluster
      admin_mail: admins@nixys.ru
      client_mail:
      - ''
      mail_from: backup@nixys.ru
      level_message: error
      block_io_read: ''
      block_io_write: ''
      blkio_weight: ''
      general_path_to_all_tmp_dir: /var/nxs-backup
      cpu_shares: ''
      log_file: /dev/stdout
    jobs: !include [conf.d/*.conf]
Here we set the basic parameters our tool needs to operate: the server name, e-mail addresses for notifications, limits on resource consumption, and so on.
Configurations can be written in j2 format, which allows the use of environment variables.
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-conf
data:
  service.conf.j2: |-
    - job: mysql
      type: mysql
      tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
      sources:
      - connect:
          db_host: {{ db_host }}
          db_port: {{ db_port }}
          socket: ''
          db_user: {{ db_user }}
          db_password: {{ db_password }}
        target:
        - redmine_db
        gzip: yes
        is_slave: no
        extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
      storages:
      - storage: local
        enable: yes
        backup_dir: /var/nxs-backup/databases/mysql/dump
        store:
          days: 6
          weeks: 4
          month: 6
This file describes the backup logic for the corresponding service, in our case MySQL.
Here you can specify, among other things: the job type, the temporary directory, the connection parameters, the target databases, compression, extra dump options, and the storages with their retention periods.
In our example, the storage type is set to local, that is, we collect and store backup copies locally, in a certain directory of the launched pod.
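The store section above controls retention: how many daily, weekly, and monthly copies to keep. The exact rotation rules are up to nxs-backup itself; purely as an illustration of such a days/weeks/month scheme, here is a hypothetical Python sketch (treating Sunday copies as weekly and first-of-month copies as monthly is our assumption, not necessarily what the tool does):

```python
from datetime import date

def is_retained(backup_date: date, today: date,
                days: int = 6, weeks: int = 4, months: int = 6) -> bool:
    """Illustrative days/weeks/month retention check (assumed semantics)."""
    age = (today - backup_date).days
    if age < days:
        # every copy from the last `days` days is kept
        return True
    if backup_date.weekday() == 6 and age < weeks * 7:
        # Sunday copies are treated as weekly copies here
        return True
    if backup_date.day == 1 and age < months * 31:
        # first-of-month copies are treated as monthly copies here
        return True
    return False

print(is_retained(date(2018, 9, 28), today=date(2018, 10, 1)))  # True: 3 days old
print(is_retained(date(2018, 8, 15), today=date(2018, 10, 1)))  # False: too old, not weekly/monthly
```

Anything outside these windows is rotated out, which is what keeps the backup directory from growing without bound.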
By direct analogy with this configuration file, you can write configuration files for Redis, PostgreSQL, or any other service, as long as our tool supports it. What exactly it supports can be found at the link given earlier.
apiVersion: v1
kind: Secret
metadata:
  name: app-config
data:
  db_name: ""
  db_host: ""
  db_user: ""
  db_password: ""
  secret_token: ""
  smtp_address: ""
  smtp_domain: ""
  smtp_ssl: ""
  smtp_enable_starttls_auto: ""
  smtp_port: ""
  smtp_auth_type: ""
  smtp_login: ""
  smtp_password: ""
In the Secret, we keep the credentials for connecting to MySQL itself and to the mail server. They can be kept in a separate Secret, or you can reuse an existing one if there is one. Nothing interesting here. Our Secret also holds the secret_token required for our Redmine to work.
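One caveat about this manifest: Kubernetes expects values under data: in a Secret to be base64-encoded (plain-text values can be put under stringData instead). A value can be encoded, for example, like this (the password here is just a placeholder):

```python
import base64

# Kubernetes Secret `data:` values must be base64-encoded strings
encoded = base64.b64encode(b"redmine_db_password").decode()
print(encoded)  # cmVkbWluZV9kYl9wYXNzd29yZA==
```

The same can be done from the shell with `echo -n '...' | base64`; the important part is encoding the exact bytes without a trailing newline.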
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mysql
spec:
  schedule: "00 00 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                    - nxs-node5
          containers:
          - name: mysql-backup
            image: nixyslab/nxs-backup:latest
            env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: db_host
            - name: DB_PORT
              value: '3306'
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: db_user
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: db_password
            - name: SMTP_MAILHUB_ADDR
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_address
            - name: SMTP_MAILHUB_PORT
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_port
            - name: SMTP_USE_TLS
              value: 'YES'
            - name: SMTP_AUTH_USER
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_login
            - name: SMTP_AUTH_PASS
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_password
            - name: SMTP_FROM_LINE_OVERRIDE
              value: 'NO'
            volumeMounts:
            - name: mysql-conf
              mountPath: /usr/share/nxs-backup/service.conf.j2
              subPath: service.conf.j2
            - name: nxs-backup-conf
              mountPath: /etc/nxs-backup/nxs-backup.conf
              subPath: nxs-backup.conf
            - name: backup-dir
              mountPath: /var/nxs-backup
            imagePullPolicy: Always
          volumes:
          - name: mysql-conf
            configMap:
              name: mysql-conf
              items:
              - key: service.conf.j2
                path: service.conf.j2
          - name: nxs-backup-conf
            configMap:
              name: nxs-backup-conf
              items:
              - key: nxs-backup.conf
                path: nxs-backup.conf
          - name: backup-dir
            hostPath:
              path: /var/backups/k8s
              type: Directory
          restartPolicy: OnFailure
Perhaps this is the most interesting element. First, to build a correct CronJob, you need to decide where the collected backups will be stored.
We have a dedicated server for this with the necessary amount of resources. In the example, a separate cluster node, nxs-node5, is reserved for collecting backups. The CronJob is restricted to the required nodes with the nodeAffinity directive.
When the CronJob starts, the corresponding directory of the host system, which is used precisely for storing backup copies, is mounted into the pod via hostPath.
Next, the ConfigMaps containing the nxs-backup configuration, namely the files nxs-backup.conf and service.conf.j2 (the MySQL job config) that we discussed above, are mounted into the CronJob.
Then all the necessary environment variables are set; they are either defined directly in the manifest or pulled from the Secret.
Finally, the variables are passed into the container, where docker-entrypoint.sh substitutes them into the mounted configs, in the right places and with the right values. For MySQL, these are db_host, db_user, and db_password. The port, in this case, is passed simply as a value in the CronJob manifest, since it does not carry any sensitive information.
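The substitution step itself is trivial. This is not the actual docker-entrypoint.sh code, just a hypothetical Python sketch of the same idea: replacing {{ var }} placeholders in the j2 config with values from the environment:

```python
import re

def render_j2(template: str, env: dict) -> str:
    """Replace {{ var }} placeholders with values from the given mapping
    (in an entrypoint script, the mapping would be os.environ)."""
    def repl(match):
        key = match.group(1)
        # leave the placeholder intact if the variable is not set
        return env.get(key, match.group(0))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", repl, template)

config = "db_host: {{ db_host }}\ndb_port: {{ db_port }}"
print(render_j2(config, {"db_host": "mysql.default.svc", "db_port": "3306"}))
```

The nice property of this scheme is that the ConfigMap itself never contains credentials; they live only in the Secret and are injected at container start.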
Well, with MySQL everything seems clear. Now let's see what is needed to back up the Redmine application files.
apiVersion: v1
kind: ConfigMap
metadata:
  name: desc-files-conf
data:
  service.conf.j2: |-
    - job: desc-files
      type: desc_files
      tmp_dir: /var/nxs-backup/files/desc/dump_tmp
      sources:
      - target:
        - /var/www/files
        gzip: yes
      storages:
      - storage: local
        enable: yes
        backup_dir: /var/nxs-backup/files/desc/dump
        store:
          days: 6
          weeks: 4
          month: 6
This is the configuration file describing the backup logic for files. There is nothing unusual here either: the same parameters are set as for MySQL, except for the authorization data, which simply is not needed. It can appear, however, when remote data-transfer protocols are involved: ssh, ftp, webdav, s3, and so on. We will consider that option a little later.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: desc-files
spec:
  schedule: "00 00 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                    - nxs-node5
          containers:
          - name: desc-files-backup
            image: nixyslab/nxs-backup:latest
            env:
            - name: SMTP_MAILHUB_ADDR
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_address
            - name: SMTP_MAILHUB_PORT
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_port
            - name: SMTP_USE_TLS
              value: 'YES'
            - name: SMTP_AUTH_USER
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_login
            - name: SMTP_AUTH_PASS
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_password
            - name: SMTP_FROM_LINE_OVERRIDE
              value: 'NO'
            volumeMounts:
            - name: desc-files-conf
              mountPath: /usr/share/nxs-backup/service.conf.j2
              subPath: service.conf.j2
            - name: nxs-backup-conf
              mountPath: /etc/nxs-backup/nxs-backup.conf
              subPath: nxs-backup.conf
            - name: target-dir
              mountPath: /var/www/files
            - name: backup-dir
              mountPath: /var/nxs-backup
            imagePullPolicy: Always
          volumes:
          - name: desc-files-conf
            configMap:
              name: desc-files-conf
              items:
              - key: service.conf.j2
                path: service.conf.j2
          - name: nxs-backup-conf
            configMap:
              name: nxs-backup-conf
              items:
              - key: nxs-backup.conf
                path: nxs-backup.conf
          - name: backup-dir
            hostPath:
              path: /var/backups/k8s
              type: Directory
          - name: target-dir
            persistentVolumeClaim:
              claimName: redmine-app-files
          restartPolicy: OnFailure
Again, nothing new compared to MySQL. But here one additional PV (target-dir) is mounted: /var/www/files, which is exactly what we will back up. Otherwise, we still store the copies locally on the node to which the CronJob is pinned.
For each service we want to back up, we create a separate CronJob with all the necessary companion elements: ConfigMaps and Secrets. By analogy with the examples above, we can back up any similar service in the cluster.
I think these two examples give a fair idea of how exactly we back up stateful services in Kubernetes. There is no point in analyzing the same kind of examples in detail for other services, because they are basically all similar to each other and differ only in minor ways.
This is exactly what we wanted to achieve: a unified approach to building the backup process, one that can be applied to a large number of different projects based on k8s.
In all the examples discussed above, we store copies in a local directory of the node on which the container runs. But nothing prevents you from attaching a Persistent Volume as external working storage and collecting the copies there. Or you can only synchronize them to a remote storage over the desired protocol, without saving anything locally. There are many variations: collect locally first and then synchronize, or collect and store only in remote storage, and so on. The configuration is quite flexible.
Below is an example of the MySQL backup configuration file where the copies are stored locally on the node running the CronJob and are also synchronized to s3.
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-conf
data:
  service.conf.j2: |-
    - job: mysql
      type: mysql
      tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
      sources:
      - connect:
          db_host: {{ db_host }}
          db_port: {{ db_port }}
          socket: ''
          db_user: {{ db_user }}
          db_password: {{ db_password }}
        target:
        - redmine_db
        gzip: yes
        is_slave: no
        extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
      storages:
      - storage: local
        enable: yes
        backup_dir: /var/nxs-backup/databases/mysql/dump
        store:
          days: 6
          weeks: 4
          month: 6
      - storage: s3
        enable: yes
        backup_dir: /nxs-backup/databases/mysql/dump
        bucket_name: {{ bucket_name }}
        access_key_id: {{ access_key_id }}
        secret_access_key: {{ secret_access_key }}
        s3fs_opts: {{ s3fs_opts }}
        store:
          days: 2
          weeks: 1
          month: 6
That is, if storing copies locally is not enough, you can synchronize them to any remote storage over the appropriate protocol, and the number of storages can be arbitrary.
But in this case you still need to make a few additional changes, namely: pass the s3 parameters (bucket_name, access_key_id, secret_access_key, s3fs_opts) into the container as environment variables, for example from a Secret, and teach docker-entrypoint.sh to substitute them into the configuration as well.
So far this process is far from perfect, but we are working on it. In the near future, we will add to nxs-backup the ability to define configuration parameters via environment variables directly, which will greatly simplify work with the entrypoint file and reduce the time needed to add backup support for new services.
That's probably all.
The approach we have just discussed allows, first of all, backing up the stateful services of k8s projects in a structured, pattern-based way. It is a ready-made solution and, most importantly, a practice that you can apply in your own projects without wasting time and energy searching for and adapting existing open-source solutions.
Source: https://habr.com/ru/post/426543/