
How we learned to upgrade Tenzor's 5,000 servers.

Nowadays, every decent organization that develops serious software shares the story of how its projects were created and developed. We consider this an excellent trend and are ready to tell our version of the story of one of the internal projects of the SBIS company. It seriously affects all of the company's other products, and it is affectionately called "Hottabych", for it does the magic!

Every 100 seconds it updates some application, in production or in a test environment. We have about 200 applications in production and more than 1,000 on test stands. Each application is deployed on anywhere from two virtual servers to several hundred. So, first things first...


Before the Big Bang


SBIS went online fairly recently, in 2010. Before that we were, like all decent applications, a desktop product. First we lived under DOS, then under the Windows console, then we became a full-fledged Windows application, just like people. Client data was stored either in the Pervasive DBMS or in a database of our own making with the mysterious name Muzzle. Back then, clients updated SBIS themselves. The hardest part of an update was the database conversion, and a special utility called Jinnee was used for it.
Jinnee was a good genie. It took the new description of the database structure from the distribution, automatically computed the difference between the old and the new structure, and applied it to the database. You could intervene in the conversion process through the converter mechanism; each converter ran only for the version it was written for. The ideas embodied in Jinnee back then proved so good that they have survived to this day, as has Jinnee itself. We could not find analogues in the software industry, and so we proudly consider it unique.
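
To make the idea concrete, here is a toy sketch in Python of how a difference between two structure descriptions can be turned into DDL. This is not Jinnee's code (Jinnee has never been published); the schema model and the statements are invented for illustration:

    # Toy illustration of Jinnee's core idea: diff two schema
    # descriptions and emit the DDL that turns the old into the new.
    def diff_schema(old, new):
        """old/new map table name -> {column name: SQL type}."""
        ddl = []
        for table, columns in new.items():
            if table not in old:
                cols = ", ".join(f"{c} {t}" for c, t in columns.items())
                ddl.append(f"CREATE TABLE {table} ({cols})")
                continue
            for col, sql_type in columns.items():
                if col not in old[table]:
                    ddl.append(f"ALTER TABLE {table} ADD COLUMN {col} {sql_type}")
        for table in old:
            if table not in new:
                ddl.append(f"DROP TABLE {table}")
        return ddl

    old = {"invoice": {"id": "integer", "total": "numeric"}}
    new = {"invoice": {"id": "integer", "total": "numeric", "paid": "boolean"}}
    print(diff_schema(old, new))
    # -> ['ALTER TABLE invoice ADD COLUMN paid boolean']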

The Big Bang is about to start


Then suddenly, in 2010, SBIS became a web service under IIS and moved its data storage to PostgreSQL. For customers this made life easier, of course: updates were no longer their headache. We, however, had not yet realized what we had gotten into and updated our web service the old-fashioned way, just as we had the desktop application: launching Jinnee by hand, having taught it to convert PostgreSQL. And somehow it didn't hurt: there was essentially one service, one database, and not many online clients. All our effort at that point went into improving PostgreSQL conversion with Jinnee.

But soon the web services began to multiply, and the number of clients quickly grew to 300 thousand. We now had to update a dozen services spread across fifty servers. Doing this by hand was unbearable, so updates happened at night on weekends. Back then clients could not work during an update; they got a parking page saying "Sorry, the application is temporarily unavailable, technical work is in progress, expected availability at 6:00."

Three problems had to be solved:

  1. eliminate manual work during the upgrade, thereby speeding it up and reducing errors due to the human factor;
  2. minimize customer downtime during the upgrade;
  3. let the teams sleep.

Here we need to say a little about our client data storage architecture, so that the rest makes sense.

We store client data in PostgreSQL databases, with about 1.5 thousand clients typically attached to one database. Inside a database, each client's data is stored in a separate schema.

This unusual storage method gives us the following perks:

  1. one client's data can be converted independently of another's;
  2. a client's data can easily be moved from one database to another;
  3. a client can never see other clients' data.

When a client executes a request, it first reaches the dispatcher. The dispatcher plays the role of balancer and router and is implemented on top of nginx. Having computed the client's route, it forwards the request to one of the business logic servers of the group that hosts the client's database. The business logic, when necessary, queries the right database, addressing the client's schema inside it. And that's all.
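
As a rough sketch of the business-logic side, here is what addressing a client's schema might look like with psycopg2. The client_registry lookup is invented; in reality that information comes from our cloud services:

    # Sketch: reach the client's database, then its personal schema.
    # client_registry is a hypothetical stand-in for the real lookup.
    import psycopg2

    client_registry = {
        42: {"dsn": "host=db-group-1 dbname=clients_007", "schema": "client_42"},
    }

    def run_for_client(client_id, query):
        route = client_registry[client_id]
        conn = psycopg2.connect(route["dsn"])
        try:
            with conn.cursor() as cur:
                # Each client lives in its own schema, so pointing
                # search_path at it isolates this client's data.
                cur.execute("SET search_path TO %s", (route["schema"],))
                cur.execute(query)
                return cur.fetchall()
        finally:
            conn.close()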



The Big Bang starts


By the end of autumn 2012, we had conceived a new service to implement updates "with a human face". We formed a team of three and, having said a prayer, set to work...

The architecture of the update system took shape immediately, from the following components:


The managing service was named Hottabych, the name itself an allusion to the Jinnee utility that could convert databases.

We already had a well-functioning service for meta-information and configuration of cloud services, which we call "Cloud Management". For the purposes of updates, its job is to report where the robots and web services are topologically located and what their settings are. It is Cloud Management that commands the dispatchers to show parking pages when services are unavailable. The dispatchers were originally conceived as simple HTTP request balancers, but their functions have since grown much wider.

Update agents (Hottabych agents) are services on the application hosts, designed to execute the commands that update those hosts.

Distribution repository: the name speaks for itself. Distributions already existed at that point; only the storage for them was missing.

The interaction went something like this (a rough code sketch follows the list):

  1. Hottabych receives a command to update an application to the desired version;
  2. it contacts the cloud management service for the service topology;
  3. it downloads the distribution kit of the required version to a shared disk;
  4. it divides all the service's servers in half: one half will be updated while the other keeps working;
  5. it sends update commands to the agents on the first half of the servers, indicating where the distribution kit is located;
  6. the agents download the distribution kit, unpack it, and replace the old version with the new one;
  7. the managing service then does the same for the rest of the servers;
  8. when all agents have reported that the work is done, the managing service waits for confirmation of the update, because the update can still be rolled back.
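
A rough, self-contained sketch of this flow. Every name here is a stand-in: Hottabych's real services and their APIs are internal, so the classes and signatures are invented:

    # Stand-in for a Hottabych agent (the real ones are C++ services).
    class Agent:
        def __init__(self, host):
            self.host = host

        def update(self, kit):
            # In reality: download the kit, unpack, swap the versions,
            # and report progress back to Hottabych all the while.
            print(f"{self.host}: updated from {kit}")
            return "done"

    def update_application(hosts, kit):
        agents = [Agent(h) for h in hosts]
        # Split the servers in half: one half keeps serving traffic
        # while the other half is being updated, then swap.
        mid = len(agents) // 2
        for group in (agents[:mid], agents[mid:]):
            reports = [a.update(kit) for a in group]
            assert all(r == "done" for r in reports)
        # All agents reported success; now wait for a human to
        # confirm, because the update can still be rolled back.
        return "awaiting confirmation"

    print(update_application(["bl-1", "bl-2", "bl-3", "bl-4"], "app-17.zip"))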

It remained to develop the service itself with its interface and the update agents, and to organize the distribution repository and the publishing of builds into it.

The firstborn...


In the first iteration we only needed updates without data conversion. We decided to build the distribution repository on SVN. What fools we were!

Structurally, the storage was organized hierarchically.


We decided to write Hottabych on our own SBIS platform. This was exactly what we knew how to do; there were no doubts there. There were doubts about the update agent, but in the end it was also built on the SBIS platform (that is, in C++). One tough condition was set for the agents: they must report to Hottabych regularly and as informatively as possible about their work.

The update was divided into three main phases (a small sketch follows the list):

1. Preparation phase: distribution files are delivered to the application hosts and various preparatory actions are performed that do not disrupt the operation of the services.
2. Update phase: the actual update is performed; the work of the service may sometimes be suspended here.
3. Confirmation phase: minor update tasks are completed, unneeded files are cleaned up, and so on. The services are already running at this point.
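
A tiny sketch of how an agent might walk through the phases, reporting as it goes; the class and method names are invented:

    from enum import Enum

    class Phase(Enum):
        PREPARE = 1   # deliver distribution files; services keep working
        UPDATE = 2    # swap the versions; the service may be paused
        CONFIRM = 3   # finish minor tasks and clean up; services run again

    class DemoAgent:
        """Stand-in for a Hottabych agent (the real ones are C++)."""
        def execute(self, phase):
            # Agents must report regularly and as informatively as possible.
            print(f"phase {phase.name}: done")

    agent = DemoAgent()
    for phase in Phase:       # Hottabych drives the phases in order
        agent.execute(phase)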

The whole process is controlled from the frontend by the person in charge, and it can always be paused, continued, or rolled back. For each node being updated, the frontend shows what state it is in and what is happening, or has already happened, on it.



The Big Bang took place


Finally, in the spring of 2013 we carried out the first update in the production environment, then the second, then the third, and everyone liked it so much that, to everyone's joy, updates started happening during the day, in working hours. We called this type of update "easy" because it went off without any problems for everyone, both for us and for the customers. It was a breakthrough!

The universe is expanding


After the first successes came the first problem. On test stands, distributions of up to 1.5 GB were loaded into the storage so often that SVN was cracking at the seams. It is, of course, not suited to storing large binary files, so we quickly moved distribution storage to a Linux server. That too was a so-so decision, since the Windows servers had to mount the share on the Linux disk, but things got much better.

Ahead lay updates of services with database conversions. In this iteration, Jinnee, which knew how to convert databases, was delivered by the agent to one of the servers being updated. The agent launched it, telling it which database to convert.

Unfortunately, it is hard to work with databases while they are being converted, so the service has to be suspended. Customers, however, should not be frightened, so at that moment they see a parking page. It usually reports what is happening and what new features the update will bring, and when the update ends the parking page is removed.

In the summer of 2013 we rolled this update option into production as well. And everything would have been fine, but some of our services work not with one database but with many, for example a hundred. And when that whole heap of databases started converting simultaneously, a hefty load landed on the data-center storage. We had to make the conversion less aggressive, and not all at once. So today we never convert all customers at once: one group of customers is usually converted, and a few days later the next.
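
The throttling idea itself is simple. Here is a sketch with a thread pool capping concurrency; the convert function and the worker count are illustrative:

    # Sketch: convert many databases without flooding the storage.
    from concurrent.futures import ThreadPoolExecutor

    def convert(db):
        # Stand-in for launching Jinnee against one database.
        return f"{db}: converted"

    databases = [f"clients_{i:03d}" for i in range(100)]
    # Cap the number of simultaneous conversions so the data-center
    # storage never sees a hundred of them at once.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for result in pool.map(convert, databases):
            print(result)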

Once more about database conversion


Client databases took a long time to convert, several hours, and that did not suit us at all. But we can convert one client's data in a database independently of another's. Then, instead of hours of downtime, a client gets a pause of a few seconds to a few minutes, and you have to be very unlucky to run into those seconds. With this approach, during an update some clients live on the old version of the data and are served by the old logic, while others are already on the new version of the data and served by the new logic. A client who is being updated at that moment just has to wait.

We settled on the following solution. The frontier dispatchers are switched into "client update" mode. This means that for each client request they consult Redis for the client's state: updated, not updated, or being updated right now. Hottabych puts that information there when it decides to update the client. If the client is not updated, the dispatcher sends the request to the pool of non-updated servers; if updated, to the pool of updated servers; and if the client is being updated, it shows the parking page.
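
The per-request decision can be sketched like this, using the redis-py client. The key layout and pool names are invented, and the real dispatchers are nginx-based, not Python; only the decision logic is shown:

    # Sketch of the per-request decision during a per-client update.
    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    def route(client_id):
        state = r.get(f"upgrade:{client_id}")   # written by Hottabych
        if state == "updated":
            return "pool-new"       # new data, new business logic
        if state == "updating":
            return "parking-page"   # being converted right now: wait
        return "pool-old"           # not yet updated: old logic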

We rolled this option into production at the end of 2014. It took root so well that it became practically the main kind of conversion. Of course, not all of our databases can be converted this way, but then again not all of them take that long to convert.

We now have more than 1 million customers, located in more than 650 databases, and conversions pass almost unnoticed.

And Linux came...


Years of running services under IIS led us to a firm conclusion: we did not need IIS, nor Windows, because we were not really using the Microsoft technology stack. And under Linux there is less server overhead, C++ code optimizes better, and there is a wider choice of server hardware, for example IBM Power.

And in 2015 came a major event. The team developing the core of our services laid the groundwork for migration to Linux, and the migration began. We took the opportunity to avoid SSH when updating services on Linux, porting the Hottabych agent to it instead.

At the same time another of our teams made a breakthrough and developed an analogue of Google Drive: SBIS Disk. We took advantage of that too. We stopped storing distributions on a dedicated Linux server and began storing them in SBIS Disk. Downloading a distribution became much simpler and faster: now every Hottabych agent, on command, cheerfully downloaded the needed distribution kit straight from SBIS Disk. We breathed more freely, and so did the system administrators, though not quite as freely as we did.

And back to the database conversion


We had minimized client downtime, and that was great. But how do you remove it altogether? Often a conversion can be carried out without stopping the database at all, for example adding an index or a field. We called such a conversion "light". But how do you know whether a "light" conversion is possible? The Jinnee wizard gives us the answer: it is fed the new distribution kit and the candidate database, analyzes the expected changes, and recommends which kind of conversion to carry out.
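
A toy version of that recommendation, just to show the shape of the check. The classification rules here are invented; the real Jinnee wizard analyzes the actual structures:

    # Toy check: can the planned changes run without stopping the base?
    LIGHT_OPS = ("CREATE INDEX", "ADD COLUMN")   # safe to run in place

    def recommend(ddl):
        if all(any(op in stmt for op in LIGHT_OPS) for stmt in ddl):
            return "light conversion: no service stop required"
        return "heavy conversion: plan a stop or replica-read mode"

    print(recommend(["ALTER TABLE invoice ADD COLUMN paid boolean"]))
    print(recommend(["ALTER TABLE invoice ALTER COLUMN total TYPE text"]))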

Alas, situations where the database must be stopped for conversion are unavoidable. However, many databases mostly serve reads, and if the conversion is expected to be short, you can point all read requests at a replica of the database for the duration of the conversion and switch back to the master once it finishes. Requests that modify data are rejected with an error during that time. This option turned out to suit us very well: some critical services are now always converted in this mode, continuing to answer most requests.
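
A sketch of this replica-read mode as pure routing logic; the flag and pool names are invented, and in reality the switch happens at the dispatcher level:

    # Sketch: during conversion, serve reads from a replica and
    # reject writes with an error.
    conversion_in_progress = True

    def route_query(sql):
        is_write = sql.lstrip().upper().startswith(
            ("INSERT", "UPDATE", "DELETE"))
        if conversion_in_progress:
            if is_write:
                raise RuntimeError("read-only: conversion in progress")
            return "replica"    # reads keep flowing to the replica
        return "master"         # normal mode: back to the master

    print(route_query("SELECT * FROM invoice"))   # -> replica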

About patches


Often an error is found in a running service and needs to be fixed as soon as possible. The fix is usually small, a single file, while building a new distribution can take quite a while, and there is no need for it. We called this type of update a "patch". The corrected file is handed to Hottabych, which replaces it in the existing distribution, raises the distribution's version, and rolls it out to the service. Everything happens very quickly.
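
The mechanics fit in a few lines; the kit layout and the VERSION file here are invented for illustration:

    # Sketch: drop one corrected file into an existing distribution
    # kit and raise the kit's version.
    import shutil
    from pathlib import Path

    def patch_kit(kit_dir, fixed_file, version):
        shutil.copy2(fixed_file, Path(kit_dir) / Path(fixed_file).name)
        major, minor = version.rsplit(".", 1)
        new_version = f"{major}.{int(minor) + 1}"     # bump the version
        (Path(kit_dir) / "VERSION").write_text(new_version)
        return new_version

    # e.g. patch_kit("/kits/app-17", "/fixes/report.py", "17.1100") -> "17.1101"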

Bringing order to the unordered


Very often there is no need to update the service code or the structure of its databases; you just need to change the data once. Previously this was done with scripts handed to the data-center administrators for execution. They launched the scripts by hand and watched them run with their own eyes. The scripts multiplied; 10 to 20 of them could arrive in a day, and many had to be executed on several hundred databases. The signs of a growing mess were there.

In the end we settled on this solution. The developer implements the script logic as ordinary request-handling code executed in the service itself, so it gets the whole internal infrastructure of the service. Hottabych delivers the code of that service request and calls it, tracking execution on each of the service's databases. The data-center administrator only has to press a button in the Hottabych interface; Hottabych takes over the rest. This is now the most popular type of update.
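
A sketch of the idea; the registry, decorator, and demo connection are all invented, since the real mechanism lives inside the SBIS platform:

    # Sketch: a one-off data fix written as an ordinary service
    # method, which Hottabych then calls on every database.
    FIXUPS = {}

    def fixup(name):
        def register(fn):
            FIXUPS[name] = fn
            return fn
        return register

    @fixup("backfill-paid-flag")
    def backfill_paid(db):
        # Inside the service the developer gets the whole service
        # infrastructure; db here is just a stand-in connection.
        db.execute("UPDATE invoice SET paid = false WHERE paid IS NULL")

    class DemoDB:
        def execute(self, sql):
            print("executing:", sql)

    def run_everywhere(name, databases):
        for db in databases:        # Hottabych tracks each database
            FIXUPS[name](db)

    run_everywhere("backfill-paid-flag", [DemoDB(), DemoDB()])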

After the Big Bang


Hottabych continues to evolve: the transition to Linux has opened up new possibilities for us, and containerization of our services is next. We also plan to ensure that customers get no downtime at all, even with aggressive conversion. The network of Hottabych agents deployed across our data-center servers has brought extra benefits we did not initially count on: the agents now not only perform updates but also restart service processes on demand, run configuration tasks, and do some monitoring.

Hottabych now looks after a "farm" of about 5 thousand servers, and it copes well. The number of databases that are periodically converted is around 2 thousand. All of this has been updating with minimal downtime for more than 1 million customers for a long time now.

Author: Alexey Terentyev

Source: https://habr.com/ru/post/327476/

