For several months now, the Mail.Ru Mail frontends have been running on 64 bits. Better late than never, we decided, and today I will explain why we did it, what we went through along the way, and how we did it.
It works anyway
For a long time, our Mail ran on 32 bits, with Apache 1 and Perl 5.8 on CentOS 5. The idea of moving the frontends to more modern software and a 64-bit architecture had been on our minds for a long time. About a year and a half ago, two people (one admin and one developer) spent roughly a week without sleep bringing up a test server that ran our bright future. In those days, however, we had more urgent tasks, and the server was safely forgotten. We came back to the idea from time to time, but it always went the same way: "What if we try this? Oh, something broke!", after which everything was rolled back and shelved again.
The time has come
Finally, it was time for a change. At some point we realized that things could not go on like this: full support for CentOS 5 ends in 2014, Perl 5.10 is about 30% faster on some tasks, and a 32-bit architecture in the 21st century falls somewhat short of what one would want.
In addition, after we moved Mail to HTTPS, the load average on the frontend servers went up, so a more efficient Perl became more relevant than ever.
Transition difficulties
First of all, we had to rewrite the parts of the code that interact with Apache directly. Since Apache's ErrorLog is tied into our monitoring system, we had to teach the new server to log errors in the format we need. The result is a custom Apache 2 error-logging module, which we have made publicly available.
We also had to deal with dependencies: historically, all the packages we used had been built for c5x32 and added to the repository in that form. Under the new conditions, everything, including the nginx modules, had to be rebuilt for c6x64.
The captcha generation module was also rewritten from scratch.
Banners, however, gave us the most trouble. Our entire layout is built on slots whose contents, banners included, come from a template engine written in C and deeply integrated into Apache. The module responsible for banners picks up the Apache headers and uses them for targeting. To get all of this off the ground under Apache 2, we not only had to work hard ourselves but also had to lean on our colleagues from the relevant department.
Binary protocols
In Mail.Ru Mail, components often talk to each other over binary protocols. First and foremost, this is how we communicate with Tarantool, our data store. Beyond the database, our own services (for example, a Perl server and a C server) also exchange data in binary form. This is good, fast, and convenient, until it comes to changing the architecture.
Every time data has to be encrypted or added modulo some value, there is a non-zero probability that the operation will give different results on x32 and x64. Matters are complicated by the fact that the differences show up only on specific data, so finding, catching, and fixing these cases is far from trivial.
For example, the first line of the snippet below behaves completely differently on the two architectures, because the constant does not fit into 32 bits.
my $crypted_userid = $user->{'ID'} ^ 41262125215;
getpage('project_url_api?user_id=' . $crypted_userid);
This difference in behavior causes problems in completely unexpected places, even in other projects. For example, because the code now produced different results for the same user ID, and because our authorization system is shared, a password change by one user led to a completely different user being unblocked in another service.
The same encryption problem turned up in Mail itself. After sending a message, the user is redirected to a URL that contains, among other things, the encrypted recipients (nobody wants their email address to appear in the address bar in the clear). What happened once we started building that URL on a 64-bit architecture is easy to guess: instead of a list of recipients, a random set of characters appeared.
These problems have of course been solved by now, but tracking them down took considerable time.
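One way to avoid this class of bugs is to pin every integer that crosses a process or architecture boundary to an explicit width. The sketch below is only an illustration of that idea, not the fix we actually shipped: the key `KEY32` and the helpers `obfuscate_id`, `add_mod32`, `pack_id`, and `unpack_id` are hypothetical names, and the key is deliberately chosen to fit into 32 bits, unlike the constant in the snippet above.

```perl
use strict;
use warnings;

# Hypothetical 32-bit key chosen for illustration; it is NOT the constant
# from the article, which needs more than 32 bits.
use constant KEY32  => 0x5A17C0DE;
use constant MASK32 => 0xFFFFFFFF;

# XOR-obfuscate an ID, keeping both operands and the result within 32 bits.
sub obfuscate_id {
    my ($id) = @_;
    return (($id & MASK32) ^ KEY32) & MASK32;
}

# Add two unsigned 32-bit values modulo 2**32, the way a C peer using
# uint32_t arithmetic would.
sub add_mod32 {
    my ($x, $y) = @_;
    return ($x + $y) % 4294967296;
}

# Serialize an ID as exactly four bytes in network byte order ("N"),
# independent of the native word size of the Perl that runs this.
sub pack_id   { return pack('N', $_[0]) }
sub unpack_id { return unpack('N', $_[0]) }

my $id  = 123456;
my $obf = obfuscate_id($id);
printf "%d -> %d -> %d\n", $id, $obf, obfuscate_id($obf);
printf "wire length: %d bytes\n", length pack_id($obf);
```

Because XOR with a fixed key is its own inverse, applying `obfuscate_id` twice returns the original ID, which makes a cheap regression test to run on both the old and the new frontends before switching traffic.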
Two in total
The transition itself was completed in two months. Before that, Mail ran on both the old and the new frontends for about six months. We watched the behavior of the two systems closely: our dashboards had two separate graphs showing where there were more errors and which system was faster. There is a funny story connected with them. Once, in the middle of the night, everyone was woken up in alarm because the lines on the performance graphs for the new frontends were higher, which seemed to mean they were much slower. It then turned out that we had optimized things so well that the scale for the new frontends had automatically shifted by an order of magnitude, and they were in fact 10 times faster. Still, we got a good scare.
In addition, with two systems running side by side, you cannot simply fetch the Apache request object and process it. You have to do the following:
sub GetApacheRequest {
    # mod_perl 2 sets MOD_PERL to a string like "mod_perl/2.x"
    return $ENV{MOD_PERL} =~ m{mod_perl/2}
        ? Apache2::RequestUtil->request()
        : Apache->request();
}
The package specs for Mail also filled up with conditions of the form
%if 0%{?centos} == 6
...
%if 0%{?centos} == 5
...
And there are lots of places like this.
Besides monitoring two branches and making every change in both of them, we of course had to spend twice as long testing each new iteration, but our testers coped.
Reward for the effort
So, what did we get as a result of this painstaking and sometimes nerve-racking work?
- Full support for CentOS 6: new patches and an up-to-date system.
- Fast regular expressions in Perl 5.10. Scripts that do analysis and parsing run noticeably faster.
- Apache 2 picks up new configs and scripts without a restart. Deploying code and configs no longer causes 500 errors (in theory the first Apache could do this too, but getting it to work properly without taking the frontend out of rotation was the stuff of science fiction).
- Total refactoring. Switching to new software is a great excuse to get rid of unnecessary dependencies, unnecessary entities, and unused modules.
- Puppet. In for a penny, in for a pound, we decided, and switched to Puppet at the same time. Rolling out new features and hotfixes is now much easier.
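A claim like "regular expressions are faster" is easy to check on your own workload before migrating. The sketch below is a minimal, hypothetical harness rather than our actual measurement: the log-line format and the `parse_line` helper are invented for illustration, and only the core `Benchmark` module is assumed. Running the same script under the old and the new interpreter gives two timings to compare.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Hypothetical access-log lines, generated for illustration only.
my @lines = map { "10.0.0.$_ GET /inbox?page=$_ 200 " . (100 + $_) } 1 .. 200;

# Parse one line into a hash reference; return nothing if it does not match.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ m{^(\S+) (\w+) (\S+) (\d{3}) (\d+)$};
    return { ip => $1, method => $2, path => $3, status => $4, bytes => $5 };
}

# Time 200 passes over the sample and report it with the interpreter version.
my $t = timeit(200, sub { parse_line($_) for @lines });
printf "perl %vd: %s\n", $^V, timestr($t);
```

The absolute numbers depend heavily on the patterns and the data, so the sample should be replaced with real lines and real regular expressions from the scripts being migrated.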
One might expect the move from 32 to 64 bits to hurt the memory consumption of Apache, which already tries to eat everything it is given. The amount of memory allocated per process has indeed grown, and there is no getting around that. However, everything now runs faster, so fewer processes are needed to handle the same load, and overall memory consumption has not increased. A win all around.
Perl 5.10, by the way, gives us one more advantage: moving from it to 5.16 is simple compared to the painful move away from 5.8.8. So expect a newer Perl in our Mail.
If you have any questions, I suggest discussing them in the comments.
Ilya Zaretsky,
Team Leader Backend Mail Development