
How the infrastructure of Yandex.Mail has grown over 13 years

In the second decade of the 21st century, mail is no longer just correspondence between people. Users expect mail to help them solve everyday tasks, save time, and suggest missing information. We are constantly adding new touches to Yandex.Mail that make users' lives a little easier. Today we would like to retrace with you the path Yandex.Mail has traveled in 13 years, focusing on the development of the service's architecture and infrastructure.



Few people now remember that the very first version of Yandex.Mail was written in PHP, and that messages were stored directly in a relational database next to their metadata. Back in the not-so-distant year 2000, the entire mail service fit on a dozen servers. The servers were maintained entirely by hand: from disk configuration to operating system installation, there was no automation.

2002


The first thing we had to give up was PHP. At that time it hindered development more often than it helped.


There were many first-class Perl specialists around, a huge number of libraries already existed for Perl, and so we rewrote Mail on the Apache + Perl stack. Things became faster and simpler.

2005


A little more time passed, and Apache's multiprocess model could no longer cope with the load. The Perl interpreter and the peculiarities of its connection to the metadata store were a separate problem. As a result, after a series of experiments, including one with a problematic beta version of Apache 2, we wrote our own modular application server called Baida. It can use both the transactional and the event-loop request-processing models, and it is very convenient for connecting extensions, including language interpreters such as JS. The server, like most of the Yandex.Mail code, is written in C++. In many ways it is a library for network communication and for working with threads, which we needed to keep the service running under growing load.

Real-time monitoring of the server's state is built into it to make operation easier. In addition, we abandoned Perl completely. In its place we created a separate Baida module, the Web Mail Interface (WMI), which contained an XScript interpreter: an XML parser and an XSLT transformation engine. Another feature is a pool of connections to the relational metadata store, along with efficient management of that pool.



2006


It became clear that storing message bodies in a relational store would not work as data volumes grew. So the need arose for a specialized message store, which was named Mulca. Over eight years of development the technology has undergone few architectural changes, although the storage has grown many times over. Besides Mail, it is now also used to store files in Yandex.Disk.

Ever since the separate storage was launched, we experimented with the configuration of the disk subsystem on its servers: we tried different RAID levels and merged devices into large partitions. The RAID5 configuration lasted the longest, but it had drawbacks that needed fixing: RAID5 requires computing checksums and re-reading a block whenever it is written, which lowers write speed. And when the array has to be rebuilt, its performance degrades for a long time.
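The write penalty described above can be illustrated with a minimal sketch. It assumes the textbook RAID5 scheme, in which the parity block is the bytewise XOR of the data blocks in a stripe; the block contents and function names are made up for illustration and say nothing about the actual controllers used on the storage servers.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity after overwriting one data block.

    Both old_data and old_parity must first be READ from disk; that extra
    read before every small write is what slows RAID5 writes down.
    """
    return xor_blocks(old_parity, old_data, new_data)


# A stripe of three data blocks plus one parity block (illustrative values).
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(*stripe)

# Overwrite block 1 and update the parity without touching blocks 0 and 2.
new_block = b"\xff\x00"
parity = small_write(stripe[1], parity, new_block)
stripe[1] = new_block

# Rebuild a "lost" block 1 from the surviving blocks and the parity.
rebuilt = xor_blocks(stripe[0], stripe[2], parity)
```

Rebuilding works the same way for every block of a failed disk, which is why a rebuild has to read all surviving disks in full and degrades performance for so long.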



Then we began to experiment with SATA disks instead of SCSI. It turned out that, in our configuration, SATA drives without RAID5 provide approximately the same data access speed as SCSI drives assembled into RAID5. We had to either optimize the configuration or come up with another way to ensure fault tolerance.

2007


We decided to abandon RAID arrays on Mulca completely and to provide fault tolerance at the level of the storage as a whole rather than in the disk subsystem of an individual server. Some time later, we replaced the fast but small SCSI disks with large SATA disks. Moving to separate file systems and dropping disk-level redundancy gave a good saving in storage space. Together with the transition to SATA, this noticeably sped up the commissioning of new storage capacity. When a disk failed, we simply copied the data from the second replica in another data center. At the same time, our famous mail wall first appeared.


The development of this dashboard was initiated by the system administrators, and it served to give operations, development, and management a shared view of the situation. We picked out the most important numbers and hung them in a prominent place so that everyone involved in the product could see how it was doing. Incidentally, the appearance of the wall has not changed much over the years. The tradition of hanging up panels and projectors with important information spread from this wall to all of Yandex.

Our traditional POP3 evolved over time: from a Baida module it became an independent daemon and learned to read data from the metadata storage. At about the same time, it became clear that we needed the IMAP protocol, and using the experience gained in developing POP3, we started writing it from scratch. The current version of our IMAP is the fourth in a row, and we do not plan to stop there.

For mail delivery we used the zMailer SMTP server, and we began to feel its limitations: in our architecture it coped poorly with incoming connections, because it spawned a new process for each one and degraded very quickly as the queue grew. We applied our own patches to it, but that did not help much. It became clear that we needed another technology. After much discussion, we decided to switch to Postfix, which works much better with large queues. Naturally, we had to rewrite the agent that delivers mail from Postfix to our storage.

2008


We set ourselves the task of speeding up message delivery. At that time an average message was delivered in a few seconds, and such a delivery time was considered good. But we wanted mail to be comparable in speed with instant messaging systems.

First, we wrote a new delivery agent, Fastsrv, and embedded it first in zMailer and then in Postfix. Fastsrv has virtually no queue and keeps a persistent connection to the metadata database. It was installed on the clusters that receive mail. Every incoming message was first put straight into the user's mailbox, and only if that failed was it thrown back into the delivery queue. As a result, most messages began to be delivered almost instantly. Delivery queues on the charts turned from a familiar sight into a sign of trouble, and we reflected this in our monitoring.
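The "deliver immediately, queue only on failure" idea can be sketched as follows. This is an illustration under stated assumptions, not Fastsrv's code: the class names, the `StoreUnavailable` exception, and the in-memory `mailboxes` dictionary are all hypothetical stand-ins for the real metadata database and queue.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class StoreUnavailable(Exception):
    """Raised by the (hypothetical) mailbox store when a write cannot complete."""


class FastDeliveryAgent:
    """Try to store a message right away; queue it only when that fails."""

    def __init__(self, store: Callable[[str, str], None]) -> None:
        self.store = store
        self.retry_queue: Deque[Tuple[str, str]] = deque()

    def deliver(self, user: str, message: str) -> str:
        try:
            self.store(user, message)  # fast path: no queue involved
            return "delivered"
        except StoreUnavailable:
            self.retry_queue.append((user, message))  # slow path: retry later
            return "queued"


# In-memory stand-in for the mailbox storage (assumption for the demo).
mailboxes: Dict[str, List[str]] = {}


def demo_store(user: str, message: str) -> None:
    """Store a message, simulating a failure for one particular user."""
    if user == "busy@example.test":
        raise StoreUnavailable(user)
    mailboxes.setdefault(user, []).append(message)


agent = FastDeliveryAgent(demo_store)
first = agent.deliver("alice@example.test", "hello")
second = agent.deliver("busy@example.test", "hello again")
```

With this shape, the length of `retry_queue` is itself a health signal: in normal operation it stays near zero, so any growth points to a problem with the store.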

In addition, we separated the queues for sending and receiving messages, so that a problem in one part would not take down the rest of the service.

Then it was time to replace Postfix for receiving mail. We called our own SMTP server NwSMTP. It was conceived as an event-based server capable of handling an unlimited number of connections. The well-known web server nginx is built in a similar way, but at that time it had no SMTP proxy support. With NwSMTP we began to accept messages almost instantly. The development of NwSMTP was quite interesting: rollouts happened very often, sometimes one version per hour. Immediately after building a new version, the team rolled it out to one production machine and watched what happened. This pace allowed the product to be released very quickly. It is also where the name comes from: NwSMTP stands for Next Week SMTP. Almost any task in it goes through the full development cycle in seven days, and to the question "When will you do this?" the server's developers answer: "Next week."
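To show what "event-based" means in contrast to a process per connection, here is a minimal sketch built on Python's asyncio. It is not NwSMTP and does not implement SMTP: the banner text and the one-line exchange are assumptions chosen only to show many connections being served by a single event loop.

```python
import asyncio
from typing import List

GREETING = b"220 mx.example.test ESMTP ready\r\n"  # hypothetical banner


async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    """Serve one connection as a coroutine; no process or thread is created."""
    writer.write(GREETING)
    await writer.drain()
    # A real server would parse SMTP commands here; the sketch reads one line.
    await reader.readline()
    writer.write(b"221 bye\r\n")
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def greet_many(n: int) -> List[bytes]:
    """Start the server on a free local port and connect n clients at once."""
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client() -> bytes:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        banner = await reader.readline()
        writer.write(b"QUIT\r\n")
        await writer.drain()
        await reader.readline()
        writer.close()
        await writer.wait_closed()
        return banner

    async with server:
        return list(await asyncio.gather(*(client() for _ in range(n))))
```

Calling `asyncio.run(greet_many(20))` serves all twenty clients concurrently inside one process, whereas a process-per-connection design would have forked twenty times.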

In the middle of 2008, when our Jabber server was launched, we added real-time notifications about new messages as an experiment. So as not to slow down the billing system, the notifications were sent over UDP and carried XML data that was easy to turn into an XMPP packet. Today jokes about "XML over UDP" make us smile, but at the time it worked, albeit with serious limitations and the loss of some notifications.
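A fire-and-forget notification of this kind might look like the sketch below. The XML element and attribute names are invented for illustration, since the real packet format is not described; what the sketch does show is that the sender never waits for an acknowledgement, which is both why it cannot slow the sender down and why notifications can be lost.

```python
import socket
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def build_notification(uid: str, subject: str) -> bytes:
    """Serialize a new-mail event as a small XML document (made-up schema)."""
    root = ET.Element("new-mail", {"uid": uid})
    ET.SubElement(root, "subject").text = subject
    return ET.tostring(root, encoding="utf-8")


def notify(address: Tuple[str, int], payload: bytes) -> None:
    """Send one datagram and return immediately: no handshake, no retry."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


def demo() -> Tuple[Optional[str], Optional[str]]:
    """Send a notification to a local receiver and parse what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # any free local port
        receiver.settimeout(2.0)         # a lost datagram would time out here
        notify(receiver.getsockname(), build_notification("42", "Hello"))
        data, _ = receiver.recvfrom(65535)
    root = ET.fromstring(data)
    return root.get("uid"), root.findtext("subject")
```

On the loopback interface the datagram normally arrives; over a real network nothing guarantees that, which matches the lost notifications mentioned above.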

2009


By this time, several interfaces were running in Yandex.Mail at once. Neo was not interactive: the HTML page was generated inside WMI. Modern had interactive elements, but was built on the same architecture.

Yandex.Mail Interface in 2009

We decided to design an interface that loads only what is strictly necessary: what has changed just now. After a series of experiments, we managed to prepare a platform and prove that such a service could work. This is how the Daria interface was born; it can still be seen in Mail today.



With the move to a dynamic interface, the structure of requests changed greatly. By this time the nginx server had matured enough, and we decided to put it in front of Baida to serve static content in Mail (it was already in use elsewhere, for example in Narod). Alongside Daria, we experimented with running interpreted languages on the server side. We settled on JavaScript mainly for convenience, since the same programming language is then used on the client and the server. A module with TraceMonkey was written for the Baida server; unfortunately, it was impossible to integrate V8. V8 was already popular, but at that time it could not work with a multithreaded programming model (recent versions of V8 have fixed this flaw).

Nginx let us handle different URLs in different ways, so we split off the main logic and the serving of attachments, and moved the old interfaces to separate clusters. This greatly simplified deployment and maintenance, and allowed us to decompose the whole system into separate components. The core mail operations became independent of the secondary ones, and the whole system became more stable.

This year we introduced conversation threads, which naturally had a strong effect on the structure of the database and on the load.

Traditionally, mailbox size was limited, and Yandex.Mail was no exception. But we decided it was time to lift this restriction. This decision changed the rules of the game for us. Earlier we could calculate the load and the size of the storage; with an unlimited mailbox, everything is somewhat more complicated. We could no longer say at any given moment how much data we would need to store. At that time we were adding approximately one terabyte of data per day. For comparison, today we store 120 TB per day.

A separate story: the document viewer

At some point we wanted to check how popular a document viewing service would be. We managed to develop a prototype in Python very quickly and bring up a farm of OpenOffice instances that converted documents into HTML. We put nginx in front of the service to protect it from load spikes, added a module for fetching documents from the storage, and put it into operation. In this form the prototype existed for several years, acquiring monitoring, optimizations, and secondary features along the way. A second version of the viewer is now in operation, and it runs, incidentally, on roughly the same technologies.

2010


This year we completely redesigned mail search. The implementation that existed at the time satisfied us neither in speed nor in the quality of search results. As a result of the rework, the new system now underlies not only message search itself, which serves tens of millions of queries per day, but also some other functions of Yandex.Mail.

We were increasingly asked to make it possible to connect our mail service to a user's own domain. This is how the Mail for Domains service appeared, which we have already written about. Remarkably, it was developed and launched on top of existing components with minimal modifications. It is a fully integrated service that gathers information piece by piece from different places in order to show it to the domain administrator.

Also in 2010, we did a great deal of work to move Yandex's internal correspondence (our colleagues write a lot of messages) onto Yandex.Mail technology. It works in almost the same way as Mail for Domains, but is integrated with the company's other internal services. Almost all new features are first launched in corporate mail, where they are debugged and refined, and only then become available to all users.

2011


We began actively encouraging our users to use HTTPS to protect data in transit, and then turned on secure access to mail by default. On the one hand, this made working with Yandex.Mail more secure; on the other, it required us to tune connection parameters. Traditionally, mail runs over long-lived HTTPS connections with a high keepalive value. On smartphones, however, this connection mode is often restricted or behaves oddly. Every new HTTPS connection carries the extra overhead of a handshake, so if the browser will not or cannot keep the connection open, each request pays that cost again. On narrow channels and the weak processors of phones, this time becomes noticeable. Immediately after enabling HTTPS by default, we saw the load times of the mail interface rise on our graphs, and it took some time to find the causes and solve the problem.

Another problem was displaying HTTP content inside messages opened over HTTPS, for example images added via the img tag. We built a separate service that caches images and can serve them over HTTPS, and then replaced the images embedded in messages with links to our proxy. When designing such services, it is important not to allow an open redirect, that is, the ability to open any URL through a trusted host, and the added layers of security must not noticeably reduce performance. We coped with this task.
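One common way to keep an image proxy from becoming an open redirect is to sign every rewritten link, so that the proxy only fetches URLs the mail service itself produced. The sketch below assumes that approach; the text does not say how the real service is protected, and the secret, the proxy host, and the function names are all hypothetical.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"                         # hypothetical shared key
PROXY = "https://img-proxy.example.test/fetch"  # hypothetical proxy endpoint


def sign(url: str) -> str:
    """HMAC-SHA256 of the target URL, hex-encoded."""
    return hmac.new(SECRET, url.encode("utf-8"), hashlib.sha256).hexdigest()


def proxied(url: str) -> str:
    """Rewrite an image URL found in a message into a signed proxy link."""
    return f"{PROXY}?{urlencode({'url': url, 'sig': sign(url)})}"


def resolve(proxy_url: str) -> Optional[str]:
    """Return the target URL if the link is genuine, otherwise None."""
    query = parse_qs(urlparse(proxy_url).query)
    url = query.get("url", [""])[0]
    sig = query.get("sig", [""])[0]
    if urlparse(url).scheme not in ("http", "https"):
        return None  # refuse anything that is not a plain web URL
    expected = sign(url)
    # Constant-time comparison, so timing does not leak the signature.
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return url
    return None


original = "http://example.com/a.png?x=1&y=2"
genuine = proxied(original)
forged = f"{PROXY}?{urlencode({'url': 'http://evil.example/', 'sig': '0' * 64})}"
```

Here `resolve(genuine)` returns the original URL, while `resolve(forged)` returns `None`, so a link crafted by a third party cannot borrow the trusted host. Verifying an HMAC costs far less than fetching an image, so the check adds little latency.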
At the same time, our first experiments with non-relational data stores began. The first service moved to NoSQL (MongoDB) was the address book. Until then, the Mail operations team had no production experience with external non-relational storage technologies. The experiment was a success: the address book stores contacts in MongoDB to this day.

Meanwhile, Yandex was preparing to launch in Turkey. For the service this meant mandatory localization support, and translating the interface was only a small part of the work. In effect, we prepared and launched a separate product, a Turkish Yandex.Mail, on the same code base. The two differ somewhat in functionality, which now has to be kept in mind during development. To speed up access, we installed servers in Europe from which we serve static content, so our network became even more extensive.

There were changes in the development process too. The team behind the Web Mail Interface, discussed above, began to use Scrum and agile development practices. We managed to implement a continuous rollout cycle in which each feature or defect fix is released separately, yet still goes through the full production cycle: development, automated testing, functional testing, load testing, and finally a two-stage rollout to production, first to one server and then to the entire cluster. This process lets us respond quickly to new product requirements and eliminate defects.

Present day


The launch of Yandex.Disk significantly changed how we operate our hardware. Disk, like Mail, uses Mulca to store users' files, but the volume of uploaded data exceeds that of mail several times over. Every month we add at least five server racks to Mulca to ensure reliable storage of your messages and files. The capacity requirements of the backbone network and of the network inside the data center have grown in proportion. The data volumes are such that, for the long term, we have moved from planning rack installations to planning the opening of new data centers.

To liven up the mailbox, we developed and launched the "Eva" technology, which collects user avatars from open sources and displays them. All avatars are scaled, cropped, and cached in a special storage built on another of our projects, Elliptics, a fault-tolerant distributed key-value store released under the GPL license.

Organization cards, which can be seen in messages, are another example of a quick solution to a product problem. Information about companies is updated infrequently, about once every two weeks. The rest of the time it is simply served as static content, and the number of these organizations is small: only a few tens of thousands. To display the cards, we decided to use cached static HTML files that are addressed by a hash of the organization's domain name. The task was solved quickly, with high performance and fault tolerance.
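Addressing a static file by a hash of the domain name might look like the following sketch. The choice of SHA-1, the `/cards` root, and the two-character shard directory are assumptions made for illustration; the text only says that the files are reachable by a hash derived from the organization's domain.

```python
import hashlib
from pathlib import PurePosixPath

CARD_ROOT = PurePosixPath("/cards")  # hypothetical root of the static tree


def card_path(domain: str) -> PurePosixPath:
    """Map a sender's domain to the location of its pre-rendered HTML card."""
    normalized = domain.strip().lower()  # same card for Example.COM and example.com
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    # The first two hex characters shard the files across 256 directories.
    return CARD_ROOT / digest[:2] / f"{digest}.html"


path = card_path("Example.COM ")
```

Because the path depends only on the domain, any front-end server can compute it without a database lookup, and the files themselves can be served and cached like any other static content.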

The Marker technology allows us to automatically extract useful information from certain types of messages. It is used, for example, to extract information from airline tickets and to pick out an event from a message. Behind this service are a lot of maintenance work, about a hundred servers, and tens of millions of requests every day.


In the meantime, major changes were under way on the Internet. The IPv4 address space had run out, and June 6, 2012 is regarded as IPv6 Day. For this event, we prepared our internal architecture to work in IPv6 networks. Today, IPv6 works for the mail.yandex.com domain and for sending mail. We are actively working on full IPv6 support in all components and for all of our domains.

XScript, which had served us well for a long time, was finally removed from Mail. Mail is now a static web application that communicates with the backend through short API calls. The calls are still handled inside the WMI server on Baida and relayed through one of its modules.

Mail collectors have learned to fetch mail over the IMAP protocol. IMAP transmits much more useful information, which means that mail collection has become much more efficient.

Behind each of our technologies, big and small, past, present, and future, are people: real professional engineers, including system administrators, developers, and managers. We are always glad when people join us who like tasks that no one has solved yet, who are ready to be at the cutting edge of technology, and who would not mind writing a few pages in the history of Yandex.Mail.

Source: https://habr.com/ru/post/210478/

