YaC 2011: Technical Report

Oh, once, yes again,
Yes, yet another time ...

Not long ago the Yandex YaC 2011 conference ended, and now that recordings of the talks are available, I want to offer a technical report on my visit. In it I focus on the information that lets you decide, before watching a recording, whether a talk is worth your time. For some topics I have added links to key resources, and, based on conversations with the authors, I describe the design of two Yandex NoSQL technologies: Elliptics Network and the message storage of Yandex.Mail.

So, YaC 2011, as it was.

Introduction

The rays of the departing summer set the tone for my Monday. With every step, a wonderful early autumn morning improved my mood as I walked to the World Trade Center building, where the second technology conference organized by Yandex took place on September 19, 2011.
Just a year ago, Yandex decided to hold a new annual conference, modestly named Yet another Conference, and this “another conference” turned out to be head and shoulders above the overwhelming majority of those within reach. On the one hand, the quality of the key talks and their direct focus on established technical specialists set a very high bar for other technical conferences; on the other hand, Yandex created a special atmosphere by letting its key specialists out and handing them over to the crowd “to be torn apart”. After their talks, many speakers were surrounded by people who wanted to talk, and they spent hours answering questions, calling things by their names and revealing many secrets that are not customarily discussed from the stage.

No less important is the fact that recordings of the YaC talks are made publicly available, so everyone can watch and evaluate them on their own. Of course, a video will not let you talk to the speaker directly and learn all the most interesting details, but it is nevertheless a most valuable resource for self-education.

The purpose of this report is not to retell the talks, since you can watch everything yourself. Instead, it is more useful to briefly describe the essence of each talk, share my thoughts and impressions, and recommend the talks whose recordings are worth your time. At the end, however, I will depart from this rule a little in order to share what was not recorded.

Search technology "Spectrum"

The conference was opened by Andrei Plakhov's talk on a new system, launched at the end of 2010, that assesses the relevance of a response to a search query. Andrei devoted his talk to formulating the problem of ambiguous queries and described its solution in general terms.

Yandex processes 5 billion queries per month, and many of them are phrased in such a way that even a person cannot easily understand what exactly the user wanted to know. For example, for the query “Jaguar”, possible answers may concern an animal, a car, or a drink. By analyzing the query stream, the system finds possible extensions of the current query, for example, “Jaguar engine capacity” or “alcohol content in Jaguar”. From all possible extensions, it selects those that correspond to the category of the query (for example, a drink, something harmful), and the extensions are assigned weights based on the potential needs of the category. The system contains a limited number of categories, such as movies, books, people, gadgets, etc., which are defined manually.
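To make the idea more tangible, here is a minimal Python sketch of weighting categories by query extensions. It is not Yandex's algorithm: the query log, the category table, the weights and the function names are all hypothetical and exist only to illustrate the principle described above.

```python
from collections import Counter

# Hypothetical, hand-made data: a tiny query log.
QUERY_LOG = [
    "jaguar engine capacity",
    "jaguar engine capacity",
    "jaguar alcohol content",
    "jaguar habitat",
    "jaguar price",
]

# Hypothetical manually defined categories: for each category, the
# extensions typical for it and a weight for how much that need matters.
CATEGORY_NEEDS = {
    "car": {"engine capacity": 1.0, "price": 0.8},
    "drink": {"alcohol content": 1.0, "price": 0.3},
    "animal": {"habitat": 1.0},
}


def extensions(query, log):
    """Count how often each continuation of `query` occurs in the log."""
    prefix = query + " "
    return Counter(q[len(prefix):] for q in log if q.startswith(prefix))


def category_weights(query, log, needs):
    """Score each category by the log frequency of its matching extensions."""
    counts = extensions(query, log)
    scores = {}
    for category, wanted in needs.items():
        score = sum(counts[ext] * weight for ext, weight in wanted.items())
        if score > 0:
            scores[category] = score
    total = sum(scores.values())
    if total == 0:
        return {}
    # Normalize so that the weights of all categories sum to 1.
    return {category: score / total for category, score in scores.items()}


weights = category_weights("jaguar", QUERY_LOG, CATEGORY_NEEDS)
```

With this toy log, the "car" reading gets the largest weight because its extensions occur most often; a search page could then split its results between the readings in proportion to such weights.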

The recording is worth watching for a general understanding of the problem of ambiguous queries and the rough approaches to solving it.

Cross-platform development for mobile devices

Dmitry Zhestilevsky talked about the Platform Abstraction Layer (PAL) being developed at Yandex. PAL lets you create native C++ applications for iOS, Android, Symbian and other systems without duplicating code. PAL is based on OpenKODE, a set of open standards describing the software platform of a mobile device. OpenKODE includes APIs for interacting with the operating system (OpenKODE Core, which largely mirrors POSIX), OpenGL ES, OpenSL ES, etc. Yandex implemented some of the OpenKODE standards for each platform itself, while others were already available on the platforms (for example, OpenGL ES), and the result was a universal application development environment. PAL lets you write and debug code under Windows and compile it straight away for iOS and Android. The OpenKODE standards offer rich capabilities, but sometimes there are tasks that they cannot solve optimally, and then it becomes necessary to create your own OpenKODE extensions.

The OpenKODE standard uses the C language, so on the one hand it is logical to write extensions without C++; on the other hand, over time it became clear that designing an extension interface in pure C is not very convenient when the rest of the code uses C++. Extensions are now full-fledged C++ objects, and there is no longer any need to create RAII wrappers around C code.

PAL is not a technology unique to Yandex; other companies offer equally interesting counterparts. For example, the Marmalade SDK, mentioned in the talk, provides a very complete and, from my point of view, the most promising solution. Nevertheless, Yandex chose to write its own system, primarily because this gave it independence from third-party companies and unlimited room for extension.

The recording is worth watching to understand the problems Yandex ran into, but to study the approaches behind PAL, I recommend exploring the sites dedicated to OpenKODE and the Marmalade SDK.

In search of mathematics

The beautiful title of the talk could not leave me indifferent, and, full of anticipation, I sneaked closer to the stage. The talk was devoted to Nigma-Mathematics, a domestic analogue of Wolfram Alpha, and gave a general description of the service's structure, which turned out to be quite typical.

The talk is worth watching to understand the general principles of organizing web services, but if you are interested in the subtleties, it is better to turn to other resources.

C++ Application Development for Android

Yuri Bereza, another PAL developer, talked about the subtleties of Android development: calling Java from C++ and C++ from Java, compiling Boost, debugging, building, and so on. The result was a talk that is very useful from a practical point of view.

The recording is definitely worth watching for those who are about to write, or already write, in C++ for Android. It was one of the most practical talks.

How we Architected Cloud9 IDE for scale on NodeJS

JavaScript is not one of my interests, and I spent this talk whiling away the time until the first wave of lunch-goers subsided. Nevertheless, I still learned something.

Cloud9 has created a JavaScript IDE that runs directly in the browser. The IDE really can highlight syntax, work with version control systems, and do full debugging: set breakpoints, watch variable values, inspect the call stack, etc.

I looked, hungry, at all this celebration of HTML 5 and thought about the future of computer technology, the ubiquitous Internet, smart devices, autopilot in cars, thermonuclear fusion for all this electronic happiness.

However, I found one big “flaw” in this IDE: a complete lack of integration with Twitter and Facebook. So who knows whether a programmer will choose the IDE when torn between such wonderful services?

The talk seemed to me largely marketing, so I recommend it only to those who are really interested in the project's capabilities.

Why an ordinary programmer should know languages that almost no one writes in

This is not a talk, this is an epic: the Odyssey with the Ramayana, and Dream of the Red Chamber with the Bhagavad Gita.

No, really, if you had to create something immortal on the subject of programming languages and deliver it in 40 minutes, it would look about like this.

I advise watching it, preferably at bedtime, in a good mood and a relaxed state of mind.

C++11 is the new C++ language standard

The path of the standard was long and thorny, but after 8 years we can finally say that it has been completed. Everyone is used to blaming such a long wait on the complexity of the language, but maybe that is not the reason? Could it be the intervention of otherworldly forces? Judge for yourself: the speaker was simply not allowed into Russia. And if we, here, have long been accustomed to living with our otherworldly forces, what can they, the members of the committee, do against such a disaster?

Dave Abrahams gave his talk over Skype, which was funny in itself. Some clever technical tricks provided two-way video communication, with a slide show, questions at the end, and so on. It was probably strange for Dave to look at a huge audience filmed by a small laptop camera, and for us to watch a person addressing a large crowd from his home.

He talked about the new features of C++, and given the technical problems with the connection, I don't think the video is worth watching. Better to spend your time studying the Wikipedia article on C++11.

Unit Testing and Google Mock

Unit tests are really beneficial only if they are fast, reliable and accurate. But how do you write unit tests for a system that depends on a database, the network, or heavy computation? One answer is to test only a specific module and to emulate all the modules it depends on using mock objects.

A mock is an object that represents a specific fictitious implementation of an interface intended solely for testing.

Writing a mock object for each test can be quite expensive, so there is a natural desire to automate the process somehow. There are many mock libraries for Java, where support for introspection makes such libraries much easier to write. It seemed to me that convenient mock libraries would never appear for C++, but Google introduced Google Mock.
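The general idea of a mock can be shown without Google Mock itself. Below is a minimal sketch in Python using the standard unittest.mock module; the ReportService class and the fetch_visits method are hypothetical names invented for this example, and the syntax has nothing in common with Google Mock's C++ macros.

```python
from unittest.mock import Mock


class ReportService:
    """Module under test; it depends on a database it does not own."""

    def __init__(self, database):
        self._database = database

    def total_visits(self, page):
        # In production this call would go over the network to a real DB.
        rows = self._database.fetch_visits(page)
        return sum(rows)


def test_total_visits():
    # The mock stands in for the database: no network, no disk, no setup.
    database = Mock()
    database.fetch_visits.return_value = [3, 4, 5]

    service = ReportService(database)

    # Check the result computed by the module under test...
    assert service.total_visits("/index") == 12
    # ...and how the module talked to its dependency.
    database.fetch_visits.assert_called_once_with("/index")
    return True


test_total_visits()
```

The test declares what the dependency should return and then verifies the interaction, which is the same declarative style that Google Mock expresses in C++ through templates and macros.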

Immediately after the talk, the library seemed to me “a cadaver, gastrically unsatisfied.” Judge for yourself: using template metaprogramming, the authors created their own declarative language that describes the behavior of a mock object. And the question arises: won't it take longer to dig through the documentation, trying to explain what exactly the mock should do, than to quickly write how it should do it?

That first impression passed once I had read a little of the Google Mock documentation. The library is organized quite sensibly, is easy to learn, and now seems to me interesting, at the very least.

It is hard to say whether the video is worth watching. It seemed to me that the syntax takes some getting used to, while the speaker plunges straight into the thick of it and spoils the first impression.

Animal control: tools for managing and monitoring distributed systems from Cloudera

Cloudera specializes in technologies built around Hadoop, a free project of the Apache Software Foundation designed to store and process large amounts of data on clusters of thousands of individual nodes. Hadoop is, in fact, an ecosystem that combines several separate projects: the HDFS distributed file system, an implementation of the MapReduce paradigm, the non-relational HBase database, and others.

Hadoop's popularity has been growing rapidly of late. The number of companies solving their tasks with Hadoop is increasing, the quality of the internal implementation is improving, and Hadoop clusters are processing more and more data. Clusters that store up to 80 petabytes of data are now commonplace.

The talk struck me as somewhat marketing-oriented and aimed at specialists who work with Hadoop directly, so I don't think it is worth your time. Instead, I recommend the talk by Konstantin Shvachko from Yahoo, who spoke very interestingly about Hadoop at last year's conference.

Hadoop Facebook Scalability

The topic of this talk was Hadoop again, or rather a modification of it, created at Facebook, aimed at increasing scalability. The fact is that Facebook uses a fork of an old version of Hadoop and at some point decided to fix certain scalability problems in its system, which by that time had already been fixed by the community in the main branch.

The talk was devoted to how Facebook fixed its system and will be useful to Hadoop specialists, but those only superficially familiar with the technology will most likely find nothing useful in it.

The most sophisticated techniques used by bootkits and polymorphic viruses

The Kaspersky Lab talk began with elements of a university Operating Systems course, but quickly moved on to the finer points of BIOS internals and the OS boot process, listing specific addresses, their purpose, and the tricks used by modern malware.

This is very specific information at a fairly deep level, so I recommend watching only to true connoisseurs who know a lot about such things.

Elliptics Network Stand

During the day, a technology exhibition ran in the conference hall, where you could try writing an application with Yandex Mobile MapKit, look at new smartphones and 3D TVs, and win some prizes: in general, the classic “red corner” of marketing. However, among all the stands there were two that really interested me. I will describe them in a bit more detail since there were, in fact, no talks about the systems they presented. At the stands, people simply answered questions from passers-by.

Both stands were devoted to storing and processing large amounts of data in NoSQL storages (the term NoSQL means not a rejection of the SQL language, but storing information in databases built on a model other than the relational one). Relational databases have become the most common way to store and process large amounts of structured data. However, the speed, integrity, and ease of application development offered by modern SQL DBMSs are achieved through a compromise on other important factors: scalability and availability. With SQL DBMSs one usually speaks of vertical scalability, which boils down to this: from a certain point in a project's life, the cost of the hardware needed for each equal increment in performance grows rapidly and gradually becomes prohibitive. The situation with availability is even worse: the failure of a single node can stop the service for a long time and cause large financial losses. To solve these problems, data storage systems have emerged that sacrifice functionality to gain scalability and availability.

One example of an alternative to SQL storages is the Elliptics Network project, which Yandex uses to store photos in the Yandex.Fotki service and tiles in Yandex.Maps. Elliptics Network belongs to the class of distributed hash tables (DHT), which, compared to relational databases, offers a rather meager set of operations: storing and retrieving unstructured binary data by a unique key (storages of this class are called key-value storages), but has nearly unique properties. An Elliptics Network cluster is organized as a peer-to-peer (P2P) network in which all nodes perform the same operations and no node is singled out among the others. The key of a record can be any entity from which a hash function (in this case, SHA512) can be computed. Using a cryptographically strong hash function makes it possible to disregard the probability of collisions and removes the problem of non-unique keys. In addition, the storage does not need to know anything about the true type of the key or how it was generated, that is, it can perform all operations using only the hash. Each node in the cluster selects several segments of the ring of hash values for which it will be responsible and modifies a globally available routing table (synchronized between nodes). When accessing data, the client computes the hash value of the key and uses it to find, in the routing table, the node responsible for the range into which the value falls. Having obtained the address, the client can contact that node directly and request the data from it.
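The routing scheme can be sketched in a few lines of Python. This is a simplified illustration, not the Elliptics Network code: the HashRing class, the way a node picks its segments (by hashing its own address) and the number of segments per node are all assumptions made for the example. Only the use of SHA512 and the ring of ranges come from the description above.

```python
import bisect
import hashlib


def key_hash(key: bytes) -> int:
    """Map an arbitrary key to a point on the SHA512 ring."""
    return int.from_bytes(hashlib.sha512(key).digest(), "big")


class HashRing:
    """Toy routing table: sorted segment boundaries and their owners."""

    def __init__(self, segments_per_node=8):
        self.segments_per_node = segments_per_node
        self._points = []  # sorted boundaries of the segments
        self._owner = {}   # boundary -> node address

    def add_node(self, address: str):
        # Assumption: a node derives its segment boundaries from its address.
        for i in range(self.segments_per_node):
            point = key_hash(f"{address}#{i}".encode())
            self._owner[point] = address
            bisect.insort(self._points, point)

    def lookup(self, key: bytes) -> str:
        """Return the address of the node responsible for `key`."""
        if not self._points:
            raise LookupError("ring is empty")
        h = key_hash(key)
        # First boundary at or after the hash; wrap around the ring's end.
        i = bisect.bisect_left(self._points, h) % len(self._points)
        return self._owner[self._points[i]]


ring = HashRing()
for address in ("node-a", "node-b", "node-c"):
    ring.add_node(address)
node = ring.lookup(b"photo-12345")  # the client then contacts `node` directly
```

Because every client holds the same table and computes the same hash, any client finds the responsible node locally, without asking a central coordinator.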

The absence of centralized elements makes it possible to increase performance proportionally by adding new nodes of similar configuration. This kind of scalability is called horizontal. It yields significant savings and makes possible services that could not function on relational databases.

Any node in the cluster can fail at any time, and when there are many nodes, some of them are always unavailable. So that the failure of a node does not stop the entire cluster, a replication system is introduced. It relies on several nodes being responsible for the same interval of the key space and on writes going synchronously to all available replicas. If all the nodes responsible for a certain range of hash values fail, reading from this range becomes impossible, but writes continue to the nodes responsible for the adjacent ranges. Thus, the failure of an arbitrary number of nodes can make only part of the data unavailable for reading, namely the data for which all copies have failed (the system is usually configured for double or triple redundancy), while writing becomes impossible only when all nodes in the cluster fail.
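The read and write behavior described above can be modeled with a toy Python sketch. It is an illustration under assumptions, not the Elliptics implementation: the Node class, the RangeUnavailable exception and the rule "fall through to the next group on the ring" are my own simplifications of "writing continues to the nodes responsible for the adjacent ranges".

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}


class RangeUnavailable(Exception):
    """Raised when no live node can serve the request."""


def write(groups, index, key, value):
    """Write to every live replica of the responsible group.

    `groups` is a list of replica groups in ring order; `index` is the
    group responsible for the key. If the whole group is down, the write
    falls through to the next group (an assumption of this sketch).
    Returns the names of the nodes that stored the value.
    """
    for step in range(len(groups)):
        group = groups[(index + step) % len(groups)]
        live = [node for node in group if node.alive]
        if live:
            for node in live:
                node.data[key] = value
            return [node.name for node in live]
    raise RangeUnavailable("every node in the cluster is down")


def read(groups, index, key):
    """Read from any live replica of the responsible group."""
    live = [node for node in groups[index] if node.alive]
    if not live:
        raise RangeUnavailable("all replicas of this range are down")
    for node in live:
        if key in node.data:
            return node.data[key]
    raise KeyError(key)
```

With two groups of two replicas each, losing one replica leaves reads working, losing both makes the range unreadable, and writes keep succeeding until the last node in the cluster is gone.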

A year ago, Elliptics Network supported data versioning, but it has since been cut as unnecessary. In addition, Elliptics Network supports interval requests, that is, retrieving all data whose key lies in the range from X to Y. Such functionality obviously goes beyond key-value storages and intrudes on the elegant DHT architecture. It is implemented as follows: the low bytes of the SHA512 key are discarded and used as a sorting index. Each node keeps a red-black tree of hash keys in memory, which naturally allows interval queries. The developers say that they do not consider interval requests whose range may span several nodes, although from my point of view this could become a scaling problem, not to mention that interfering with the key generation mechanism prevents the data from being distributed evenly over the nodes. And, hypothetically, a sufficiently large and popular data interval that falls entirely on one machine could easily slow it down to a complete stop.

Several technologies have grown up around Elliptics Network. The Eblob library, which "works even better than it sounds," organizes on-disk storage of unstructured binary data (BLOBs) and provides fast access to them (ordinary file systems cannot cope with the load once the number of files runs into the millions). The PohmelFS file system is built on top of Elliptics Network: the file name serves as the key, and additional mechanisms have been added to maintain integrity. PohmelFS is POSIX-compatible and can be mounted as a standard Linux file system.

However, my attitude to Elliptics Network, though interested, is rather skeptical. In essence, it is hard to talk about the scalability of Elliptics Network, since it is used in clusters of at most a couple of dozen fairly high-performance nodes. Alternatives to the project include Hadoop, Apache Cassandra (which the Elliptics author mentioned a year ago, though now they spoke of its insufficient performance) and others.

Yevgeny Polyakov, the author of the project, gave a lively but superficial talk about Elliptics Network at last year's YaC. The video is available.
You can download the Elliptics Network source code and read the blog on the project website:
www.ioremap.net
Slides that were looping in the lobby at this year's conference:
www.slideshare.net/AntonKortunov/yac-2011-elliptics-network
In addition, I highly recommend an article about another DHT implementation from Amazon called Dynamo:
s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

Yandex.Mail Stand

The second representative of key-value storages at the conference was the Mulca project, a distributed file system used to store messages in Yandex.Mail. The cluster stores 22 trillion records with a total volume of 6.5 petabytes on more than a thousand nodes, and it is constantly growing.

The system, like Elliptics Network, is built on hash tables, but uses completely different principles. The record key is hashed by two different, non-cryptographic hash functions, and each node keeps a two-level hash table in memory. The first level, call it H1, is an array of pointers several million elements in size. Each pointer in this array refers to a second-level list, H2, of about a dozen elements, each of which in turn refers to the place on disk where the data lies. With this layout, a lookup proceeds as follows. The value of the first hash function (more precisely, its remainder modulo the size of H1) gives the offset in the array H1 and thus the address of an H2 list. The H2 list is searched linearly for an element that matches the value of the second hash function. Once such an element is found, the data is read from disk, and only then are the real keys compared. If the keys do not match, the search continues down the H2 list, and so on. The H2 list is sorted by time, so access to new messages is faster.
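The lookup procedure is easy to express in code. The following Python sketch is only a model of the structure as described at the stand: the choice of CRC32 and Adler-32 as the two hash functions, the TwoLevelIndex class, and the in-memory list standing in for the disk are all assumptions of this example.

```python
import zlib


def h1(key: bytes) -> int:
    """First hash (assumed CRC32): selects the slot in the H1 array."""
    return zlib.crc32(key)


def h2(key: bytes) -> int:
    """Second hash (assumed Adler-32): fingerprint stored in the H2 list."""
    return zlib.adler32(key)


class TwoLevelIndex:
    def __init__(self, slots=1024):
        self._slots = [[] for _ in range(slots)]  # H1: array of H2 lists
        self._disk = []                           # stand-in for the disk
        self.disk_reads = 0

    def put(self, key: bytes, value: bytes):
        offset = len(self._disk)
        self._disk.append((key, value))           # full key lives on disk only
        bucket = self._slots[h1(key) % len(self._slots)]
        bucket.insert(0, (h2(key), offset))       # newest entries come first

    def get(self, key: bytes):
        bucket = self._slots[h1(key) % len(self._slots)]
        wanted = h2(key)
        for fingerprint, offset in bucket:
            if fingerprint != wanted:
                continue                          # no disk access needed
            self.disk_reads += 1
            stored_key, value = self._disk[offset]
            if stored_key == key:                 # real keys compared last
                return value
        return None


index = TwoLevelIndex()
index.put(b"letter-1", b"Hello")
index.put(b"letter-2", b"World")
body = index.get(b"letter-1")
```

Memory holds only a fingerprint and a disk offset per record, never the key itself; an extra disk read happens only when two keys collide on both hash functions at once.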

Because the keys are not stored in memory, the developers managed to reduce memory requirements from 50 to 12 GB per node. At the same time, this approach does not hurt performance, since the system currently has only a handful of double collisions and not a single triple collision. That is, the overwhelming majority of messages are found with a single disk read, and only a few require two.

Each message is stored on two nodes in different data centers, while the metadata of the messages is stored in an Oracle DBMS cluster. The message key is built so that it includes information about the nodes on which the message is written. This makes it possible, knowing the key, to read directly from the node that stores the message.

Writes are managed by special nodes called balancers, which monitor the load on the other nodes and, when generating the key for a new message, pick the least loaded nodes.
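A balancer of this kind might look roughly like the Python sketch below. The real key format of Yandex.Mail was not disclosed, so the "node:node:id" layout, the load table and both function names are purely hypothetical.

```python
def make_key(load, message_id):
    """Build a key that embeds the two nodes chosen for a new message.

    `load` maps data center -> {node name: current load}. The sketch takes
    the least loaded node in each data center and then the two least
    loaded of those, so the copies always land in different data centers.
    """
    best = {dc: min(nodes, key=nodes.get) for dc, nodes in load.items() if nodes}
    chosen = sorted(best, key=lambda dc: load[dc][best[dc]])[:2]
    if len(chosen) < 2:
        raise ValueError("need nodes in at least two data centers")
    # Hypothetical key layout: "<node>:<node>:<message id>".
    return ":".join([best[dc] for dc in chosen] + [str(message_id)])


def nodes_from_key(key):
    """Recover the storage nodes from the key alone, with no lookup table."""
    *nodes, _message_id = key.split(":")
    return nodes


load = {
    "dc1": {"n1": 0.9, "n2": 0.2},
    "dc2": {"n3": 0.5, "n4": 0.7},
}
key = make_key(load, 42)
```

Since the placement is written into the key at creation time, a reader needs no routing table: parsing the key is enough to know which nodes to ask.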

The main advantage of the Yandex.Mail system over Elliptics Network is that adding a new node does not cause data to be transferred between nodes. In Elliptics Network, by contrast, adding a node redistributes responsibility for ranges of the SHA512 ring among the nodes, which in turn leads to a massive transfer of data between them.

Unfortunately, I did not find any additional materials on the internals of Yandex.Mail.

Conclusion

This year, in my subjective opinion, Yandex did not reach the bar it had set for itself a year ago: the talks were weaker, there was more marketing, and the speakers were less eager to talk. But even with these reservations, YaC remains one of the best technical conferences in Moscow, and attending it will definitely be useful for engineers who want to broaden their horizons.

Links

Recordings:
yac2011.yandex.ru/archive2011/video1
yac2011.yandex.ru/archive2011/video2
yac2011.yandex.ru/archive2011/video3

Recordings of last year's talks:
yac2011.yandex.ru/archive2010/materials

Source: https://habr.com/ru/post/132205/
