In the
past topic, we promised to talk about the internal structure of our platform
www.abcp.ru (SaaS solution). Today we will tell about the most interesting module of the platform - warehouses for storage of price lists.
How was the beginning?
Already in the second year of the platform, we were faced with growing customer needs in the volume of price list lines that need to be stored in a database. At this point, the data was simply stored in one table of ordinary MySQL. It turned out that a typical store wants to connect 7 million items of goods to its store, but cannot pay more than 5,000 rubles a month. We were not ready for such a turn, either technically or economically, so we kept our customers as best we could, and at that time we developed a storage system that could withstand large amounts of data (up to 50 billion records).
')
To constrain system performance with data growth, we performed many software optimizations and even bought serious Hewlett-Packard StorageWorks storage systems for 25 thousand euros, where we placed our MySQL database (calculated on the high speed of the disk system), but these improvements were not long enough was going on.
As it turned out, the stress phase for our team will last more than two years. If we knew the real terms of development, it would be a strong blow to our motivation. But we did not know :) So we did it all.
Choosing a solution
The input data was as follows: the system should keep a lot of records (determined that 50 billion will be more than enough), about half of this data can be updated daily. It is necessary to achieve decentralization in data storage and to allow partial degradation in case of problems with the equipment.
At that time, we only knew about HighLoad that it’s very cool, that there are some conferences on this topic and that the Internet is full of interesting articles. It was fashionable to use solutions in NoSQL projects, say words like “sharding” and all that. We also knew that Oracle is a very powerful database that can scale, and therefore considered such an option.
After much suffering, we dropped:
- Powerful Oracle Subd. Since it is expensive and it is difficult.
- Many different real NoSQL. Since at that time, they were not reliable enough or could not withstand the requirements of the mixed reading / update load.
As a result, won the usual MySQL (innoDB). Proven solution, free, many professionals. We have already stepped on all the rakes with it, so for our team this DBMS was ideal for creating a new version of storing price lists with the stated characteristics. The word “sharding” also helped us.
And that's what we did
The new price list storage system is a distributed version based on simple, inexpensive and understandable MySQL, which we use in NoSQL mode, i.e. just INSERT, UPDATE, DELETE and SELECT. Without any JOIN and with the minimum quantity of conditions in WHERE.
At the moment, the service stores ~ 750 million records of goods sold. During the day, sellers completely update one third of the information (~ 250 million records). The number of UPDATE operations is about 10 times greater than the number of SELECT operations. The speed of filling / updating prices in the service at the moment is 30,000 positions per second, which makes it possible to update the aforementioned ~ 250 million in 10% of the daily time.
The scheme looks like a reference example of the horizontal scaling of the database, but, of course, everything inside the new service is much more complicated. For faster updates, separate modules (also in sharding format) were developed to convert the zoo of price lists into a reference format, and separate diff services that calculate the difference between price lists (this works faster than a full price list update). ). Another solution that fully justified itself was the implementation of the RabbitMQ server as the “heart” of the system. The very use of asynchronous queues prompted our team to a number of ideas that would significantly speed up the operation of the system.
Of course, we have achieved excellent economic performance. For example, if we significantly expand the position storage subsystem in warehouses, then the monthly cost of storing / using one million items will be ~ 100 rubles. Even with the current exchange rate.
Also, selecting the most optimal equipment for the created modules, we received an additional economic gain of about 10-20%.
If you are developing projects in the field of auto parts trade, you can use our API to store price lists and search for them (access is available upon request).
For today, see you in the next topic!