“One of the daily processes speeds up from 3 hours to 15 minutes”: Andrei Bogoslovskikh on in-memory computing at SberTech

The words "in-memory computing" sound enticing and futuristic. Who would not want to eliminate the hard disk speed bottleneck by storing and processing data in memory? But in practice there are nuances: for example, because RAM is volatile, the data still has to be duplicated in persistent storage, and the gain comes on reads but not on writes. So how does it really work out?
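The read/write asymmetry mentioned above can be illustrated with a toy write-through cache. This is a minimal sketch, not how Apache Ignite is implemented: the class name `WriteThroughCache`, the counters, and the plain dictionary standing in for persistent storage are all hypothetical and chosen only for illustration.

```python
class WriteThroughCache:
    """Toy write-through cache: RAM serves reads, every write also goes to a durable store."""

    def __init__(self, durable_store):
        self.durable_store = durable_store  # hypothetical stand-in for a disk or database
        self.memory = {}                    # volatile copy, lost when the process restarts
        self.durable_reads = 0
        self.durable_writes = 0

    def put(self, key, value):
        # A write must reach durable storage, so it gains nothing from RAM.
        self.durable_store[key] = value
        self.durable_writes += 1
        self.memory[key] = value

    def get(self, key):
        # A read is served from RAM when possible; storage is touched only on a miss.
        if key in self.memory:
            return self.memory[key]
        value = self.durable_store[key]
        self.durable_reads += 1
        self.memory[key] = value
        return value


disk = {}
cache = WriteThroughCache(disk)
cache.put("account:1", 100)
for _ in range(1000):
    cache.get("account:1")

# One durable write, no durable reads: all 1000 reads came from memory.
print(cache.durable_writes, cache.durable_reads)
```

After a simulated restart (a new `WriteThroughCache` over the same `disk`), the first read has to go back to the durable store, which is why the persistent copy is still needed.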

Sberbank Technologies, which is now actively working with Apache Ignite and has even invested in GridGain, the company that created it, has a lot of relevant experience. So we decided to ask a few questions about that experience: of course, it cannot be blindly transferred to any other company, but it is still valuable. Andrei Bogoslovskikh, director of the Competence Center for Business Development Support Platform, answered them.

- Last year at our JPoint conference you said that cooperation with GridGain was at the start of a long road. How are things going now?

- A year ago, about 200 people at Sberbank Technologies worked on distributed data processing, and now there are more than 1,000. These are not only the specialists who develop Sberbank's technological platform together with GridGain employees, but also application teams that are rewriting Sberbank's existing application software for the new IT platform. The core of the platform should be ready by the end of 2017, and active implementation and rollout will begin in 2018. There is also a lot of work to be done on monitoring the new IT system and on improving its reliability and manageability.

- Why did you initially need Apache Ignite? Can you share specific numbers showing how its use affects your tasks?

- As access to accounts becomes simpler (mobile banking, internet banking), the average transaction amount falls, but the number of transactions rises dramatically. Just 10 years ago, people mostly paid for mobile service at the telecom operator's offices, one or two months in advance. Now customers pay for it through the mobile bank several times a month, but in smaller amounts. This is just one example of the ever-increasing number of transactions.

That is why high system performance is extremely important to us, and Apache Ignite can help with it. Testing shows that one of Sberbank's daily batch processes takes about three hours on the current system and 15 minutes with Apache Ignite. The new system is also expected to handle more than 10,000 client card transactions per second, against the current peak of 3,000 operations per second (and usually around 500).
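For scale, the figures quoted above can be turned into ratios. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and "about three hours" is taken as exactly 180 minutes.

```python
# Figures quoted in the interview; "about three hours" is assumed to be 180 minutes.
batch_before_min = 3 * 60   # daily batch process on the current system
batch_after_min = 15        # the same process with Apache Ignite
tps_target = 10_000         # expected card transactions per second on the new system
tps_peak_now = 3_000        # current peak load
tps_typical_now = 500       # current typical load

batch_speedup = batch_before_min / batch_after_min
peak_headroom = tps_target / tps_peak_now
typical_headroom = tps_target / tps_typical_now

print(f"batch speedup: {batch_speedup:.0f}x")                      # 12x
print(f"throughput vs current peak: {peak_headroom:.1f}x")         # 3.3x
print(f"throughput vs typical load: {typical_headroom:.0f}x")      # 20x
```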

- In your opinion, is Apache Ignite suitable only for those who work at such gigantic scales, or can it be just as useful for small companies?

- The product is extremely interesting. Apache Ignite is applicable not only in large businesses; it is already usable "right out of the box" in medium and small companies, and that is certainly a great advantage. By taking part in it as an open source project, we are trying to make the level of maturity, reliability and efficiency of Sberbank's IT solutions available to everyone.

- So SberTech commits to Apache Ignite itself, even though you work closely with GridGain, the company behind it, which surely takes your needs into account during development anyway?

- At the end of last year, the company's management decided to create a separate unit for developing the open source projects that are key to the company's IT landscape.

What the company gains from developing open source is wider availability of the products for a broad audience. SberTech has extensive and truly unique experience in developing projects that meet very stringent requirements for reliability, availability, performance and scalability. This experience is hard to get anywhere else (problems of this level arise at companies the size of Amazon, Google, etc.). Few people in the developer community have come across our use cases.

We decided to focus on several key areas in open source: reliability, performance, elimination of technical debt, expansion of the ecosystem, and opportunities for integrating projects.

Apache Ignite was the first such open source project, and we work closely with GridGain. There are many requirements for implementing the project, so SberTech has dedicated teams in this area.

- And what is the nature of SberTech's commits to Apache Ignite? Are they related to banking specifics?

- It is not our goal to tailor open source projects to Sberbank's business specifics. We contribute only functionality that will be useful to all users of the project, including potential ones. As for our current participation, the team worked on the Apache Ignite 1.9 and 2.0 releases. The commits fall into two categories, bug fixes and new functionality, in roughly equal shares. In the new functionality, special attention goes to complex transactional scenarios and to improving the reliability and manageability of the cluster.

- Based on your experience with Apache Ignite, what can you recommend to people who are not yet using the project but are interested in it? In which cases is it well suited, and where can problems arise?

- I advise you not to be afraid of change, but to start small. Apache Ignite has broad functionality that reduces latency and improves the performance of your applications. For large companies, the key point may be the ability to support growth of the business, of the volume of computation and of the data processed while reducing TCO. The product can be used both for simple tasks (for example, speeding up service responses and reducing data access time) and for more complex ones (distributed in-memory computing, a distributed SQL database). DML support is already implemented to a sufficient degree; according to the community's plans, full DDL support should appear soon. As for completely new features, there is ML support. A beta version of the machine learning engine, including support for distributed mathematics, is included in the Ignite 2.0 release. Further development of the engine is planned through the end of 2017, including out-of-the-box implementations of individual algorithms, as well as Python, R, and so on. Apache Ignite can already be used as a distributed machine learning platform for big data, with all the advantages of in-memory storage and computation.

As for problems, in my experience they can arise from a lack of expertise, since the project is still quite complex. So my advice is to join our developer community.

- You have been working with Apache Ignite for more than a year and can see how the in-memory computing landscape is changing. How is it going?

- My impression is that in-memory computing is moving from the realm of the exotic into the standard development toolkit. For example, NSPK built the unified payment system for the MIR card on grid technologies. Most major developers of payment systems have considered using grid technologies in the next versions of their systems.

- In terms of performance, in-memory computing can win noticeably, but large amounts of RAM come at a high price. Can you share any numbers that show how this affects your hardware costs in practice, compared to traditional approaches?

- A cluster equivalent in capacity to the disks of the current core banking systems costs 1/5 of the price of high-end solutions. At the same time, the x86 cluster has significantly more processing power, is better suited to massively parallel operations, and scales horizontally.

The cluster is expected to consume more energy, 30 to 200% more depending on the nature of the load. It also takes up about 5 times more space in the data center than RISC solutions.

- And if you look at how the cost of such solutions has changed over time and try to look into the future, could in-memory computing grow as costs fall?

- I am not ready to give financial forecasts, but every year RAM gets cheaper and its energy consumption goes down. Who would have thought earlier that SSDs would push HDDs out of the top segment? I think something similar will happen with both RAM and NVRAM solutions, which will give new impetus to the development of in-memory computing.

Source: https://habr.com/ru/post/335506/
