
“Apache Ignite is a high-tech product”: GridGain Systems on in-memory computing, open source, the Russian market, and more



We hear about the Apache Ignite project more and more often. But, as Vladimir Ozerov, one of its developers, noted earlier on Habr, it is hard to describe the project in a nutshell, and as a result many questions remain, starting with the most basic ones. What is the project, exactly? How do Apache Ignite and GridGain relate to each other? How do the concepts of "in-memory data grid" and "in-memory data fabric" differ?

The JBreak and JPoint 2017 programs included speakers from GridGain, and the company itself sponsored both conferences, so right before JBreak we put a number of questions to the company. Here are the answers:


Vladimir Ozerov (architect)



- Let's start with a simple one: what is the project, and how does it differ from similar products?

- Apache Ignite is a general-purpose platform for distributed in-memory computing. The first components, which appeared about 10 years ago, were the distributed cache (data grid) and map-reduce (compute grid). At that time, both we and our competitors positioned these solutions as a “distributed cache”. Over time, business requirements grew, and we added new modules. Among the main ones: ANSI SQL'99 support, streaming data processing (streaming), deployment of custom applications in a cluster (service grid), the ability to store objects without having their Java classes on the server (binary objects), and integration with a large number of third-party interfaces and platforms (web sessions, Spring cache, Hibernate L2 cache, Hadoop, Spark, Kafka, JDBC, ODBC, etc.).

At some point, the well-established terms “distributed cache” and “in-memory data grid” no longer reflected the essence of the product. That is when the term "in-memory data fabric" appeared: software for distributed in-memory computing with a large number of interfaces to various data sources and data consumers. This is what sets us apart from similar solutions that are in no hurry to go beyond the term "in-memory data grid".
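To make the two original components concrete, here is a minimal, single-process sketch of the idea behind a data grid (keys partitioned across nodes) and a compute grid (map-reduce executed next to the data). It is an illustration only and not Apache Ignite's API: `ToyGrid`, `count_words` and `merge_counts` are hypothetical names, each "node" is just a Python dict, and the CRC32-based partitioning is an assumption chosen for simplicity.

```python
from collections import Counter
from zlib import crc32


class ToyGrid:
    """Hypothetical in-process stand-in for a cluster: each 'node' is a dict."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]

    def _owner(self, key):
        # Stable hash, so a given key always maps to the same node (partition).
        return crc32(str(key).encode("utf-8")) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)

    def map_reduce(self, map_fn, reduce_fn):
        # "Map" runs against each node's local data; only the small
        # per-node results are handed to the reducer.
        partials = [map_fn(node) for node in self.nodes]
        return reduce_fn(partials)


def count_words(node):
    """Map step: count words in the values stored on one node."""
    counts = Counter()
    for text in node.values():
        counts.update(text.split())
    return counts


def merge_counts(partials):
    """Reduce step: merge the per-node counters into one."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


grid = ToyGrid(node_count=3)
grid.put(1, "in memory data grid")
grid.put(2, "in memory compute grid")
grid.put(3, "data fabric")

word_counts = grid.map_reduce(count_words, merge_counts)
print(grid.get(2))
print(word_counts["grid"], word_counts["data"])
```

In a real cluster the nodes would be separate processes or machines and the map step would be shipped over the network; the sketch only shows why sending the computation to the data keeps the transferred results small.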

- The project was not open source from the start. How did that come about?

- At first, the product was developed as a closed commercial solution, GridGain. Realizing the potential of open source, we published most of the source code on GitHub, but control over development remained in the company's hands. This is a common business model: look, use, but do not interfere. But we went further. In 2015, the open part was transferred to the Apache Software Foundation. The decision was not easy: we gave up control of the code and had to find a new name. That is how Apache Ignite appeared. The bet was that the Apache brand and genuinely open development would give the project a serious boost and increase its popularity. And we were not mistaken: the number of users began to grow exponentially.

What is even more interesting, people began to actively invest in the project: generating ideas, fixing bugs, improving documentation, developing new features and modules, writing books and speaking at conferences. This is just fantastic! We continue working to attract new contributors. Anyone reading this interview can contribute and gain experience in developing a truly complex system.

- It sounds great, but where is the money in this scheme? How do GridGain and Apache Ignite compare?

- Commercial companies can rarely afford to work with pure open source, especially in critical business applications. Who will add the missing functionality, quickly fix the bug and answer questions? All this is available as part of the basic commercial product - GridGain Professional Edition. This is the Apache Ignite code base plus support and hot fixes.

In addition, businesses often place increased demands on security, fault tolerance and high availability. This functionality is implemented in GridGain Enterprise Edition. It includes Community Edition plus a set of additional features: user authentication and authorization (security), data replication between data centers (data center replication), the ability to upgrade the GridGain version without shutting down the cluster (rolling upgrades), as well as utilities for monitoring system status.

- Now that we have covered the products, let's talk about the technical side. How do you conduct R&D?

- Technical discussions take place in public, on the dev list. We set the task, collect opinions, then settle on the architecture and a specific plan of action. Each component has a mentor who knows the corresponding functionality well. In difficult cases, it is the mentor who pulls everything together. Making informed decisions often requires working with the literature and scientific publications. Sometimes there are simply no ready answers, and you become a pioneer. This is what makes our work fascinating.

- How do you test the code?

- This is a multi-level process. The developer writes the code and covers it with tests. These are not only unit tests but also integration tests: you check the code in isolation, then on one node, then on several, and you simulate multithreading and failures. Tests run on a public CI server (TeamCity). Next, the code goes through a mandatory peer review. The second part of the process is handled by the QA department, which runs the functionality through its own set of test cases, mostly automated (Python, Java). Together, this lets us keep quality at the required level.

- On the point that “anyone can contribute”: there is a feeling that, after all, not everyone can, since this is no ordinary project. What are the requirements for GridGain developers, and where do you find people?

- Indeed, Apache Ignite is a high-tech product, so fundamental knowledge is important: from classical algorithms, data structures and multithreading to the principles of building distributed systems and DBMS internals. Finding a person with such a background is a difficult task. We understand this, so we look for people with a good foundation and bring them up to the necessary level on the job. The openness of the process helps a lot: any employee can take part in architectural discussions on the dev list. In addition, we have a fairly flat organizational structure and everyone writes code, which also helps newcomers adapt.

Alexey Dmitriev (General Director of the Russian Department / VP of Engineering)



- From the outside, it seems that GridGain was ahead of its time: when it appeared, interest in in-memory computing was much lower than it is today, and only now is its time coming. Do you agree?

- Rather, I would say the company arrived exactly when the market was ready for a new technology. The founders managed to read the technological trends correctly and take the right steps. So now that the technology is in great demand, the company can come to its customers with a ready-made, mature product.

- How do you see the market trends: do you expect further rapid growth?

- The amount of data that needs to be processed is growing, and the need to process this data quickly is now relevant for almost all large companies in the financial sector, as well as in telecom and other areas. Our technology lets you scale a solution almost without limit. More and faster hardware for both RAM and disk keeps appearing. Our project is architecturally ready for this new hardware, and its adoption will certainly open up new horizons for us. So yes, I expect further rapid growth and activity in the field of distributed solutions, and around our product in particular.

- You have large clients like Sberbank. Does your solution suit only giants with big budgets, or small companies too?

- The solution suits everyone. Sberbank has a lot of data, so it uses a colossal hardware complex to work with that data quickly. Small companies, as a rule, have less data, and their costs are much more modest. In addition, our product now lets you differentiate data by how often it is accessed and keep in RAM only the data that applications access most frequently. And most importantly: our product runs not only on expensive multiprocessor servers but also on simple, cheap office computers. You just need to connect them to one network and install GridGain on them. You do not need any super-expensive specialized hardware.
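The idea of keeping only frequently used data in RAM can be sketched as a two-tier store. The example below is a rough illustration under stated assumptions, not a description of how GridGain implements it: `TieredStore` is a hypothetical class, a plain dict stands in for disk, and least-recently-used (LRU) eviction is used as a simple proxy for access frequency.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-tier store: a bounded hot tier plus a cold tier."""

    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # hot tier, least recently used entry first
        self.disk = {}            # cold tier; a dict stands in for real disk

    def put(self, key, value):
        self.disk.pop(key, None)   # avoid keeping a stale copy in the cold tier
        self.ram[key] = value
        self.ram.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.disk:
            # A cold entry that is accessed gets promoted back into RAM.
            value = self.disk.pop(key)
            self.ram[key] = value
            self._evict()
            return value
        return None

    def _evict(self):
        # Move the least recently used entries to the cold tier
        # until the hot tier fits its capacity again.
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_value = self.ram.popitem(last=False)
            self.disk[cold_key] = cold_value


store = TieredStore(ram_capacity=2)
store.put("a", 1)
store.put("b", 2)
store.get("a")     # "a" becomes the most recently used entry
store.put("c", 3)  # exceeds capacity, so "b" is moved to the cold tier
print(list(store.ram), list(store.disk))
```

A policy that tracks real access counts (for example LFU) would match "most frequently accessed" more literally; LRU is chosen here only because it fits in a few lines.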

- GridGain has a curious relationship with Russia: the company is American, but the founders are Russian-speaking, the development is largely carried out by Russian programmers, and in 2016 Sberbank became one of its investors and key customers. How high a priority is Russia for you, both as a sales market and as a place to look for new staff?

- GridGain is not the first well-known American company with Russian founders. If you want to develop a new and complex product very quickly and efficiently, Russia is first among the countries where you can find engineers who can do it, and at lower cost than in the US or Europe. We have a very strong engineering team in Russia, and the main development is done here. This does not depend directly on the fact that Sberbank is one of our largest clients, although that is an undoubted advantage.

Sberbank, in my opinion, is one of the most dynamically developing banks in the world when it comes to IT. It is a role model for many companies in both Europe and Russia. Many of our potential customers in Russia are now watching our collaboration with Sberbank closely, so I think that in terms of sales this market is one of the most important for us.

Irina Tischenko (HR Director)



- GridGain is written in Java. So are your developers all Java people, or is it more nuanced?

- Of course, the core of the product is hundreds of thousands of lines of Java code. But there are nuances. For example, the modules that integrate our platform with C++ and .NET code are written in those languages, respectively. Our Krasnoyarsk team is developing the Apache Ignite / GridGain management and monitoring console, an application built with various web technologies. GridGain also has developers who program equally well in several languages.

But the main thing is that our employees are talented engineers with excellent algorithmic training. One of the development team's main concerns is performance. Solving performance optimization problems in a distributed multithreaded system requires very serious mental effort. So when hiring, we focus first on the ability to think, and only then on specific knowledge.

- It is known that back in 2006 the founders of GridGain brought developers from St. Petersburg into the project. In which cities does the company do development today?

- GridGain's main development office is still located in St. Petersburg. But, of course, the company's Russian geography is not limited to St. Petersburg. The web management console is developed in Krasnoyarsk. GridGain's Moscow and Novosibirsk teams, although their range of functions is much broader, are also involved in product development.

GridGain's head office remains in Silicon Valley. And in general, given how actively the company is growing, it is safe to assume that GridGain will need developers in Western Europe and even in Asia.

- Following up on the previous question: is relocation between cities possible?

- Of course! As follows from the previous answer, there are no borders for GridGain! Our team includes people from almost every part of Russia, and not only Russia. We are happy to help with relocation and, as I said, are not restricted in its directions.

- What does GridGain expect from participating in JPoint and JBreak? GridGain has the difficulty mentioned above, that describing the product in a couple of words is hard. Do conferences, where the topic gets more than a couple of words, help convey the essence of the project to other developers?

- ignite.apache.org instead of a thousand words! :) But seriously, yes, our people really do have something to talk about. And these events are a good occasion to meet those who find the GridGain team's experience interesting and useful and want to hear it, so to speak, first-hand.

We are coming to Novosibirsk and Moscow with a talk on the scalability of distributed systems and with a team of our developers, who will be happy to talk with conference participants at the GridGain booth. So we are hoping for a flurry of questions. We are not afraid of it, because we are used to working in multithreaded mode :)




Novosibirsk has already seen the above-mentioned talk at JBreak (and got carried away with the quiz from the company's booklet; GridGain will publish the answers next week on its Habr account). Visitors to JPoint in Moscow have all of this still ahead of them, and in the meantime we can recall Vladimir Ozerov’s talk “(Almost) non-blocking synchronization” from last year’s JPoint.

Source: https://habr.com/ru/post/325672/

