
In-memory-data-grid. Modes of operation, indexes, locks

This is the next installment in a short series of articles on the In-memory-data-grid (IMDG).
The first article introduced the IMDG concept itself, without concrete examples or implementation details. Today we dig a little deeper.

Modes of IMDG


The modes of operation do not differ fundamentally from one IMDG product to another, so what follows applies to the IMDG concept as a whole rather than to any individual solution.

1. Local mode

In this mode, an IMDG cluster consists of a single node. It is used mainly for debugging.

2. Replicated mode

In replicated mode, the complete data set is replicated to each node in the cluster.

Pros:
Cons:
To keep each PUT from taking so long, you can perform it asynchronously (many IMDGs offer this option), but then you alone are responsible for the consistency of the data. For that reason, I would not use this mode of operation in write-intensive systems.
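
The trade-off between a synchronous and an asynchronous PUT can be sketched roughly as follows. This is a toy model, not the API of any particular IMDG: the class ReplicatedCacheSketch and its methods are hypothetical, and each "node" is just a local map inside one JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical toy model of replicated mode: every "node" holds a full copy of the data.
class ReplicatedCacheSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    ReplicatedCacheSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new ConcurrentHashMap<>());
        }
    }

    // Synchronous PUT: returns only after every node holds the new value.
    // The cost grows with the number of nodes.
    void put(String key, String value) {
        for (Map<String, String> node : nodes) {
            node.put(key, value);
        }
    }

    // Asynchronous PUT: node 0 (the "local" node) is updated immediately,
    // the remaining copies are written in the background. Until the returned
    // future completes, other nodes may still serve the old value.
    CompletableFuture<Void> putAsync(String key, String value) {
        nodes.get(0).put(key, value);
        return CompletableFuture.runAsync(() -> {
            for (int i = 1; i < nodes.size(); i++) {
                nodes.get(i).put(key, value);
            }
        });
    }

    // Reads the value as seen by one particular node.
    String getFrom(int nodeIndex, String key) {
        return nodes.get(nodeIndex).get(key);
    }
}
```

With putAsync, the caller has to wait on the returned future (for example with join()) before it can assume that getFrom on any other node returns the new value. That is the consistency burden mentioned above.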

3. Distributed mode

This is the most interesting and most widely used IMDG mode, and the one in which all the strengths of the concept become apparent.


The previous article was built around a description of this mode.

Indexes


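To search for data in an IMDG, an inverted index is used: a structure that maps each value of an indexed attribute to the keys of the entries that hold that value.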
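
As a rough illustration of the idea, here is a minimal inverted index over a single attribute. The class InvertedIndexSketch and its method names are hypothetical and not taken from any of the products below.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical minimal inverted index over one attribute of the cached objects.
class InvertedIndexSketch {
    // attribute value -> keys of the cache entries that carry this value
    private final Map<String, Set<Integer>> valueToKeys = new HashMap<>();

    // Registers the entry stored under entryKey as having the given attribute value.
    void add(int entryKey, String attributeValue) {
        valueToKeys.computeIfAbsent(attributeValue, v -> new HashSet<>()).add(entryKey);
    }

    // Returns the keys of all entries with this attribute value without
    // scanning the entries themselves; empty if the value is unknown.
    Set<Integer> lookup(String attributeValue) {
        return valueToKeys.getOrDefault(attributeValue, Collections.emptySet());
    }
}
```

After add(1, "red"), add(2, "blue") and add(3, "red"), the call lookup("red") returns the keys 1 and 3.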

1. Oracle Coherence

Indexes are represented by objects that implement the MapIndex interface.
Currently (Oracle Coherence 3.7), two index implementations are available.

The index is distributed: each cluster node indexes only the data that it stores. When a query is run against the entire cluster, each node computes its own part of the overall answer; these parts are then sent to the node that issued the query, where they are merged into a single response.
If a query needs several indexes at once, a result set is first built for each index, and these sets are then intersected to obtain the final result. The intersection is not instantaneous, so before issuing a query that touches several indexes, consider whether it will end up intersecting huge sets of keys.
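
The query flow described above (a per-node answer from local indexes, an intersection of key sets, then a merge on the calling node) can be sketched as follows. This is not Coherence code: DistributedQuerySketch, Node and queryCluster are hypothetical names, and the "cluster" is just a list of objects in one JVM.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DistributedQuerySketch {
    // One node: a local inverted index per attribute, covering only the entries stored on this node.
    static class Node {
        private final Map<String, Map<Object, Set<Integer>>> indexes = new HashMap<>();

        void index(int key, String attribute, Object value) {
            indexes.computeIfAbsent(attribute, a -> new HashMap<>())
                   .computeIfAbsent(value, v -> new HashSet<>())
                   .add(key);
        }

        // Builds a key set for each index involved, then intersects the sets.
        Set<Integer> query(Map<String, Object> criteria) {
            Set<Integer> result = null;
            for (Map.Entry<String, Object> criterion : criteria.entrySet()) {
                Map<Object, Set<Integer>> index = indexes.get(criterion.getKey());
                Set<Integer> keys = (index == null) ? null : index.get(criterion.getValue());
                if (keys == null) {
                    keys = Collections.emptySet();
                }
                if (result == null) {
                    result = new HashSet<>(keys);
                } else {
                    result.retainAll(keys); // the costly step when the sets are huge
                }
            }
            return (result == null) ? new HashSet<>() : result;
        }
    }

    // Runs on the node that issued the query: collects the partial answers and merges them.
    static Set<Integer> queryCluster(List<Node> nodes, Map<String, Object> criteria) {
        Set<Integer> merged = new TreeSet<>();
        for (Node node : nodes) {
            merged.addAll(node.query(criteria));
        }
        return merged;
    }
}
```

The retainAll call is where the cost mentioned above shows up: its running time depends on the size of the intermediate key sets, not on the size of the final answer.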

Pros:
Cons:

2. JBoss Infinispan

Here, cache search relies on Apache Lucene (an open-source full-text search engine) and Hibernate Search (which is itself built on Lucene).
This choice has significant drawbacks, but it also has advantages.
Pros:
Cons:

3. VMWare Gemfire

Two types of index are used when indexing data: Primary Key Index and Functional Index.

The difference between the two is that a Primary Key Index lets you test the value of an indexed attribute only for equality with a constant, whereas a Functional Index also lets you perform comparisons. For example, you can select the objects whose field satisfies someField > 10.
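
To illustrate the difference between an equality-only index and one that supports comparisons, here is a sketch that uses a hash map for the first and a sorted map for the second. It says nothing about how Gemfire implements its indexes internally; the class IndexKindsSketch and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration: two ways to index the same integer attribute (someField).
class IndexKindsSketch {
    // Hash-based index: answers only "someField = constant".
    private final Map<Integer, Set<String>> equalityIndex = new HashMap<>();
    // Sorted index: keeps values ordered, so it can also answer "someField > bound".
    private final NavigableMap<Integer, Set<String>> sortedIndex = new TreeMap<>();

    void add(String entryKey, int someField) {
        equalityIndex.computeIfAbsent(someField, v -> new TreeSet<>()).add(entryKey);
        sortedIndex.computeIfAbsent(someField, v -> new TreeSet<>()).add(entryKey);
    }

    // Equality lookup: a single hash lookup.
    Set<String> equalTo(int constant) {
        return equalityIndex.getOrDefault(constant, Collections.emptySet());
    }

    // Comparison lookup: walks only the part of the sorted index above the bound.
    Set<String> greaterThan(int bound) {
        Set<String> result = new TreeSet<>();
        for (Set<String> keys : sortedIndex.tailMap(bound, false).values()) {
            result.addAll(keys);
        }
        return result;
    }
}
```

With entries a=5, b=10, c=15 and d=15, equalTo(10) returns b, while greaterThan(10) returns c and d.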

An index can be updated synchronously (which guarantees consistency) or asynchronously (which favors update speed).

In general, the pros and cons are the same as those of Oracle Coherence.

4. Hazelcast

Hazelcast does not divide indexes into types, but its indexes work on the same principle as those of Oracle Coherence, so there is no point in describing them separately.

Locks


If your application allows several threads to write to the same object, a locking mechanism is usually used to ensure data integrity. This mechanism works reliably as long as you stay within a single machine. But what if your data is distributed across a cluster?

For this case, IMDG solutions provide distributed lock implementations.
A distributed lock is a lock that is available on every node of the cluster and has the same state on all of them. In other words, two threads on different nodes can never hold the same lock at the same time.
Distributed locking thus synchronizes access to data across the cluster.
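
The contract of such a lock (one owner at a time, whichever node asks) can be sketched as below. This is only a single-JVM stand-in: the shared map plays the role of lock state that a real IMDG keeps consistent across nodes, and the class DistributedLockSketch, its methods and the owner ids are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for cluster-wide lock state: lock name -> id of the current owner.
// In a real IMDG the grid itself keeps this state identical on all nodes.
class DistributedLockSketch {
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    // Atomically takes the lock if nobody holds it. Non-reentrant in this sketch:
    // a second attempt by the same owner also fails.
    boolean tryLock(String lockName, String ownerId) {
        return owners.putIfAbsent(lockName, ownerId) == null;
    }

    // Releases the lock only if the caller is the current owner.
    boolean unlock(String lockName, String ownerId) {
        return owners.remove(lockName, ownerId);
    }
}
```

If the owner "node1/thread-1" holds the lock "order-42", then tryLock("order-42", "node2/thread-7") returns false until the first owner calls unlock.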

Conclusion


In the next article I will try to present the results of comparing different IMDG and NoSQL solutions but, as you can imagine, this will take some time, so do not expect the article before mid-September. Everyone is welcome to join the discussion of the results :)

Source: https://habr.com/ru/post/126973/

