
About Oracle Coherence: Why is it needed?

This article answers that question and explains the basic concepts of distributed computing in Oracle Coherence. It is an introductory article whose main task is to explain the “dictionary” of terms used by Coherence developers. I will give the terms in English as well, to make it easier to search for further information for those who want to dig deeper in this direction.
If this topic interests you, read on.

So, suppose we are given the task of quickly performing some calculation over a large amount of data. What does a “large volume” of data mean? It is a volume that makes no sense to download to the client, because the client cannot fit all the necessary data on its side. The dilemma is how to get the result without downloading the entire data set to the client. A possible solution is to take subsets of the large data set and accumulate intermediate results on the client in a loop. Such a solution is fine except that sequential execution takes much longer than processing the entire set at once (time is spent on request/response round trips, on preparing each subset, and on sending the subsets to the client for processing). Also, while this sequential operation runs, the data may become outdated. Intuitively, we understand that the data must be processed where it lives (without sending it over the network) and, moreover, across the entire set simultaneously.
That's where solutions like Oracle Coherence, Hadoop, Gemfire, etc. come to the rescue.

Let's go through the basics of Oracle Coherence.
We read the documentation and see the following: “Oracle Coherence is an in-memory data grid solution that enables ...”.
“In memory” means that the data is kept in the computer’s memory (it can also be stored on disk, but more on that later).
“Data grid solution” means that the data is distributed across a cluster rather than concentrated in one place.

But first things first: let's understand what “building blocks” are available to accomplish these tasks.

Node

A node is simply a Java process (running the com.tangosol.net.DefaultCacheServer class) with coherence.jar and the configuration files on the classpath. You can run multiple nodes on the same or different machines, under the same or different users, without restrictions. It is important to understand that this is just a Java process, and it can and should be debugged in the same way as any server application that you write.
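To make this concrete, here is a minimal sketch of a node started from your own code rather than via DefaultCacheServer; the cache name "example-cache" is an assumption and must exist in your cache configuration:

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class MyNode {
        public static void main(String[] args) {
            // Start or join the cluster; coherence.jar and the configuration
            // files must be on the classpath, exactly as for DefaultCacheServer.
            CacheFactory.ensureCluster();
            NamedCache cache = CacheFactory.getCache("example-cache"); // hypothetical cache name
            cache.put("greeting", "hello");
            System.out.println(cache.get("greeting"));
            CacheFactory.shutdown(); // leave the cluster cleanly
        }
    }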

Cluster

A cluster is a collection of several nodes. Nodes in the default configuration find each other automatically over multicast. If necessary, you can configure WKA (well known addresses) instead, if the system administrators are unhappy that you “flooded the entire network with your multicast”. A cluster always has a master node (senior member) that watches over the state of the cluster (how many nodes there are, which of them store data, where to copy data if one of the nodes goes down, and so on). The master node is the node that started first. If the master node goes down for some reason, the next master node is assigned automatically. Note that the master node is not used during data processing: calculations are performed on the nodes where the required data lives. As a rule, nodes are divided by purpose: proxy nodes, computational nodes, and data storage nodes. If all the nodes go down, you have no data left. That is, you need to think in advance about how data and changes will be persisted, and how they will be loaded back when the system starts up.
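As a small illustration, any node can ask the cluster who the current senior member is; a hedged sketch using the standard Cluster API:

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.Cluster;
    import com.tangosol.net.Member;

    public class ClusterInfo {
        public static void main(String[] args) {
            Cluster cluster = CacheFactory.ensureCluster();
            Member senior = cluster.getOldestMember(); // the senior (master) member
            System.out.println("Senior member: " + senior);
            System.out.println("Cluster size: " + cluster.getMemberSet().size());
            CacheFactory.shutdown();
        }
    }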
During development, it is recommended to configure the development environment to be similar to production. This lets you find many serialization and inter-node communication errors before you release the production version.

Node configuration

By default, no configuration files are needed; in that case the files bundled inside coherence.jar are used. The default configuration files are not suitable for production systems and need to be changed for the specific task. Some even recommend deleting these files from the coherence.jar file.
There are 3 main configuration files with which you will have to work:
tangosol-coherence.xml - this file is responsible for the configuration of the cluster as a whole. For example, the cluster name is configured in this file.
coherence-cache-config.xml - this file is responsible for the configuration of the various caches that the cluster will serve.
coherence-pof-config.xml - this file defines which data types will be handled by the cluster and how the data will be serialized for transmission and storage in the cluster.

There are so-called override files (for example, tangosol-coherence-override.xml). The settings in such a file override the settings in the base file. That is, if you have both tangosol-coherence.xml and tangosol-coherence-override.xml on the classpath, all settings will be loaded from the first file and then overridden by the settings from the second.
You can have several identical files on the classpath, but only the first file found will be used. You can also point to the necessary configuration files using system properties (-D).
When the cluster starts, it logs which files were used to configure the system. Something similar to the following will appear in the logs:
Loaded operational configuration from resource ...
Loaded operational overrides from resource ...
Loaded operational overrides from resource ...
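For illustration, the same selection of configuration files can also be done programmatically; a minimal sketch, assuming the named files are available on the classpath or file system:

    import com.tangosol.net.CacheFactory;

    public class ConfiguredNode {
        public static void main(String[] args) {
            // Same effect as passing -Dtangosol.coherence.override=... and
            // -Dtangosol.coherence.cacheconfig=... on the command line;
            // must be set before the cluster starts.
            System.setProperty("tangosol.coherence.override", "tangosol-coherence-override.xml");
            System.setProperty("tangosol.coherence.cacheconfig", "coherence-cache-config.xml");
            CacheFactory.ensureCluster(); // the log will show which files were loaded
        }
    }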

Proxy (extend) nodes

Proxy (extend) nodes are nodes that provide clients with access to the cluster. Configuration is required both on the server side and on the client side. For example, if you have an application written on the .NET platform, you will have to install the .NET client (coherence-net-<version>.zip) and provide a coherence-cache-config.xml that describes how to connect to the cluster. On the server side, you will need to provide a coherence-cache-config.xml file containing a <proxy-scheme> section that specifies the address and port on which to listen for incoming connections. Both the client and the server must be given a coherence-pof-config.xml that describes the data formats for messages between the client and the server.
Proxy nodes should not be used for computing. If, while debugging an application, your debugger stops on a proxy node, it usually means that the cluster is configured incorrectly.
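For a Java extend client, the picture looks roughly like this (a sketch; the file and cache names are assumptions, and the client-side configuration is expected to contain a <remote-cache-scheme> pointing at the proxy's host and port):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class ExtendClient {
        public static void main(String[] args) {
            // Hypothetical client-side configuration with a <remote-cache-scheme>.
            System.setProperty("tangosol.coherence.cacheconfig", "client-cache-config.xml");
            NamedCache cache = CacheFactory.getCache("example-cache"); // routed through the proxy
            System.out.println(cache.get("greeting"));
        }
    }

Note that the client code itself does not change: the routing through the proxy lives entirely in the configuration.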

Data storage nodes (storage nodes)

These are nodes started with the -Dtangosol.coherence.distributed.localstorage=true system property. By default a node stores its data in the Java heap, but you can also “spill” data to disk and load it back as needed. You can run calculations on these nodes, but you must produce as little garbage as possible while computing, so that the node does not die from lack of memory (OutOfMemoryError). If a node goes down for any reason, the data it held is copied to the other nodes, so the total free capacity of the cluster decreases. This can cause a cascading effect if there is not enough free space in the cluster, and then the entire cluster can go down node by node. As a rule, important data has a second copy (specified in the configuration), so the loss of a single node is not critical.
Data that is an intermediate result and is easily recomputed from the primary data does not need a second copy. Storage can be configured so that copies live on another node, on another physical machine, or on another rack in another city. These are all configuration parameters and there is nothing to program here. The data storage options are quite flexible and allow you to tune the system to the specific task.

Computational nodes (application tier / storage disabled nodes)

These are nodes started with the -Dtangosol.coherence.distributed.localstorage=false system property. They are used to distribute calculations evenly across the cluster. You can also cache intermediate results on these nodes. All the business logic that you want to implement in the application should be executed in the cluster on these nodes.

Let's look at how a call is forwarded through the node hierarchy. You have data storage nodes, computational nodes, and proxy nodes. On the proxy nodes, no data is stored and no caches are configured. On the computational nodes, the caches are configured, but without the ability to store data in them. The data itself lives on the storage nodes. On the client side, you should not work with the caches where the data is stored: instead of performing calculations on the data directly from the client, always go through the computational nodes. This way you isolate the data from the client applications, which later lets you change the storage architecture without changing the client. All nodes in the cluster "know" where each cache is located, so if you submit a task to a cache configured for computing, it will be executed on the group of computational nodes using the data from the storage nodes. This may not sound entirely clear, but it is a separate topic for an article.

Data locality (data affinity)

In some cases it is useful to have data grouped together according to some principle. For example, you can group the data in such a way that nodes located on the same physical machine hold related data. In this case you avoid network latency and the calculations run faster.
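One common way to express such grouping is key association. A hedged sketch with a hypothetical composite key, where every order is stored in the same partition as its customer:

    import java.io.Serializable;
    import com.tangosol.net.cache.KeyAssociation;

    public class OrderKey implements KeyAssociation, Serializable {
        private final long orderId;
        private final long customerId;

        public OrderKey(long orderId, long customerId) {
            this.orderId = orderId;
            this.customerId = customerId;
        }

        @Override
        public Object getAssociatedKey() {
            return customerId; // co-locate each order with its customer's data
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof OrderKey)) return false;
            OrderKey k = (OrderKey) o;
            return orderId == k.orderId && customerId == k.customerId;
        }

        @Override
        public int hashCode() {
            return (int) (orderId ^ (orderId >>> 32)) * 31
                    + (int) (customerId ^ (customerId >>> 32));
        }
    }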

The mechanisms for submitting tasks for execution are as follows: EntryAggregator, EntryProcessor, InvocationService, MapListener, EventInterceptor

An aggregator (EntryAggregator) is a task that is performed on copies of the data. That is, you cannot change the data in the cache from an aggregator: the work happens on read-only data. Typical tasks are computing a sum, a minimum, or a maximum.
A processor (EntryProcessor) is a task that changes data inside the cache. That is, if you want to change the data inside the cache, on the node where the data physically lives, you need a processor. A nice feature of the processor is that it locks the data being processed. That is, if you have several operations that must be invoked sequentially, you should use a processor, since only one processor will work on a given piece of data at any point in time.
InvocationService is a node-level task. In this case you work, roughly speaking, with Java processes rather than with data. Tasks of this type must implement Invocable, which in turn extends Runnable.
MapListener - this task is executed asynchronously, as a reaction to events at the cache level.
EventInterceptor - this task is similar to the previous one in the sense that it is executed as a reaction to an event, but with a difference: a listener is executed on all nodes on which the cache is configured, while an interceptor runs only on the nodes holding the data for which the event fired. An interceptor can also be configured to run before or after the event.
A detailed explanation of how different types of tasks work is beyond the scope of this article.
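Still, a minimal sketch of the two most frequently used mechanisms may help make them concrete; the cache names and the getAmount() accessor are assumptions:

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.aggregator.LongSum;
    import com.tangosol.util.extractor.ReflectionExtractor;
    import com.tangosol.util.filter.AlwaysFilter;
    import com.tangosol.util.processor.AbstractProcessor;

    public class TaskExamples {
        // An EntryProcessor that increments an Integer value in place,
        // on the node where the entry physically lives.
        public static class IncrementProcessor extends AbstractProcessor {
            @Override
            public Object process(InvocableMap.Entry entry) {
                int value = entry.isPresent() ? (Integer) entry.getValue() : 0;
                entry.setValue(value + 1);
                return value + 1;
            }
        }

        public static void main(String[] args) {
            NamedCache orders = CacheFactory.getCache("orders");     // hypothetical caches
            NamedCache counters = CacheFactory.getCache("counters");

            // Aggregator: read-only sum of the getAmount() property over all entries.
            Long total = (Long) orders.aggregate(AlwaysFilter.INSTANCE,
                    new LongSum(new ReflectionExtractor("getAmount")));
            System.out.println("Total: " + total);

            // Processor: change a value where it is stored, under an entry lock.
            counters.invoke("pageViews", new IncrementProcessor());
        }
    }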

POF (portable object format) serialization

All data in the cluster is stored in byte arrays. The fields of a serialized object are stored sequentially (each field has its own index), exactly as you write them out in the readExternal/writeExternal methods of a class that implements the PortableObject interface, or in the serialize/deserialize methods of a class that implements the PofSerializer interface. The byte array is also scanned sequentially. This leads to a non-obvious conclusion: the most frequently used fields should be placed closer to the beginning of the byte array. Object fields written to the array can themselves be nested objects with their own serialization. That is, by implementing the PortableObject and PofSerializer interfaces, you translate the hierarchical structure of a Java object into a flat byte-array structure.
Coherence provides serialization for the standard objects from the JDK (java.lang). All objects that are to be stored in the cluster must be described in the coherence-pof-config.xml file. Each data type gets its own number; the numbers for your own data types must start at 1000. As a result, you get a structure that ports well from one platform to another and from one programming language to another. Every class that will be serialized in the cluster must correctly implement the hashCode and equals methods.
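A hedged sketch of a hypothetical POF class (the type and its field indices are assumptions; the type must also be registered with an id of 1000 or higher in coherence-pof-config.xml):

    import java.io.IOException;
    import com.tangosol.io.pof.PofReader;
    import com.tangosol.io.pof.PofWriter;
    import com.tangosol.io.pof.PortableObject;

    public class Person implements PortableObject {
        private String name; // index 0: used most often, so it comes first
        private int age;     // index 1

        public Person() {} // deserialization needs a no-arg constructor

        @Override
        public void readExternal(PofReader in) throws IOException {
            name = in.readString(0);
            age = in.readInt(1);
        }

        @Override
        public void writeExternal(PofWriter out) throws IOException {
            out.writeString(0, name);
            out.writeInt(1, age);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Person)) return false;
            Person p = (Person) o;
            return age == p.age && (name == null ? p.name == null : name.equals(p.name));
        }

        @Override
        public int hashCode() {
            return 31 * (name == null ? 0 : name.hashCode()) + age;
        }
    }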

Extracting data from a cluster (ValueExtractor)

From the previous paragraph we know that the data is stored in byte arrays. To extract part of the data, you write a class that implements the ValueExtractor interface. Coherence uses this class to get the necessary part of the serialized object and present it as a class you can work with. That is, you can “pull out” of the data not the entire object, but only the part you currently need for the calculation. This reduces the amount of data sent over the network.
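For example, a sketch that filters on a single POF field of the hypothetical Person class above, without deserializing whole objects (the cache name is an assumption):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.Filter;
    import com.tangosol.util.ValueExtractor;
    import com.tangosol.util.extractor.PofExtractor;
    import com.tangosol.util.filter.EqualsFilter;

    public class ExtractorExample {
        public static void main(String[] args) {
            NamedCache people = CacheFactory.getCache("people"); // hypothetical cache of Person objects
            // Read only POF field 0 (the name) from the serialized bytes.
            ValueExtractor nameExtractor = new PofExtractor(String.class, 0);
            Filter filter = new EqualsFilter(nameExtractor, "Alice");
            System.out.println("Matching keys: " + people.keySet(filter)); // only the keys travel back
        }
    }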

Partition

Coherence stores data in key-value form, but the key and the value are logical concepts of the system. At the physical level, the data is grouped into partitions. That is, several keys and values may belong to the same partition. A partition is the unit of data storage: when nodes go down and data is redistributed between nodes, partitions are copied whole. The class that assigns a partition to a specific object implements the KeyPartitioningStrategy interface. By default, the partition is assigned from the hash code of the Binary form of the key (a com.tangosol.util.Binary object "wraps" a byte array). You can influence how partitions are assigned by providing your own implementation of the KeyPartitioningStrategy interface.
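A hedged sketch of a custom strategy, assuming the three-method form of the interface (for illustration only; this is roughly what the default hash-based assignment already does):

    import com.tangosol.net.PartitionedService;
    import com.tangosol.net.partition.KeyPartitioningStrategy;
    import com.tangosol.net.partition.PartitionSet;

    public class ModuloPartitioningStrategy implements KeyPartitioningStrategy {
        private PartitionedService service;

        @Override
        public void init(PartitionedService service) {
            this.service = service;
        }

        @Override
        public int getKeyPartition(Object key) {
            // Mask the sign bit to keep the result non-negative.
            return (key.hashCode() & 0x7fffffff) % service.getPartitionCount();
        }

        @Override
        public PartitionSet getAssociatedPartitions(Object key) {
            PartitionSet parts = new PartitionSet(service.getPartitionCount());
            parts.add(getKeyPartition(key));
            return parts;
        }
    }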

Index

As in a database, an index in Coherence is used to speed up data retrieval. To create an index, use the QueryMap.addIndex(ValueExtractor extractor, boolean ordered, java.util.Comparator comparator) method.
The ValueExtractor is used to select from the byte array the data needed for the index. Calling the addIndex method does not mean that the cluster will start indexing the data right away; the call is a recommendation to create the index when resources allow. Once the index is created, changes to the data are correctly reflected in it. The method can be called several times, and if the index already exists it will not be re-created. The index is a node-level structure: when data is copied from one node to another, the index is not copied; instead, it is updated according to the data present on that node. The data in an index is stored in deserialized form, so if you need to get data quickly and without deserialization, create an index. Naturally, you have to pay for convenience and speed, and you pay with free space in the cluster and with computing resources. Internally, an index consists of two sub-indexes (forward and inverse). The forward index stores the data as key -> value (as provided by the extractor); the inverse index stores the data as value -> set of keys.
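A short usage sketch (the cache and the getAge() accessor are assumptions):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.extractor.ReflectionExtractor;

    public class IndexExample {
        public static void main(String[] args) {
            NamedCache people = CacheFactory.getCache("people"); // hypothetical cache
            // A recommendation to index the getAge() property, in natural order;
            // calling this twice is safe, an existing index is not rebuilt.
            people.addIndex(new ReflectionExtractor("getAge"), true, null);
        }
    }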

Caches: replicated, distributed, near

Replicated is a cache in which all data is stored in deserialized form on each of the nodes. It provides the fastest read operations but slow writes, because on a write the data must be copied to every node on which the cache is configured. This type of cache is usually used for rarely changing, small data sets.
Distributed is the main cache type used for storing data. It overcomes the limit on the amount of RAM available to a single node by "smearing" the data across the entire cluster. It also provides horizontal scalability, by joining new nodes to the cluster, and fault tolerance, by keeping copies of the data on other nodes.
Near is a hybrid cache type that is configured on the calling side (the caller can be either a client or another node in the cluster). As a rule, this cache sits in front of a distributed cache and stores the most frequently used data, in deserialized form. With a near cache there is a chance that the data becomes stale, so you need to configure how the data will be invalidated or refreshed.

Service

A service is a group of Java threads responsible for communicating with the other nodes of the cluster, executing the tasks submitted against the stored data, copying data when other nodes fail, and other data-maintenance work. It sounds rather abstract, but this is exactly what lets you work with the data so easily. A service can serve several caches. The one thing that is important to know and remember during development is that a service does not support re-entrant calls. For example, if you send an EntryProcessor for execution and from inside it call a cache served by the same service, you will immediately receive an InterruptedException.
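An illustrative anti-pattern, sketched with a hypothetical cache name (do not write code like this):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.processor.AbstractProcessor;

    public class ReentrantProcessor extends AbstractProcessor {
        @Override
        public Object process(InvocableMap.Entry entry) {
            // Re-entrant call into a cache served by the same service:
            // exactly what the service does not support; fails at runtime.
            NamedCache other = CacheFactory.getCache("cache-on-the-same-service"); // hypothetical name
            return other.get("someKey");
        }
    }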

Now that we have covered the basic building blocks and concepts, we can answer the question of why Coherence is needed at all.
The answer is simple: you get a fault-tolerant, horizontally scalable data store that provides fast access to data for parallel computing. Thanks to having several nodes, there is no limit on the amount of data you can keep in the cluster (you are, of course, limited by the amount of memory available on the physical machines allocated to the task). There is no limit on the size of a single key-value pair. You can also extract from the stored data only what you need for a calculation, so that a minimum of information is copied over the network; the whole ideology of Coherence is built on sending only what is necessary. You can also configure the services and computation tiers quite flexibly for your task. As a result, tasks get done quickly.
From a management point of view, you are buying a solution that satisfies many requirements at once. Once the data is loaded into the system, it can be retrieved in various ways and used by other systems that use Coherence as a data store. Thus, with Coherence at the base, you can build an ecosystem for extracting and processing your data.

If this topic interests you, I can continue the series of articles on Coherence. Write what exactly you are interested in, and I will try to answer.

In general, it should be noted that it is very easy to get started with Coherence, but very difficult to do things well and fast; hence the goal of this series of articles is to fill the gap between initial familiarity with the system and the level of an advanced developer.

Recently a colleague of mine, David Whitmarsh, wrote a book for advanced developers, which I recommend reading. You will not find basic knowledge in it, but you will find examples of solutions (with explanations) to quite complex problems.

Source: https://habr.com/ru/post/245057/

