I read a very interesting paper, "Memory Management Strategies for Erlang VM," written as a dissertation by Jesper Wilhelmsson. I thought it would be a good idea to discuss the differences between memory management in Erlang and in Oracle's Java VM.
A very brief introduction for those who have never heard of Erlang: it is a functional language that uses asynchronous message passing as its basis for concurrency. Messages are passed with copy semantics, which makes distributing a system across multiple Erlang VMs running on more than one machine essentially transparent to the programmer.
Erlang and Java are similar in that both use a virtual machine to abstract away the hardware. Both languages use machine-independent bytecode. Both are garbage-collected runtime systems that free programmers from manual memory management.
The overhead of an Erlang process (Erlang's lightweight equivalent of a thread) is very low. As far as I remember, an Erlang process requires about 512 bytes. A Java thread typically needs about 512 kilobytes, which is about 1000 times more. A system with many threads working asynchronously has to be designed that way by the programmer. Typical Erlang systems run thousands or tens of thousands of processes. And there is no fiddling with thread pools and executors, as we do in Java.
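To get a feel for what that difference means at scale, here is a back-of-the-envelope calculation in Java. It is only a sketch: both per-unit sizes are the rough figures quoted above, not measurements, and the class and method names are made up for this illustration.

```java
// Rough footprint estimate using the approximate per-unit figures quoted above.
// Both constants are ballpark numbers from this post, not measured values.
final class FootprintEstimate {
    static final long ERLANG_PROCESS_BYTES = 512L;        // ~512 bytes (assumed)
    static final long JAVA_THREAD_BYTES = 512L * 1024L;   // ~512 KB (assumed)

    // Total memory for `units` processes or threads of the given size.
    static long totalBytes(long units, long bytesPerUnit) {
        return Math.multiplyExact(units, bytesPerUnit);
    }

    public static void main(String[] args) {
        long n = 10_000;
        long erlangKb = totalBytes(n, ERLANG_PROCESS_BYTES) / 1024;
        long javaMb = totalBytes(n, JAVA_THREAD_BYTES) / (1024 * 1024);
        System.out.printf("%d Erlang processes: about %d KB%n", n, erlangKb);
        System.out.printf("%d Java threads:     about %d MB%n", n, javaMb);
    }
}
```

With these assumed sizes, ten thousand Erlang processes fit in roughly 5 MB, while ten thousand Java threads would need roughly 5 GB.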
Having played with Erlang a little, I found it to be a rather pleasant compromise between a functional language and a language that lets you write real applications. (I know I will face sharp criticism for that phrase.) Reliable distributed error handling is a pleasant surprise, and writing a network server of any kind is actually very easy. The finite-state-machine approach to web servers makes rollback on errors quite natural.
But this post is not about Erlang's programming model. It is about how the Erlang VM manages memory.
The Java virtual machine uses what an Erlang programmer would call a shared heap topology. There is one big heap that is used by all threads, and most memory is allocated in it. In addition to the heap, the JVM uses some specialized data areas, such as the code cache and the permanent generation. These are also shared between all threads.
In contrast, Erlang uses a private heap topology. Each process has its own tiny heap that contains all the data used by the process, as well as its stack. All of a process's data lives in its local heap. The heap is reserved when the process is created. When the process terminates, the entire heap is simply returned to the free memory pool.
In addition to the private heaps, a so-called binary heap and a message heap are available to all processes. These are specialized heaps. The binary heap is used to allocate large chunks of arbitrary data that are likely to be shared between processes, for example input files or network buffers.
The message heap holds the data used in messages. Messages are also shared between processes. A message is passed between processes by copying a pointer from the sending process to the receiving process, while the message data itself stays in the message heap.
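The layout described above can be modeled with a small toy in Java. This is purely an illustration of the idea, not how the Erlang VM is implemented: `ToyProcess` and all of its methods are hypothetical names, and Java objects of course live on the JVM's single shared heap anyway. The point is only to show which data is private, that a send hands over a reference rather than a copy, and that terminating a process drops its whole heap in one step.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of a process with a private heap and a mailbox of references.
final class ToyProcess {
    // Private heap: data reachable only from this process.
    private List<Object> privateHeap = new ArrayList<>();
    // Mailbox: references to message data that lives in a shared message area.
    private final Queue<Object> mailbox = new ArrayDeque<>();

    void allocate(Object term) {
        privateHeap.add(term);
    }

    // Sending hands over a pointer; the message data itself is not copied.
    void send(ToyProcess receiver, Object message) {
        receiver.mailbox.add(message);
    }

    // Returns the next message, or null if the mailbox is empty.
    Object receive() {
        return mailbox.poll();
    }

    int heapSize() {
        return privateHeap.size();
    }

    // On termination the whole private heap is released at once.
    void exit() {
        privateHeap = new ArrayList<>();
        mailbox.clear();
    }
}
```

In this model, a receiver that calls `receive()` gets back the very same object the sender passed to `send(...)`, which is what "copying a pointer" amounts to.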
I was impressed by the Erlang memory model. It seems to me much more scalable than Java's single-heap model. The semantics of the language and Erlang's memory model fit together perfectly.
For example, the simple fact that heaps are private rules out deadlocks on them and, therefore, the need to check for them.
The latest version of the Erlang VM takes another step forward: it can run more than one scheduler, one per physical processor to be exact. This also eliminates checks for a whole class of locks. When a scheduler is idle, it can take over some processes from another scheduler.
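The idea of an idle scheduler taking work from a busy one can be sketched in a few lines of Java. This is a single-threaded toy under my own assumptions, not the Erlang VM's actual migration logic: `ToyScheduler`, its run queue of process names, and the "take from the busiest peer" policy are all invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Toy scheduler with its own run queue; an idle scheduler migrates work from a peer.
final class ToyScheduler {
    private final Deque<String> runQueue = new ArrayDeque<>();

    void enqueue(String process) {
        runQueue.addLast(process);
    }

    int load() {
        return runQueue.size();
    }

    // Returns the next runnable process, or null if no scheduler has any work.
    String next(List<ToyScheduler> all) {
        if (runQueue.isEmpty()) {
            // Idle: find the busiest other scheduler (hypothetical policy).
            ToyScheduler busiest = null;
            for (ToyScheduler s : all) {
                if (s != this && (busiest == null || s.load() > busiest.load())) {
                    busiest = s;
                }
            }
            // Move one process from the tail of its queue to ours.
            if (busiest != null && busiest.load() > 0) {
                runQueue.addLast(busiest.runQueue.pollLast());
            }
        }
        return runQueue.pollFirst();
    }
}
```

A real implementation would have to synchronize access to the run queues; the sketch leaves that out to keep the migration step visible.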
Java has a lot to learn from Erlang. However, there are a few good things in Java that I miss when working with large Erlang systems.
The Erlang VM reallocates a heap when a process accumulates a large amount of data. However, the reallocation algorithm makes heap sizes grow quickly. Under high load, I have seen the Erlang VM eat up 16 GB of RAM in a matter of minutes. Each release must be thoroughly load tested to make sure its memory requirements are adequate.
So far, the Erlang VM has no mechanism for curbing memory growth. The virtual machine happily allocates so much memory that the system starts swapping and exhausts all virtual memory. This can make the machine hang, even when accessed via the KVM console. In the past, we had to reboot machines to regain access to them.
The queue-based Erlang programming model is a pleasure to write code in, but it is also the Achilles' heel in production. Every queue in Erlang is unbounded. The virtual machine will not throw an exception or limit the number of messages in a queue. Sometimes a process stops processing messages because of a bug, or it simply cannot keep up with the stream of messages sent to it. In that case, Erlang lets the message queue of that process grow until the VM is killed or the machine locks up, whichever happens first.
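For contrast, here is what a bounded mailbox can look like on the Java side. It is a minimal sketch built on the standard `java.util.concurrent.ArrayBlockingQueue`; the `BoundedMailbox` wrapper, its method names, and the rejection counter are my own additions for illustration. When the queue is full, `offer` returns `false` right away, so the sender finds out that the consumer is falling behind instead of memory growing silently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// A mailbox with a hard capacity: a full queue rejects messages instead of growing.
final class BoundedMailbox<T> {
    private final BlockingQueue<T> queue;
    private final AtomicLong rejected = new AtomicLong();

    BoundedMailbox(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking send: returns false if the mailbox is full.
    boolean trySend(T message) {
        boolean accepted = queue.offer(message);
        if (!accepted) {
            rejected.incrementAndGet();
        }
        return accepted;
    }

    // Returns the oldest message, or null if the mailbox is empty.
    T poll() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }

    long rejectedCount() {
        return rejected.get();
    }
}
```

What to do with a rejected message (drop it, retry later, or push back on the producer) is a design decision the sketch deliberately leaves to the caller.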
This means that when you run a large Erlang VM in production, you need a check at the operating-system level that kills the process if it uses too much memory.
In summary, I believe that Erlang's private heap memory model can be a very powerful tool. It eliminates whole classes of locking at runtime, which means it will scale better than Java. On the other hand, Java's hard memory limits pay off when your system is under heavy load or under a DDoS attack.
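The hard limit on the Java side is easy to inspect from inside a running program. The sketch below uses the standard `Runtime` methods `maxMemory()`, `totalMemory()`, and `freeMemory()`; the `HeapLimit` class and its helper names are made up for this example. The ceiling it reports is the one set with the JVM's `-Xmx` option (or the JVM's default when the option is absent), and an allocation that cannot be satisfied within it fails with an `OutOfMemoryError` rather than pushing the machine into swap.

```java
// Reports the heap ceiling of the running JVM and how much of it is in use.
final class HeapLimit {
    // Maximum heap the JVM will attempt to use, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently occupied: reserved heap minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.printf("heap limit: %d MB, in use: %d MB%n",
                maxHeapBytes() / mb, usedHeapBytes() / mb);
    }
}
```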
One last thing: there are command-line options for the Erlang VM that let you switch from the private heap topology to the shared heap topology.
I like both Erlang and Java. They are difficult to compare, because from a developer's point of view they have too little in common. In general, I would use Java for most systems. It has better tool support, and the number of available libraries is staggering. I choose Erlang when I need a stream-oriented messaging system. That is where the Erlang programming model really shines.
Byron
The comments on the original post are interesting (some of them dispute its claims) and well worth reading.