Matt Williams recently published an informative article on Docker and its memory limitations. The author raises an interesting point about a hidden problem with memory limits that users may run into when working with containers.
The large number of reposts and likes shows that the topic resonates with Java developers.

Therefore, I would like to analyze this problem in more detail and determine possible ways to solve it.
Problem
Matt describes his late-night “journey” with the JVM's default memory behavior in a Docker container. He found that RAM limits are reported incorrectly inside the container. As a result, a Java application (or any other) sees the total RAM of the entire host machine, and the JVM cannot tell how much memory has actually been allotted to the container it runs in. This leads to an OutOfMemoryError, because the JVM sizes its heap incorrectly inside the container.
Fabio Kung of Heroku described the main causes of this problem in detail in his recent article "Memory inside Linux containers. Or why don't free and top work in a Linux container?"
Most Linux tools that report system resource metrics were created before cgroups existed (for example, free and top, both from procps). They usually read memory metrics from the proc filesystem: /proc/meminfo, /proc/vmstat, /proc/PID/smaps, and others.
Unfortunately, /proc/meminfo, /proc/vmstat, and the like are not containerized, which means they are not cgroup-aware. They always report the memory of the host system (physical or virtual machine) as a whole, which is useless for modern Linux containers (Heroku, Docker, etc.). Processes inside a container that need to determine how much memory is available to them cannot rely on free, top, and similar tools; they are subject to the limits imposed by their cgroups and cannot use all the memory available on the host system.
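To make the mismatch concrete, here is a minimal Java sketch that prints the host-wide MemTotal from /proc/meminfo next to the cgroup memory limit. The class name is hypothetical, and the two cgroup file locations are assumptions about common cgroup v1 and v2 mount points rather than anything taken from the articles discussed here; on a given system they may be mounted elsewhere or be absent.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper: compares what /proc/meminfo reports with the cgroup limit.
final class ContainerMemoryView {

    // Extracts the MemTotal value (in kB) from the lines of /proc/meminfo.
    static OptionalLong parseMemTotalKb(List<String> meminfoLines) {
        for (String line : meminfoLines) {
            if (line.startsWith("MemTotal:")) {
                // Typical line: "MemTotal:       16318480 kB"
                String[] parts = line.trim().split("\\s+");
                return OptionalLong.of(Long.parseLong(parts[1]));
            }
        }
        return OptionalLong.empty();
    }

    // Parses the content of a cgroup limit file; "max" (cgroup v2) means "no limit".
    static OptionalLong parseCgroupLimitBytes(String content) {
        String value = content.trim();
        if (value.isEmpty() || value.equals("max")) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(value));
    }

    // Reads the first readable limit file. Both paths are assumed default locations.
    static OptionalLong readCgroupLimitBytes() throws IOException {
        String[] candidates = {
            "/sys/fs/cgroup/memory/memory.limit_in_bytes", // assumed cgroup v1 location
            "/sys/fs/cgroup/memory.max"                    // assumed cgroup v2 location
        };
        for (String candidate : candidates) {
            Path path = Paths.get(candidate);
            if (Files.isReadable(path)) {
                byte[] raw = Files.readAllBytes(path);
                return parseCgroupLimitBytes(new String(raw, StandardCharsets.US_ASCII));
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        Path meminfo = Paths.get("/proc/meminfo");
        if (!Files.isReadable(meminfo)) {
            System.out.println("/proc/meminfo is not available on this system");
            return;
        }
        System.out.println("MemTotal from /proc/meminfo (kB): "
                + parseMemTotalKb(Files.readAllLines(meminfo)));
        System.out.println("cgroup memory limit (bytes): " + readCgroupLimitBytes());
    }
}
```

Run inside a memory-limited container without a virtualized /proc, the first number would be expected to reflect the host and the second the container limit, which is exactly the discrepancy described above.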
The author emphasizes how important it is for the real memory limits to be visible. This makes it possible to optimize application performance and troubleshoot problems inside containers: memory leaks, swapping, performance degradation, etc. In addition, some use cases rely on vertical scaling to optimize resource usage inside containers by automatically changing the number of workers, processes, or threads. Vertical scaling usually depends on the amount of memory available in a particular container, so the limits must be visible inside the container.
Solution
The Open Containers community has started work on improving runC so that it overrides the /proc files. LXC is also creating the lxcfs file system, which lets containers have virtualized cgroup filesystems and a virtualized view of the /proc files. So the issue is getting close attention from container maintainers. I believe the improvements mentioned can help solve this problem at a basic level.
We ran into the same problem at Jelastic and have already found a way to solve it for our users, so we would like to share the implementation details.
First, let's go to the Jelastic installation wizard, select a service provider for a test account, and create a Java Docker container with predefined memory limits, for example 8 cloudlets, which is equivalent to 1 GB of RAM.

Go to the Jelastic SSH gate (1), select the previously created test environment (2), and select the container (3). From the inside, you can check the available memory with the free tool (4).

As we can see, the memory limit is 1 GB, as previously defined. Now let's check the top tool.

Everything works properly. To double-check, we will repeat the test from Matt’s article that exposes Java's heap-sizing heuristic.

As expected, we get MaxHeapSize = 268435456 (~256 MB), which is 1/4 of the container's RAM, in line with the JVM's default heap-sizing behavior.
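The arithmetic behind that figure can be sketched in a few lines of Java. The class and method names are hypothetical, and the method encodes only the one-quarter rule stated above; the real JVM ergonomics have more cases than this. The value printed by Runtime.maxMemory() is included for comparison only and can differ slightly from MaxHeapSize depending on the garbage collector in use.

```java
// Hypothetical helper illustrating the "1/4 of visible RAM" heuristic described above.
final class DefaultHeapEstimate {

    // Simplified rule from the text: default max heap = visible RAM / 4.
    static long estimateDefaultMaxHeapBytes(long visibleRamBytes) {
        return visibleRamBytes / 4;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L;

        // For a 1 GiB container this yields 268435456 bytes (256 MiB).
        System.out.println("Estimated max heap for 1 GiB: "
                + estimateDefaultMaxHeapBytes(oneGiB));

        // What the running JVM actually allows; depends on the RAM it can see.
        System.out.println("Runtime.maxMemory(): " + Runtime.getRuntime().maxMemory());
    }
}
```

If the JVM saw the host's RAM instead of the container's 1 GB, the same rule would produce a heap far larger than the container is allowed to use, which is the failure mode Matt describes.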
What is the secret of our solution? The right combination of "ingredients", of course. In our case, it is a combination of OpenVZ and Docker technologies, which gives more control over security and isolation, as well as the ability to use features such as live migration and container hibernation. Below is a high-level diagram of a Docker container in Jelastic.

In OpenVZ, each container has a virtualized view of the /proc pseudo-filesystem. In particular, /proc/meminfo inside a container is a “special” version that shows information for that container rather than for the host. Therefore, when tools such as top and free run inside a container, they show RAM and swap usage with the limits specific to that particular container.
It is worth noting that swap inside containers is not real but virtual (hence the name of the technology, VSwap). The basic idea is that when a container with VSwap enabled exceeds its configured RAM limit, some of its memory goes into the so-called swap cache. No real swap-out takes place, which means no I/O is needed, unless of course there is a global RAM shortage. In addition, a container that uses VSwap and has exceeded its RAM limit is “penalized” by being slowed down; from the inside, it looks as if real swapping is taking place. This technology provides control over a container's memory and swap usage.
Such an implementation lets you run Java and other runtimes without having to adapt applications for Jelastic PaaS. But if you do not use Jelastic, a possible workaround is to specify the heap size for the Java virtual machine explicitly and not depend on the heuristic (as Matt advises). For other languages, more research is needed. Please contact us if you can share your experience in this area, and we will be happy to extend this article.
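As an illustration of that workaround, the following Java sketch derives an explicit -Xmx option from a known container memory limit. The class, the method, and the 50% share used in main are hypothetical choices made for the example, not a recommendation from Matt's article or from Jelastic; the right share depends on how much non-heap memory the application needs.

```java
// Hypothetical helper: turns a container memory limit into an explicit -Xmx option.
final class ExplicitHeapFlag {

    // Gives the heap a fixed percentage of the container limit, rounded down to whole MiB.
    static String xmxFlag(long containerLimitBytes, int heapPercent) {
        if (containerLimitBytes <= 0 || heapPercent <= 0 || heapPercent > 100) {
            throw new IllegalArgumentException(
                    "limit must be positive and heapPercent must be in 1..100");
        }
        long limitMiB = containerLimitBytes / (1024L * 1024L);
        long heapMiB = Math.max(1, limitMiB * heapPercent / 100);
        return "-Xmx" + heapMiB + "m";
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L;
        // Assumed example: a 1 GiB container with half of it reserved for the heap.
        System.out.println(xmxFlag(oneGiB, 50)); // prints -Xmx512m
    }
}
```

The resulting option is passed on the command line when starting the JVM, for example `java -Xmx512m -jar app.jar`, where app.jar stands in for your own application.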