Ladies and gentlemen, I want to share a notable way of shooting yourself in the foot, one that cost me a limb even though I considered myself an expert in the concurrency library. Something as simple as ThreadLocal let me down, unexpectedly swallowing a couple of extra gigabytes of server memory.
Your servers' memory can surely be put to better use than storing garbage, so do not repeat my mistake. Namely: do not store in a ThreadLocal a reference to that same ThreadLocal, or to any object graph that eventually refers back to it.
First, here is a piece of code:
class X {
    ThreadLocal<Anchor> local = new ThreadLocal<Anchor>();

    class Anchor {
        // 1 MB payload so that the leak shows up quickly
        byte[] data = new byte[1024 * 1024];
    }

    // Lazily creates the Anchor for the current thread
    public Anchor getOrCreate() {
        Anchor res = local.get();
        if (res == null) {
            res = new Anchor();
            local.set(res);
        }
        return res;
    }

    // The new X instance is not referenced by anything once this method returns
    public static void doLeakOneMoreInstance() {
        new X().getOrCreate();
    }

    public static void main(String[] args) throws Exception {
        while (true) {
            doLeakOneMoreInstance();
            System.out.println(Runtime.getRuntime().freeMemory() / 1024 / 1024 + " MB of heap left");
        }
    }
}
Each call to doLeakOneMoreInstance creates a new instance of X and calls a method that sets the ThreadLocal value, after which the reference to X is irretrievably lost. The reference to the ThreadLocal instance, created when X is constructed, never escapes X. It would seem that after this there are no external references to the created object graph, nor can there be, so the GC can safely collect it.
But no such luck. Run this code with a small heap size limit and the JVM will crash, leaving behind only the message “java.lang.OutOfMemoryError: Java heap space” crowning the stack trace (though the class is so voracious that even a couple of gigabytes will last it only a couple of milliseconds).
Before reading further, try answering this question as a self-test:
how can you get rid of the OOM by adding just one keyword to the fragment above?
Of course, in such a synthetic example it is easy to guess that ThreadLocal is to blame (there is nothing else remarkable in it). But if this happens in a large project with millions of X instances, alive and dead, the problem is not so easy to pin down. The solution may be obvious to some, but personally it cost me more than one hour of my life.
What is the problem?
(Everything described below applies to Oracle's JVM implementation. Other implementations may be affected as well.)
To answer this question, we need to dig into the internals of ThreadLocal. The data of ThreadLocal variables is not stored in the ThreadLocal objects themselves but directly in Thread objects. Each Thread has its own dictionary with “weak” keys (analogous to WeakHashMap), in which ThreadLocal instances act as keys. When you ask a ThreadLocal variable for its value, it actually gets the current thread, extracts the dictionary from it, and looks up the value in that dictionary using itself as the key.
If no references to a ThreadLocal remain, the weak reference used as its key in the dictionary is cleared, and entries whose keys have been collected by the GC are cleaned up when new items are inserted.
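The per-thread storage is easy to observe from the outside. The following is a minimal sketch (PerThreadDemo and its members are names made up for illustration): a value set by one thread is invisible to another thread reading the same ThreadLocal, because each thread looks the key up in its own dictionary.

```java
import java.util.concurrent.atomic.AtomicReference;

class PerThreadDemo {
    static final ThreadLocal<String> LOCAL = new ThreadLocal<>();

    // Sets a value in the calling thread, then reports what a second thread
    // reads from the very same ThreadLocal instance.
    static String valueSeenByOtherThread() {
        LOCAL.set("set by caller");
        AtomicReference<String> seen = new AtomicReference<>("not read yet");
        Thread other = new Thread(() -> seen.set(LOCAL.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```

The second thread gets null (the default initial value), while the calling thread still sees its own string.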
Herein lies the problem: the dictionary inside the thread holds weak references to the keys, but strong references to the values! If the ThreadLocal is somehow reachable from its own value (an object of type Anchor in the example), the GC cannot collect the ThreadLocal, and it hangs around as dead weight until the end of time, or rather for as long as the owning thread is alive. In the example, Anchor is a non-static inner class, so it implicitly holds a reference to the enclosing X object, which in turn refers to the ThreadLocal.
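To make the cycle visible, here is a sketch that spells out by hand the hidden reference that a non-static inner class can carry. The names Owner, Holder and keyReachableFromValue are illustrative and not part of the example above; the explicit owner field plays the role of the implicit reference from Anchor to X.

```java
class Owner {
    final ThreadLocal<Holder> local = new ThreadLocal<>();

    static class Holder {
        // Written out explicitly here; an inner class would hold this implicitly.
        final Owner owner;

        Holder(Owner owner) {
            this.owner = owner;
        }
    }

    Holder getOrCreate() {
        Holder res = local.get();
        if (res == null) {
            res = new Holder(this);
            local.set(res);
        }
        return res;
    }

    // True when the ThreadLocal key can be reached starting from its own value:
    // value -> owner -> local. Must be called on the thread that stored the value.
    static boolean keyReachableFromValue(Holder value) {
        return value.owner.local.get() == value;
    }
}
```

As long as the thread's dictionary holds the value strongly, this chain keeps the key strongly reachable too, so its weak reference is never cleared.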
Now the answer to the self-test question is quite trivial:
to avoid the memory leak, it is enough to add the static keyword to the Anchor class, thereby breaking the vicious circle of references.
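For reference, here is a sketch of the corrected class. It is renamed FixedX purely to keep it distinct from the leaking version; apart from the static modifier and the omitted driver loop, the code is the same.

```java
class FixedX {
    ThreadLocal<Anchor> local = new ThreadLocal<Anchor>();

    // static: an Anchor no longer carries a reference to the enclosing FixedX,
    // so nothing leads from the stored value back to the ThreadLocal key.
    static class Anchor {
        byte[] data = new byte[1024 * 1024];
    }

    public Anchor getOrCreate() {
        Anchor res = local.get();
        if (res == null) {
            res = new Anchor();
            local.set(res);
        }
        return res;
    }
}
```

Once a FixedX instance is dropped, its ThreadLocal is only weakly reachable, so the GC can clear the key and the thread's dictionary can discard the entry during later cleanup.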
These same features of ThreadLocal are the root of one more problem: as long as the thread that owns a value is alive, nothing guarantees that the value will be removed, even if every reference to the ThreadLocal has been lost. Stale values are cleaned up only when ThreadLocal values associated with that thread are accessed, so if the thread is waiting on network I/O, sleeping, or performing some other long-running operation, the cleanup can be postponed indefinitely.
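One way to avoid depending on that lazy cleanup is to drop the value explicitly with ThreadLocal.remove() once the work is done. This is a minimal sketch, assuming the code that sets the value also knows when it is no longer needed; ScopedLocal, CONTEXT and runWithContext are hypothetical names.

```java
class ScopedLocal {
    static final ThreadLocal<byte[]> CONTEXT = new ThreadLocal<>();

    // Installs a value for the duration of the task and always removes it,
    // so the current thread's dictionary keeps no entry afterwards.
    static int runWithContext(int size) {
        CONTEXT.set(new byte[size]);
        try {
            return CONTEXT.get().length; // stand-in for the real work
        } finally {
            CONTEXT.remove();
        }
    }
}
```

The try/finally ensures the entry is removed even when the work throws, which matters most for long-lived pooled threads.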
Be careful with ThreadLocal, colleagues! Do not put a reference to a ThreadLocal into that ThreadLocal, and do not store petabytes of data in it. Sometimes it is simpler and safer to use a Map<Thread, Value> than to police the correct use of ThreadLocal: that way you at least control the life cycle of your objects.
P.S. Yes, I deliberately titled the article “Memory Leak With ThreadLocal” and not “Memory Leak In ThreadLocal”: in my opinion the mistake lies in how the tool is used, while the standard library itself works flawlessly.
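The Map<Thread, Value> alternative might look like the sketch below. PerThreadRegistry and its methods are hypothetical names, and ConcurrentHashMap is an assumed choice of thread-safe map. Note the trade-off: the keys are strong references, so each thread must call release() itself, otherwise the map keeps both the value and the Thread object alive.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PerThreadRegistry<V> {
    // Strong keys: an entry lives until release() is called for its thread.
    private final Map<Thread, V> values = new ConcurrentHashMap<>();

    // Stores a non-null value for the calling thread.
    void set(V value) {
        values.put(Thread.currentThread(), value);
    }

    // Returns the calling thread's value, or null if none is stored.
    V get() {
        return values.get(Thread.currentThread());
    }

    // Removes and returns the calling thread's value; the owner decides when.
    V release() {
        return values.remove(Thread.currentThread());
    }

    int size() {
        return values.size();
    }
}
```

Here the life cycle is explicit: nothing is cleaned up behind your back, and nothing lingers unless you forget to release it.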