Do you know how much memory a string takes? I have heard all sorts of answers to this question, from “I don't know” to “2 bytes times the number of characters in the string”. And how much does an empty string take, then? Do you know how much an object of class Integer takes? And how much will an object of your own class with three Integer fields occupy? It's funny, but not a single Java programmer I know could answer these questions. Yes, most of us don't need this at all, and hardly anyone on a real Java project will think about it. But it is like not knowing the engine displacement of the car you drive. You can be a great driver without knowing what the figures 2.4 or 1.6 on your car mean, yet I am sure that few people are unfamiliar with them. So why do Java programmers know so little about this part of their tool?
Integer vs int
We all know that in Java everything is an object. Except, perhaps, primitives and the references to objects themselves. Let's look at two typical situations:
In these simple lines the difference is huge, both for the JVM and from the OOP point of view. In the first case, all we have is a 4-byte variable on the stack that holds the value itself. In the second case, we have a reference variable and the object it refers to. So if in the first case the size occupied is:
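The listing for the two cases did not survive in this text. A minimal reconstruction, assuming the comparison is simply a primitive local variable versus a boxed one (the class name, the variable names a and b, and the values are illustrative):

```java
final class IntVsInteger {
    public static void main(String[] args) {
        // First case: a primitive; the variable itself holds the value.
        int a = 300;
        // Second case: a reference variable pointing to an Integer object on the heap.
        Integer b = 301;
        System.out.println(a + " " + b); // prints "300 301"
    }
}
```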
sizeOf(int)
then in the second:
sizeOf(reference) + sizeOf(Integer)
Looking ahead, I will say that in the second case the amount of memory consumed is roughly 5 times larger and depends on the JVM. Now let's see why the difference is so big.
What does an object consist of?
Before determining the amount of memory consumed, you should understand what the JVM stores for each object:
- Object header;
- Memory for primitive types;
- Memory for reference types;
- Padding / alignment - in effect, a few unused bytes placed after the object's own data. This is done so that object addresses are always a multiple of the machine word, which speeds up reads from memory, reduces the number of bits needed for a pointer to an object and, presumably, reduces memory fragmentation. It is also worth noting that in Java (at least in HotSpot with default settings) the size of any object is a multiple of 8 bytes!
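The rounding described in the last item can be sketched as a small helper. This is only an illustration of the arithmetic, not JVM code; the class and method names are made up for the example:

```java
final class Alignment {
    /** Rounds size up to the next multiple of alignment (alignment must be positive). */
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    public static void main(String[] args) {
        System.out.println(alignUp(12, 8)); // 16: 4 bytes of padding are added
        System.out.println(alignUp(16, 8)); // 16: already aligned, no padding
        System.out.println(alignUp(17, 8)); // 24: 7 bytes of padding are added
    }
}
```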
Object Header Structure
Every instance of a class contains a header. In most JVMs (HotSpot, openJVM) the header consists of two machine words. On a 32-bit system the header is therefore 8 bytes, and on a 64-bit system 16 bytes. A header may contain the following information:
- Mark word - in HotSpot this is the name of the first header word, which packs together several of the items listed below: the identity hash code, garbage collection bits and the lock state.
- Hash code - every object has a hash code. By default, the value returned by Object.hashCode() may be derived from the object's address in memory; however, some garbage collectors move objects around while the hash code must stay the same, so space in the object header can be used to store the original hash code value.
- Garbage collection information - every Java object carries the information needed by the memory management system. This is often one or two flag bits, but it can also be, for example, some combination of bits that stores the number of references to the object.
- Type Information Block Pointer - contains information about the type of object. This block includes information about the virtual method table, a pointer to an object that represents a type, and pointers to some additional structures for more efficient interface calls and dynamic type checking.
- Lock - each object contains information about the status of the lock. This can be a pointer to the lock object or a direct lock representation.
- Array Length - if the object is an array, then the header is extended by 4 bytes to store the length of the array.
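The header sizes above reduce to simple arithmetic. The sketch below follows only the two-word model described in this section; real JVMs may deviate from it, and the names are invented for the example:

```java
final class HeaderSize {
    /**
     * Header size under the two-machine-word model:
     * wordBytes is 4 on a 32-bit system and 8 on a 64-bit system.
     */
    static int headerBytes(int wordBytes, boolean isArray) {
        int header = 2 * wordBytes;
        // Arrays carry an extra 4 bytes for their length.
        return isArray ? header + 4 : header;
    }

    public static void main(String[] args) {
        System.out.println(headerBytes(4, false)); // 8
        System.out.println(headerBytes(4, true));  // 12
        System.out.println(headerBytes(8, false)); // 16
        System.out.println(headerBytes(8, true));  // 20
    }
}
```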
Java specification
It is known that primitive types in Java have a predefined size; the specification requires this for code portability. We will therefore not dwell on primitives, since the specification covers them in full. And what does the specification say about objects? Essentially nothing: the internal structure of objects is left to the JVM implementation. In other words, the size of instances of your classes may differ from one JVM to another. For simplicity, I will give examples for the 32-bit Oracle HotSpot JVM. Now let's look at the most commonly used classes, Integer and String.
Integer and String
So let's try to calculate how much memory an Integer object occupies in our 32-bit HotSpot JVM. To do that, we need to look inside the class itself; we are interested in all fields that are not declared static. There is only one such field: int value. Based on the information above, we get:
Header: 8 bytes
int field: 4 bytes
Padding to a multiple of 8: 4 bytes
Total: 16 bytes
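The same calculation can be written down as code, which also lets us answer the opening question about a class with three Integer fields. This is a sketch under the assumptions of this article (32-bit HotSpot, 8-byte header, 4-byte references); the class and method names are hypothetical, and the deep size assumes three distinct Integer objects that are not shared with anything else:

```java
final class IntegerFootprint {
    static final int HEADER = 8;    // assumed: two 4-byte machine words
    static final int REFERENCE = 4; // assumed: 32-bit reference
    static final int INT = 4;       // fixed by the language specification

    /** Rounds size up to a multiple of 8 bytes. */
    static int alignUp(int size) {
        return (size + 7) / 8 * 8;
    }

    /** Header plus one int field, padded. */
    static int integerSize() {
        return alignUp(HEADER + INT);
    }

    /** Shallow size of a hypothetical class that has n Integer fields and nothing else. */
    static int holderShallowSize(int n) {
        return alignUp(HEADER + n * REFERENCE);
    }

    /** Shallow size plus the n Integer objects the fields point to. */
    static int holderDeepSize(int n) {
        return holderShallowSize(n) + n * integerSize();
    }

    public static void main(String[] args) {
        System.out.println(integerSize());             // 16
        System.out.println(REFERENCE + integerSize()); // 20, versus 4 for a plain int
        System.out.println(holderShallowSize(3));      // 24
        System.out.println(holderDeepSize(3));         // 72
    }
}
```

Under these assumptions a reference plus its Integer costs 20 bytes against 4 for an int, which is where the factor of about 5 comes from.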
Now let's take a look at the String class:
private final char value[];
private final int offset;
private final int count;
private int hash;
And calculate the size:
Header: 8 bytes
int fields: 4 * 3 == 12 bytes
Reference to the array: 4 bytes
Total: 24 bytes
Well, that's not all ... Since the string contains a reference to an array of characters, we are in fact dealing with two different objects: the String object and the array that stores the characters. From the OOP point of view they are separate objects, but from the memory point of view you need to add the size of the array allocated for the characters to the result. That is another 12 bytes for the array object itself plus 2 bytes for each character of the string. And, of course, do not forget to add padding up to a multiple of 8 bytes. As a result, a seemingly simple new String("a") ends up costing:
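new String()
Header: 8 bytes
int fields: 4 * 3 == 12 bytes
Reference to the array: 4 bytes
Total: 24 bytes
new char[1]
Header: 8 + 4 (array length) == 12 bytes
char elements: 2 * 1 == 2 bytes
Padding to a multiple of 8: 2 bytes
Total: 16 bytes
In total, new String("a") == 40 bytes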
It is important to note that new String("a") and new String("aa") occupy the same amount of memory. This is important to understand. A typical example of using this fact to your advantage is the hash field in the String class. Without it, the String object would still occupy 24 bytes because of alignment. So those 4 bytes were put to very good use. A brilliant decision, isn't it?
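To see how the total grows with the string length, the calculation above can be generalized. This is a rough model only: it assumes the 32-bit HotSpot numbers and the String field layout quoted in this article (other JDK versions lay String out differently), and it assumes each string owns its own character array. The class and method names are made up:

```java
final class StringFootprint {
    /** Rounds size up to a multiple of 8 bytes. */
    static int alignUp(int size) {
        return (size + 7) / 8 * 8;
    }

    /** Header (8) + three int fields (12) + reference to the array (4). */
    static int stringObjectSize() {
        return alignUp(8 + 3 * 4 + 4);
    }

    /** Array header (8 + 4 for the length) + 2 bytes per char, padded. */
    static int charArraySize(int length) {
        return alignUp(8 + 4 + 2 * length);
    }

    /** Estimated total for a string of the given length. */
    static int stringSize(int length) {
        return stringObjectSize() + charArraySize(length);
    }

    public static void main(String[] args) {
        System.out.println(stringSize(0));  // 40
        System.out.println(stringSize(1));  // 40
        System.out.println(stringSize(2));  // 40
        System.out.println(stringSize(3));  // 48
        System.out.println(stringSize(10)); // 56
    }
}
```

In this model even an empty string costs 40 bytes, the same as "a" and "aa", because the 12-byte array header is padded to 16 either way.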
Reference size
I would like to say a few words about reference variables. In principle, the size of a reference in the JVM depends on the JVM's bitness, I suspect for the sake of optimization. So in a 32-bit JVM a reference is usually 4 bytes, and in a 64-bit JVM 8 bytes, although this is not mandatory.
Field grouping
It should also be noted that the JVM performs a preliminary grouping of the object fields. This means that all class fields are placed in memory in a certain order, and not as declared. The order of grouping is as follows:
- 1. 8-byte types (double and long)
- 2. 4-byte types (int and float)
- 3. 2-byte types (short and char)
- 4. Single-byte types (boolean and byte)
- 5. Reference variables
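The grouping order can be illustrated with a little reflection. The sketch below does not read the real layout from the JVM; it only sorts the declared instance fields of a class by the groups listed above, ignoring superclass fields. Sample and all method names are hypothetical:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class FieldGrouping {
    /** Hypothetical sample class; the declaration order is deliberately mixed. */
    static class Sample {
        byte b;
        Object o;
        long l;
        char c;
        int i;
    }

    /** Group rank following the order listed above: a lower rank is placed first. */
    static int rank(Class<?> type) {
        if (type == long.class || type == double.class) return 0;
        if (type == int.class || type == float.class) return 1;
        if (type == short.class || type == char.class) return 2;
        if (type == boolean.class || type == byte.class) return 3;
        return 4; // everything else is a reference
    }

    /** Names of the instance fields declared in cls, sorted by group. */
    static List<String> groupedFieldNames(Class<?> cls) {
        List<Field> fields = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            // Static fields are not part of the instance; skip compiler-generated ones too.
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                fields.add(f);
            }
        }
        fields.sort(Comparator.comparingInt((Field f) -> rank(f.getType())));
        List<String> names = new ArrayList<>();
        for (Field f : fields) {
            names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(groupedFieldNames(Sample.class)); // [l, i, c, b, o]
    }
}
```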
Why all this?
Sometimes you need to estimate roughly how much memory it will take to store certain objects, for example a dictionary; this short guide will help you get your bearings quickly. It is also a potential optimization technique, especially in an environment where you have no access to the JVM settings.
Conclusions
The topic of memory in Java is very interesting and extensive. When I started writing this article, I thought a couple of examples with conclusions would be enough, but the deeper you dig, the more interesting it becomes. In general, knowing how memory is allocated for objects is very useful: it helps you save memory, avoid memory-related problems, or optimize your program in places where it seemed impossible. Of course, places where such optimizations apply are very rare, but still ... I hope you found the article interesting.