
Refinement of the vector container for working with large amounts of data

While reading the article “Lightweight” implementation of the vector container, I was reminded of my own experience with vector, and I would like to share it.

Yes, with an array of 10,000 elements an extra pointer per element is a tangible inconvenience, but the real memory problem with vector manifests itself somewhat differently.

The problem lies in how memory is allocated for new elements.
Not long ago I worked on a project in which I had to process large data arrays (about 20–30 thousand elements per array, ~24 bytes of static data per element). Using vector for this was very convenient. Let me note right away that performance was not a bottleneck in the project.

Usually nobody thinks about what happens at the moment push_back() is called. Neither did we, until a memory limit was imposed on our application. And that is where the rake struck its first blow: we had to pay attention to the fact that the memory usage graph resembled a cardiogram. That is, at seemingly random moments in time quite significant peaks appeared for a very short time, and those peaks went over the limit. On top of that, we finally noticed that the application suffered from spontaneous short "freezes". The timing seemed irrelevant, but it was still unclear why calls to the same method took sometimes 1-2 milliseconds and sometimes 1-2 seconds.

The investigation led us to that very push_back().
In the course of it I learned (from a more savvy colleague) that memory in a vector is allocated geometrically, so the number of reallocations grows only logarithmically with the number of elements, namely:
- if there is no reserved memory left when an element is added, a block of size * K is reserved, where size is the current size of the vector and K is a growth factor that depends on the implementation and is usually 2 (you can read more, for example, at Alena Sagalayeva's). A small demonstration of this growth is sketched below.
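
A quick way to see this behaviour on your own standard library is to watch capacity() change during a push_back loop (illustrative only; the exact growth factor depends on the implementation, commonly 1.5 or 2):

```cpp
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v;
    std::size_t last_cap = v.capacity();
    for (int i = 0; i < 100000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_cap) {   // a reallocation just happened
            std::printf("size = %zu, capacity grew %zu -> %zu\n",
                        v.size(), last_cap, v.capacity());
            last_cap = v.capacity();
        }
    }
}
```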

Thus we get the following:
Suppose the vector already holds 1024 elements. When push_back() is called, the capacity of the vector will be increased to 2048 (with K = 2; for simplicity we assume this is always the case), but only 1025 elements will actually be stored in it. Of course, the excess reserved memory can be freed afterwards, for example with the swap trick, but it seems more correct not to over-allocate in the first place, i.e. to use reserve; both are shown below.
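
Both techniques look roughly like this (a minimal sketch; since C++11, shrink_to_fit() can serve instead of the swap trick, though it is only a non-binding request):

```cpp
#include <cstdio>
#include <vector>

int main() {
    // reserve: allocate exactly what we expect to need, up front
    std::vector<int> a;
    a.reserve(1025);                      // one allocation instead of a chain of doublings
    for (int i = 0; i < 1025; ++i) a.push_back(i);
    std::printf("a: size %zu, capacity %zu\n", a.size(), a.capacity());

    // swap trick: shrink an over-allocated vector to fit its contents
    std::vector<int> b(1025, 0);
    b.reserve(2048);                      // simulate excess capacity left over after growth
    std::vector<int>(b).swap(b);          // copy into a tight temporary, then swap buffers
    std::printf("b: size %zu, capacity %zu\n", b.size(), b.capacity());
}
```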

No sooner said than done. We added reserve, rebuilt, started testing... and returned to square one, because the cardiogram stubbornly persisted, and so did the "freezes".

In the end, after long reflection on the meaning of life, how vector works and how memory allocation works, justice prevailed.

Let's return to our vector that already holds 1024 elements.
For clarity, assume that each element is 16 bytes.
Then the amount of memory occupied by such a vector (not counting the overhead of the vector object itself and any dynamic parts of the elements) is 16 kilobytes.
Since a vector always occupies a contiguous block of memory, when you try to add the ill-fated 1025th element to it, the following happens:
- a new contiguous block of size * K elements (32 kilobytes in our case) is allocated;
- all existing elements are copied into the new block;
- the old block is freed.

So the problem is that at some moment in time the amount of memory in use equals size + size * K, i.e. both the old block and the new one: 48 kilobytes in our case.
For small arrays this does not matter, but imagine the same operation on an array of several tens of thousands of elements. Using reserve does not save the situation, because the picture is almost the same: the old block plus a new one slightly larger than it, a copy, then a free - 32 kilobytes in our case (a small demonstration follows below).
It also turned out that this was the cause of the "freezes" as well: allocating large blocks of memory and then copying the data takes considerable effort.
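
A minimal way to observe that reserve() itself also reallocates into a fresh block (illustrative; plain int elements here rather than the article's 16-byte ones):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v(1024, 0);
    auto old_addr = reinterpret_cast<std::uintptr_t>(v.data());

    v.reserve(v.size() + 1);              // even reserving one element past capacity...
    auto new_addr = reinterpret_cast<std::uintptr_t>(v.data());

    // ...allocates a brand-new block, copies the elements, and only then frees
    // the old one, so for a moment both blocks are alive at the same time.
    std::printf("storage moved: %s\n", new_addr != old_addr ? "yes" : "no");
}
```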

One could talk at length about optimizing the elements themselves and about various spherical horses in a vacuum, but in our case the solution was MultyVector (as we called it): a class that reproduces the part of the vector interface we needed, stores its data in a vector of vectors, and grows and shrinks the inner vectors dynamically.

Describing all the internals of this uncomplicated wrapper and quoting its code here would be superfluous.
Let me just say that in our case the nominal size of an inner vector was 1000 elements, and inserting or deleting elements at arbitrary positions was used so rarely that it could be neglected, without fear that one of the inner vectors would again grow to an enormous size.

Thus, work with one large memory block was replaced with work with a set of smaller ones (several vectors of 1000 elements each); a rough sketch of the idea is given below.
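
The article deliberately omits the MultyVector code, so the following is only a rough, hypothetical sketch of the idea: a fixed chunk size and just push_back, operator[] and size, nothing more:

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch only: a chunked "vector of vectors" in the spirit of the
// article's MultyVector. Real code would forward more of the vector interface.
template <typename T, std::size_t ChunkSize = 1000>
class ChunkedVector {
public:
    void push_back(const T& value) {
        if (chunks_.empty() || chunks_.back().size() == ChunkSize) {
            chunks_.emplace_back();                 // start a new small block
            chunks_.back().reserve(ChunkSize);      // each reallocation peak is now ~ChunkSize elements
        }
        chunks_.back().push_back(value);
    }

    T& operator[](std::size_t i) {
        // All chunks except the last are always full, since we only ever append.
        return chunks_[i / ChunkSize][i % ChunkSize];
    }

    std::size_t size() const {
        return chunks_.empty()
                   ? 0
                   : (chunks_.size() - 1) * ChunkSize + chunks_.back().size();
    }

private:
    std::vector<std::vector<T>> chunks_;
};
```

std::deque provides similar chunked storage out of the box, but, as the UPD below explains, the project's framework offered only vector and list.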

The approach does not claim to be ideal, but it handled the task completely.

UPD. Frankly, I do not quite understand the reason for the holy war in the comments, so:
1. The specifics of the project did not allow using anything third-party. Everything necessary was provided by the framework, and among the suitable containers there were only vector and list. Vector was more convenient, partly because we only learned towards the end of the project that we would have to work with such volumes of data. For the same reason, nobody thought about memory at first.
2. As for spherical horses in a vacuum and idiots in the comments, I already wrote about that above.

Source: https://habr.com/ru/post/113324/

