
folly::fbvector - an improved std::vector from Facebook

Folly is an open-source C++ library developed by Facebook and used in its internal projects. To optimize memory and CPU usage, the library includes its own implementations of several standard containers and algorithms. One of them is folly::fbvector, a drop-in replacement for the standard vector (std::vector). Facebook's implementation is fully compatible with the original std::vector interface; the changes are always non-negative, almost always measurable, frequently significant, and sometimes dramatic in terms of performance and/or memory consumption. To use it in your code, simply include the folly/FBVector.h header and replace std::vector with folly::fbvector.

Example


    folly::fbvector<int> numbers({0, 1, 2, 3});
    numbers.reserve(10);
    for (int i = 4; i < 10; i++) {
      numbers.push_back(i * 2);
    }
    assert(numbers[6] == 12);


Motivation


std::vector is a well-established abstraction that many use for dynamically allocated arrays in C++. It is also the best-known and most widely used container. The big surprise is that its standard implementation leaves quite a lot of room to improve the efficiency of vector use. This document explains how the folly::fbvector implementation improves on several aspects of std::vector. You can use the benchmarks in folly/test/FBVectorTest.cpp to compare the performance of std::vector and folly::fbvector.

Memory management


It is well known that std::vector grows exponentially (by a constant growth factor). An important nuance is the correct choice of this constant. Too small a factor leads to frequent reallocations; too large a factor consumes more memory than is really necessary. Stepanov's original implementation used the constant 2 (i.e., each time a push_back call required expanding the vector, its capacity doubled).

Over time, some implementations have reduced this constant to 1.5, but gcc stubbornly keeps using 2. In fact, one can prove mathematically that a growth factor of 2 is rigorously the worst possible choice, because it never allows previously allocated memory to be reused.

To understand why this is so, imagine a vector of n bytes that is already completely full, into which one more element is being pushed. According to the std::vector implementation, this leads to the following steps:
  1. A new block of 2 * n bytes is allocated. Most likely, it will be placed in the address space above the current vector (perhaps immediately after it, perhaps after some gap).
  2. The old vector is copied to the beginning of the new one.
  3. The new element is appended to the new vector.
  4. The old vector's memory is freed.


When this new vector needs to grow, it will be enlarged to 4 * n, then to 8 * n, and so on. Each new allocation requires more memory than all the previous vectors combined, because at step k we need 2^k * n bytes, which is always more than the sum 2^(k-1) * n + 2^(k-2) * n + ... + 2^0 * n = (2^k - 1) * n.

In other words, the memory manager can never reuse memory previously occupied by the vector and is always forced to allocate "new" memory. This is hardly what we really want. Any growth factor less than 2, on the other hand, guarantees that at some step of increasing the vector's capacity we can stop claiming fresh address space and instead reuse memory previously released by this very vector.
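
To make this concrete, here is a small standalone sketch (an illustration only, not part of folly) that simulates the chain of reallocations and reports after how many of them the memory they released could already hold the next request, for growth factors 2.0 and 1.5:

    #include <cstdio>

    // Simulates successive reallocations of a vector that starts at size 1
    // (in units of the initial allocation) and grows by 'factor'. Reports
    // after how many completed reallocations the memory they released could
    // already hold the next request, i.e. when reuse becomes possible.
    static void simulate(double factor) {
      double freed = 0;     // total size of all blocks released so far
      double current = 1;   // size of the currently live block
      for (int reallocations = 0; reallocations < 50; ++reallocations) {
        double request = current * factor;  // size needed for the next growth
        if (freed >= request) {
          std::printf("factor %.2f: reuse possible after %d reallocations\n",
                      factor, reallocations);
          return;
        }
        freed += current;   // the old block is released only after the copy...
        current = request;  // ...and the bigger block becomes the live one
      }
      std::printf("factor %.2f: no reuse within 50 reallocations\n", factor);
    }

    int main() {
      simulate(2.0);   // the freed total is always current - 1, never enough
      simulate(1.5);   // catches up after a handful of reallocations
    }

With a factor of 2 the released total never catches up with the next request; with 1.5 it catches up after the 4th reallocation, which matches the graph below.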

The graph shows that with a growth factor of 1.5 (blue line) the memory can be reused after the 4th reallocation, with 1.45 (red line) after the third, and with 1.3 (black line) after the second.

[Figure: memory reuse for growth factors 1.5, 1.45, and 1.3]

Of course, the graph above makes several simplifying assumptions about how real-world memory allocators work, but it does not change the fact that the factor of 2 chosen by gcc is theoretically the worst case. fbvector uses a growth factor of 1.5, and this does not hurt the performance of small vectors thanks to the way fbvector interacts with jemalloc.

Interaction with jemalloc


Virtually all modern memory allocators hand out memory in certain quantized sizes, chosen to minimize memory-management overhead while still giving good performance for small allocations. For example, an allocator might use block sizes of 32, 64, 128, ..., 4096 bytes, then sizes proportional to the page size up to 1 MB, then 512 KB increments, and so on.

As discussed above, std::vector also requests memory in quantized steps: the size of the next quantum is determined by the vector's current capacity and the growth factor. So at almost any moment there is some unused memory at the end of the vector, and right after it some more unused memory at the end of the block handed out by the allocator. The obvious conclusion is that coordinating the vector's growth with the general-purpose allocator can reduce RAM consumption: the vector can ask the allocator for a block of the "ideal" size, eliminating the unusable tail of the allocator's block. More generally, the combined allocation strategy of the vector and the allocator can be tuned much better when the exact behavior of both is known. This is exactly what fbvector does: it automatically detects that jemalloc is in use and adapts to its allocation size classes.
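
As an illustration of the idea (not fbvector's actual code), the sketch below uses jemalloc's non-standard nallocx() to round a requested capacity up to the size jemalloc would actually hand out, so that the vector's capacity covers the whole allocated block instead of leaving a tail of unusable bytes. It assumes an unprefixed jemalloc build with the jemalloc/jemalloc.h header available.

    #include <cstddef>
    #include <jemalloc/jemalloc.h>   // nallocx(); link against jemalloc

    // Given a desired element count, return a capacity that exactly fills the
    // block jemalloc would allocate for it, so no usable bytes are wasted.
    template <class T>
    std::size_t capacityForFullBlock(std::size_t desiredCount) {
      std::size_t requestedBytes = desiredCount * sizeof(T);
      // nallocx() reports the real size an allocation of requestedBytes would
      // get, without performing it (jemalloc rounds requests up to a size
      // class). Zero-sized requests are not handled in this sketch.
      std::size_t actualBytes = nallocx(requestedBytes, 0);
      return actualBytes / sizeof(T);
    }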

Some memory allocators do not support in-place reallocation (although most modern ones do). This is a consequence of the rather unfortunate design of the realloc() function, which decides on its own whether to reuse the previously allocated memory or to allocate a new block, copy the data into it, and free the old one. This lack of control over reallocation carries over to C++'s operator new and to the behavior of std::allocator, and it is a serious blow to performance, because in-place reallocation, being very cheap, makes a much less aggressive memory-growth strategy viable.
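
For comparison, jemalloc does expose enough control to attempt an in-place expansion explicitly. The hedged sketch below (again, an illustration rather than fbvector's real code; error handling omitted) tries xallocx() first and falls back to the allocate-copy-free cycle only when in-place growth fails:

    #include <cstddef>
    #include <cstring>
    #include <jemalloc/jemalloc.h>   // mallocx, xallocx, dallocx

    // Try to grow a raw buffer to newBytes in place; fall back to the classic
    // allocate / copy / free cycle only if jemalloc cannot extend the block.
    void* growBuffer(void* ptr, std::size_t usedBytes, std::size_t newBytes) {
      // xallocx() resizes the allocation at ptr in place when possible and
      // returns the resulting real size of the block.
      if (xallocx(ptr, newBytes, 0, 0) >= newBytes) {
        return ptr;                       // grown in place, nothing to copy
      }
      void* fresh = mallocx(newBytes, 0); // plan B: the expensive path
      std::memcpy(fresh, ptr, usedBytes); // only valid for relocatable contents
      dallocx(ptr, 0);
      return fresh;
    }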

Object relocation


One of the important issues with how C++ objects are placed in memory is that, by default, they are considered non-relocatable. A relocatable object is one that can be moved to another place in memory by a bitwise copy while remaining valid and intact at the new location. For example, an int32 is relocatable: 4 bytes completely describe the state of a 32-bit signed integer, and if you copy those 4 bytes somewhere else, that block of memory can be unambiguously interpreted as the same number.

C++'s assumption that objects are non-relocatable hurts everybody, and it does so for the sake of a few questionable design corner cases. Moving an object involves allocating memory for the new object, calling its copy constructor, and destroying the old object. This is unpleasant and generally goes against common sense. Imagine a hypothetical conversation between Captain Picard and a Skeptical Alien:

Skeptical Alien: So, this transporter of yours, how does it work?
Picard: It takes a person, dematerializes them in one place and materializes them in another.
Skeptical Alien: Hmm... is it safe?
Picard: Yes, although, true, the first models had some minor flaws. At first a person was simply cloned at the new location, and then the transporter operator had to shoot the original manually. Ask O'Brien, he interned in exactly that position. The implementation was not very elegant.

(Translator's note: apparently this was written before move constructors were introduced in the more recent C++ standards.)

In fact, only a small class of objects are genuinely non-relocatable:

  1. objects that keep internal pointers into their own storage;
  2. objects that need to update external "observers" that hold pointers to them.

Objects of the first kind can always be redesigned to be relocatable at minimal cost (see the sketch below). Objects of the second kind should not be treated as values at all: they should be allocated on the heap and manipulated through (smart) pointers, and then there will be no problems.
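
A hypothetical example of the first kind, sketched below: a class that keeps a pointer into its own buffer. A bitwise copy of such an object placed at another address still points into the old object, so it is not relocatable as written, although the fix is trivial (store an offset instead of a pointer). Both class names are made up for illustration.

    #include <cstddef>

    // Not relocatable: 'cursor' points into this object's own storage, so a
    // bitwise copy placed at another address keeps pointing at the old object.
    class SelfReferential {
      char buffer[64];
      char* cursor;           // internal pointer -- the source of the problem
    public:
      SelfReferential() : cursor(buffer) {}
    };

    // Trivially fixed variant: an offset stays valid wherever the object lives.
    class RelocatableVariant {
      char buffer[64];
      std::size_t cursorOffset = 0;   // interpreted relative to 'buffer'
    };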

Relocatable objects are of great interest to std::vector, because knowing that objects can be relocated significantly improves the efficiency of reallocating a vector of such objects. Instead of the Picard-style move cycle described above, the objects can be moved with a plain memcpy or memmove. For example, working with types like vector<vector<double>> or vector<hash_map<int, string>> can become much faster.
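
A minimal sketch of the idea (not fbvector's actual implementation): when elements are known to be relocatable, reallocation can move all of them with a single memcpy and release the old buffer without running a copy constructor and destructor per element. The relocate() helper below is hypothetical and assumes the elements live in malloc-allocated storage.

    #include <cstdlib>
    #include <cstring>

    // Move 'count' relocatable objects of type T into a bigger buffer with a
    // single memcpy; the old storage is freed without running destructors,
    // because the objects now "live" at the new address.
    template <class T>
    T* relocate(T* oldData, std::size_t count, std::size_t newCapacity) {
      T* newData = static_cast<T*>(std::malloc(newCapacity * sizeof(T)));
      std::memcpy(newData, oldData, count * sizeof(T));
      std::free(oldData);   // no per-element destructor calls on the old block
      return newData;
    }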

In order to support safe relocation of objects, fbvector uses the trait folly::IsRelocatable, defined in "folly/Traits.h". By default, folly::IsRelocatable<T>::value conservatively yields false. If you know for sure that your type Widget is relocatable, add the following declaration right after the definition of the type:

    // at global namespace level
    namespace folly {
      template <>
      struct IsRelocatable<Widget> : boost::true_type {};
    }


If you do not do this, fbvector<Widget> will fail to compile, with a BOOST_STATIC_ASSERT.

Additional restrictions


Some improvements are also possible for "simple" types, more precisely, for types with a trivial assignment operator or with a non-throwing default constructor (nothrow default constructor). Fortunately, the corresponding type traits are already available in the C++ standard (or in Boost). To summarize, in order to work with fbvector, the type Widget must satisfy the following condition:

    BOOST_STATIC_ASSERT(
      IsRelocatable<Widget>::value &&
      (boost::has_trivial_assign<Widget>::value ||
       boost::has_nothrow_constructor<Widget>::value));


These conditions are in fact quite close to one another: it is fairly hard to construct a class that satisfies one part of the condition while breaking the other (unless you try very hard on purpose). fbvector uses these simple conditions to minimize the number of copy operations in the basic vector operations (push_back, insert, resize).

To simplify declaring a type's compatibility with fbvector, Traits.h provides the FOLLY_ASSUME_FBVECTOR_COMPATIBLE macro.
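
Assuming the macro takes the type name and, like the IsRelocatable specialization above, is meant to be used at global namespace scope (see folly/Traits.h for the exact form and for the variants intended for class templates), its use might look like this:

    #include <folly/FBVector.h>
    #include <folly/Traits.h>

    class Widget { /* ... */ };

    // Declares, in one line, the traits fbvector checks for Widget
    // (relocatability plus the "simple type" conditions discussed above).
    FOLLY_ASSUME_FBVECTOR_COMPATIBLE(Widget);

    folly::fbvector<Widget> widgets;   // compiles with the fast paths enabled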

Miscellanea


The fbvector implementation focuses on memory efficiency. In the future, the copy implementation may be improved further, since memcpy is not treated as an intrinsic by gcc and is not ideal for large blocks of data. Further improvements are also possible in the interaction with jemalloc.

Good luck using fbvector.

Source: https://habr.com/ru/post/238131/

