
Why the buffer should grow exponentially

Mozilla engineer Nicholas Nethercote published a note with a very clear explanation of why a program's memory buffer should grow exponentially rather than linearly.

Suppose we have a data structure that needs ever more memory, such as a string or a vector. When a new element no longer fits into the buffer, a new buffer is allocated, the contents of the old one are copied over, and the old buffer is freed. This is essentially what an ordinary realloc() does when it cannot extend the block in place.
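
To make the mechanics concrete, below is a minimal sketch of such a growable buffer in C. The Buffer type and buffer_push helper are illustrative names, not something from Nethercote's note.

  #include <stdlib.h>
  
  /* A growable byte buffer; the names are illustrative. */
  typedef struct {
      unsigned char *data;
      size_t len;  /* bytes in use */
      size_t cap;  /* bytes allocated */
  } Buffer;
  
  /* Append one byte, growing the buffer when it is full. When the
   * block cannot be extended in place, realloc() allocates a new
   * block, copies the old contents into it and frees the old one --
   * exactly the cost the article is talking about. */
  int buffer_push(Buffer *b, unsigned char byte) {
      if (b->len == b->cap) {
          size_t new_cap = b->cap ? b->cap * 2 : 16;  /* grow geometrically */
          unsigned char *p = realloc(b->data, new_cap);
          if (p == NULL)
              return -1;  /* out of memory; the old buffer is still valid */
          b->data = p;
          b->cap = new_cap;
      }
      b->data[b->len++] = byte;
      return 0;
  }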

Now imagine that our buffer starts at 1 byte and grows by 1 byte at a time until it reaches 1 MiB. How much memory will we have allocated cumulatively along the way?
  1 + 2 + 3 + ... + 1,048,575 + 1,048,576 = 549,756,338,176 bytes 

Not bad, eh? (This is the arithmetic series n(n+1)/2 with n = 1,048,576, roughly 512 GiB of cumulative allocations.)

Experienced programmers have known about this pitfall for a long time; some even consider linear buffer growth a “typical beginner's mistake”.

The bottom line is this: if each new array is larger than the previous one by a constant amount, the total work is O(n²); if each new array is larger by a constant factor, the total work is O(n). By “work” we mean the calls to malloc() and realloc() and the subsequent reallocation and copying of values in memory, that is, the total load on the system.
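
A small simulation makes the difference tangible: grow a buffer to 1 MiB under both policies and count how many bytes would be copied in total. This is an illustrative sketch only; a real realloc() can sometimes extend a block in place without copying anything.

  #include <stdio.h>
  
  /* Count the bytes realloc() would copy while growing a buffer
   * to 1 MiB under two growth policies. */
  int main(void) {
      const unsigned long long target = 1024 * 1024;  /* 1 MiB */
      unsigned long long cap, copied;
  
      /* Linear growth: capacity += 4096 each time -> O(n^2) copying. */
      copied = 0;
      for (cap = 0; cap < target; cap += 4096)
          copied += cap;  /* old contents copied into the new block */
      printf("linear growth:    %llu bytes copied\n", copied);
  
      /* Geometric growth: capacity *= 2 each time -> O(n) copying. */
      copied = 0;
      for (cap = 4096; cap < target; cap *= 2)
          copied += cap;
      printf("geometric growth: %llu bytes copied\n", copied);
      return 0;
  }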

Of course, there are situations, for example firmware for embedded hardware, that can be considered exceptions to the rule. But for desktop systems this is practically a law. The only question is which growth factor to use: 2 or 1.5.

Nicholas Nethercote adds that in practice the total memory consumption will, of course, be lower, because the memory allocator rounds buffer sizes up. For example, Firefox's allocator (an old, heavily modified version of jemalloc) rounds allocation requests between 4 KiB and 1 MiB up to the nearest multiple of 4 KiB. So in our example, growing a buffer to a total size of 1 MiB costs only:

  4,096 + 8,192 + 12,288 + ... + 1,044,480 + 1,048,576 = 134,742,016 bytes 

For comparison, if we used a growth factor of 2 from the start:

  4,096 + 8,192 + 16,384 + 32,768 + 65,536 + 131,072 + 262,144 + 524,288 + 1,048,576 = 2,093,056 bytes 
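
All three totals are easy to verify; here is a quick sketch that recomputes them, purely as a check on the arithmetic above.

  #include <stdio.h>
  
  /* Recompute the article's three cumulative totals. */
  int main(void) {
      unsigned long long total, size;
  
      /* 1-byte steps: 1 + 2 + ... + 1,048,576 */
      for (total = 0, size = 1; size <= 1048576; size++)
          total += size;
      printf("1-byte steps:   %llu\n", total);  /* 549756338176 */
  
      /* 4 KiB rounding: 4096 + 8192 + ... + 1,048,576 */
      for (total = 0, size = 4096; size <= 1048576; size += 4096)
          total += size;
      printf("4 KiB rounding: %llu\n", total);  /* 134742016 */
  
      /* Doubling: 4096 + 8192 + 16384 + ... + 1,048,576 */
      for (total = 0, size = 4096; size <= 1048576; size *= 2)
          total += size;
      printf("doubling:       %llu\n", total);  /* 2093056 */
      return 0;
  }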

This “typical novice mistake” shows up in very many projects where I/O code does not scale its buffer size properly. Nethercote promptly found the corresponding non-optimal code fragments in the implementations of XDRBuffer, nsTArray and SQLite.

Source: https://habr.com/ru/post/242279/

