I just finished a series of changes to the Chrome browser's code that reduced the size of its Windows binaries by about 1 megabyte, moved about 500 KB from the read/write segment to the read-only segment, and reduced memory consumption by about 200 KB per Chrome process. The amazing thing is that this series of changes consisted solely of removing and adding the keyword const in a few places in the code. Yes, compilers are weird.
This problem came up when I was writing documentation for some utilities that I use to investigate size regressions in compiled Windows binaries. I ran the utility, copied its output into the documentation, and started describing it when I noticed something strange: several large global objects that, by design, were supposed to be constant were for some reason in the read/write data segment. An abbreviated version of that utility's output is shown below:

Most executable formats have at least two data segments: one for read/write objects and one for read-only objects. If you have constant data, such as kBrotliDictionary, it is logical to put it in the read-only segment, which is section “2” in the Chrome binary on Windows. However, some constant data, such as unigram_table, device::UsbIds::vendors_ and blink::serializedCharacterData, were in section “3”, that is, in the read/write segment.

Placing data in the read-only segment has several advantages. It protects the data from accidental corruption and allows it to be used more efficiently. Read-only pages are guaranteed to be shared by all processes that load the DLL (and Chrome always runs several processes). In addition, in some cases (although probably not in these), the compiler can use the constants directly in the code.
Pages in the read/write segment can also be shared, but this is not guaranteed. By default they are all created with the copy-on-write flag, which means they are shared only until the first write, which copies the page into the private memory of the process. Thus, if a global variable is initialized at runtime, it automatically stops being shareable between processes. Moreover, even if a global variable merely sits on the same memory page as another one that gets written to, it also becomes private, because everything happens at page granularity (4 KiB).
Private data uses more memory because each process needs its own copy. It is also more expensive because it needs space in the page file (which constant data does not, since it can be re-read from the binary image when needed). This leads to expensive reads and writes on the HDD which, moreover, generally hit random addresses and are therefore even slower.
For all these reasons, read-only pages are preferable wherever they are possible.
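As a rough illustration of the distinction above, the sketch below shows three kinds of globals. All names are hypothetical, and the segment each one lands in is only what a typical toolchain is expected to do, not a guarantee; a tool such as ShowGlobals or a linker map is the way to check a real binary.

```cpp
#include <cstdlib>

// Constant-initialized and const: a toolchain can normally place this in the
// read-only data segment, so its pages stay shared between processes.
const int kLookupTable[4] = {1, 2, 3, 4};

// Constant-initialized but writable: normally placed in the read/write data
// segment. Its page is shared only until something on that page is written.
int g_counters[4] = {0, 0, 0, 0};

// Hypothetical helper whose result is not a constant expression.
int ComputeAtStartup() { return std::atoi("42"); }

// Formally requires dynamic initialization: startup code writes the value,
// which can make the containing page private even though the variable is const.
const int g_startup_value = ComputeAtStartup();

// Only reads the table, so it never triggers a page copy.
int SumLookupTable() {
  int sum = 0;
  for (int value : kLookupTable) sum += value;
  return sum;
}

// Writes to g_counters; the first write to a shared copy-on-write page
// gives this process its own private copy of that page.
int BumpCounter(int index) { return ++g_counters[index]; }
```

Only the first of these three globals is a good candidate for sharing between processes; the other two are the kinds of objects that end up in private memory.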
Adding const is good
Thus, when my ShowGlobals utility showed that blink::serializedCharacterData was in the read/write data segment, and further investigation confirmed that this array never changes, I added a const modifier to its declaration, which, as expected, moved it to the read-only data segment. Very simple. Such changes are always a good idea, but it is not always easy to tell exactly how much they help. Since the array was never modified, it is possible that even without const it existed in memory as a single copy shared by all Chrome processes. But it is more likely that its ends shared pages with other objects that do get modified, which leads to private copies of those pages (along with the head or tail of our array). In that case we were losing 7748 or 3652 bytes (the size of the array minus the one or two pages in the middle, which are guaranteed to stay shared). Such changes help (or at least do no harm) on all platforms and with all compilers.
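The two byte counts above can be reproduced with a little page arithmetic. The sketch below is only an illustration: the array size of 11844 bytes is not stated in the text but inferred from the two figures (7748 + 4096 = 3652 + 8192), and the function name and offsets are made up.

```cpp
#include <cstddef>

constexpr std::size_t kPageSize = 4096;  // 4 KiB pages, as in the text

// Returns how many bytes of an object lie on pages that it does not cover
// completely, i.e. pages it may share with neighbouring globals.
// `offset` is the object's start address relative to any page boundary.
constexpr std::size_t BytesOnPartialPages(std::size_t offset, std::size_t size) {
  const std::size_t end = offset + size;
  // First page boundary at or after the start of the object.
  const std::size_t first_boundary =
      (offset + kPageSize - 1) / kPageSize * kPageSize;
  // Last page boundary at or before the end of the object.
  const std::size_t last_boundary = end / kPageSize * kPageSize;
  const std::size_t fully_covered =
      last_boundary > first_boundary ? last_boundary - first_boundary : 0;
  return size - fully_covered;
}

// Inferred array size; see the assumption stated above.
constexpr std::size_t kAssumedArraySize = 11844;

// Page-aligned start: two whole pages in the middle, 3652 bytes at risk.
static_assert(BytesOnPartialPages(0, kAssumedArraySize) == 3652, "two full pages");
// Start shortly after a page boundary: only one whole page, 7748 bytes at risk.
static_assert(BytesOnPartialPages(100, kAssumedArraySize) == 7748, "one full page");
```

Which of the two cases applies depends only on where the linker happens to place the start of the array within a page.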
Explicitly declaring your constant arrays with the const modifier is a good idea, and you should do it. But the information above is not enough to see the whole picture. Here we enter uncharted territory...
Sometimes removing const is even better
The next array I investigated was unigram_table. This was a strange case: it was initialized exclusively with constant data using struct/array initializer syntax and was marked with the const modifier, yet for some reason it was in the read/write data segment. By all appearances this was a quirk of the VC++ compiler, so I followed my own instructions on minimizing the code needed to reproduce a bug and sent a bug report to Microsoft. I copied the types and the array declaration into a separate project and kept reducing it, checking at every step that the array was still in the read/write data segment. In the end I arrived at a minimal example that fits in a tweet:
const struct T {const int b[999]; } a[] = {{{}}}; int main() {return(size_t)a;}
If you compile this code and run ShowGlobals on the resulting PDB, the utility will show that “a” is in section “3”, despite being declared with the const modifier. Here are the specific steps for building and testing the code:
    > "%VS140COMNTOOLS%..\..\VC\vcvarsall.bat"
    > cl /Zi constbug.cpp /out:constbug.exe
    > ShowGlobals.exe constbug.pdb
    Size   Section  Symbol name
    3996   3        a
With the example reduced to fewer than 140 characters, it became very easy to find the cause. It turns out that with the VC++ compilers (2010, 2015, 2017 RC), if you have a class/struct with a const data member, any global object of that type ends up in the read/write data segment. Jonathan Caves explained in his comments on my bug report that this happens because the type gets a compiler-generated deleted default constructor (which makes sense), and that confuses the VC++ compiler into mistakenly treating the class as requiring dynamic initialization.
Thus, the problem in this case is the const modifier on the data member "b". As soon as I removed that const, the entire array moved to read-only memory (quite ironic, right?). Since the whole object is const anyway, removing the const modifier from one of its data members does not reduce safety at all, and with the VC++ compiler it actually increases it.
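The workaround can be sketched as a before/after pair. The type and variable names below are hypothetical stand-ins for the tweet-sized repro, and the comments about segment placement describe the VC++ behaviour reported above rather than anything this snippet can verify by itself.

```cpp
#include <type_traits>

// Shape from the repro: a const data member inside the type. With the
// affected VC++ versions, global objects of this type were reported to land
// in the read/write data segment.
struct WithConstMember {
  const int b[999];
};

// Workaround: the member is no longer const, but the object still is.
struct WithoutConstMember {
  int b[999];
};

const WithConstMember kBefore[] = {{{}}};
const WithoutConstMember kAfter[] = {{{}}};

// Both layouts are identical: 999 ints (3996 bytes when int is 4 bytes).
static_assert(sizeof(kBefore) == sizeof(kAfter), "same layout");
static_assert(sizeof(kAfter) == 999 * sizeof(int), "999 ints");

// The element is still const through the const object, so removing the
// member-level const does not open the array up to modification.
static_assert(
    std::is_const<
        std::remove_reference<decltype(kAfter[0].b[0])>::type>::value,
    "elements of a const object are still const");

// Accessors so that both globals are actually used.
int FirstBefore() { return kBefore[0].b[0]; }
int FirstAfter() { return kAfter[0].b[0]; }
```

An attempt to write `kAfter[0].b[0] = 1;` is still rejected by the compiler, which is why the change costs nothing in terms of const-correctness.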
I expect that the VC++ development team will fix this bug for VS 2017, in which case the code would not have needed fixing, but I did not want to wait that long. So I started removing const modifiers in the places where they caused this problem. The process was fairly trivial: I just kept going through the list of global variables in the read/write data segment and assigned each one to one of the following categories:
- Its value changes: leave it as is
- Its value does not change and it has no const modifier: add one
- Its value does not change and it has a problematic data member with a const modifier: remove that modifier
It was actually fun.
So I went through the Chrome code, adding and removing const in the appropriate places. In most cases my changes did what was intended and moved data from the read/write segment to the read-only segment. But in two cases the changes also did something else: they reduced the size of the .text and .reloc sections. That was great, almost too good to be true. I assumed that VC++ had been generating code to initialize some of these arrays, and quite a lot of code.
The most interesting change was the removal of three const modifiers from the definition of the UnigramEntry structure. This moved 53064 bytes to the read-only segment and also reduced the size of chrome.dll and chrome_child.dll by 364500 bytes. It follows that the VC++ compiler had silently generated initialization code that took about 7 bytes of code to initialize each byte of unigram_table. That simply could not be right. It was so far beyond my expectations that I ran Chrome under the Visual Studio debugger and set a data breakpoint on the end of the unigram_table array. Visual Studio duly stopped execution in the initializer. Below is the (slightly cleaned up) assembly code of the initializer (I replaced “unigram_table” with “u” for readability):
    55                             push ebp
    8B EC                          mov ebp,esp
    83 25 78 91 43 12 00           and dword [u],0
    83 25 7C 91 43 12 00           and dword [u+4],0
    83 25 80 91 43 12 00           and dword [u+8],0
    83 25 84 91 43 12 00           and dword [u+0Ch],0
    C6 05 88 91 43 12 4D           mov byte [u+10h],4Dh
    C6 05 89 91 43 12 CF           mov byte [u+11h],0CFh
    C6 05 8A 91 43 12 1D           mov byte [u+12h],1Dh
    C6 05 8B 91 43 12 1B           mov byte [u+13h],1Bh
    C7 05 8C 91 43 12 FF 00 00 00  mov dword [u+14h],0FFh
    C6 05 90 91 43 12 00           mov byte [u+18h],0
    C6 05 91 91 43 12 00           mov byte [u+19h],0
    C6 05 92 91 43 12 00           mov byte [u+1Ah],0
    C6 05 93 91 43 12 00           mov byte [u+1Bh],0
    … 52,040 lines deleted…
    c6 05 02 6e 0b 12 6c           mov byte [u+cf42h],6Ch
    c6 05 03 6e 0b 12 6e           mov byte [u+cf43h],6Eh
    c6 05 04 6e 0b 12 a2           mov byte [u+cf44h],0A2h
    c6 05 05 6e 0b 12 c2           mov byte [u+cf45h],0C2h
    c6 05 06 6e 0b 12 80           mov byte [u+cf46h],80h
    c6 05 07 6e 0b 12 c4           mov byte [u+cf47h],0C4h
    5d                             pop ebp
    c3                             ret
The hexadecimal numbers on the left are the machine code bytes, and the text on the right is their assembly representation. After a short prologue, we see code that fills the array... one byte at a time... using a 7-byte instruction for each. Well, that explains it.
It is well known that modern optimizing compilers can generate code that is as good as code written by a human (and often better). And yet sometimes they do not. There are many things in this particular function that could have been done better:
- It could not exist at all. The array is initialized with plain C array initializer syntax, and if it were not for the VC++ compiler bug described above, no initializer code would need to be generated at all (as is the case on other platforms).
- The zero writes could be skipped. This array is a global variable that is initialized only once at program startup, when its memory is guaranteed to be zero-filled, so writing zeros over zeros is pointless work.
- The data could be written 4 bytes at a time rather than one by one.
- The address of the array could be loaded into a register and used from there instead of being specified in every instruction. This would make the instructions smaller and would also save the 2 bytes of relocation data that each instruction needs in the .reloc section.
Well, you get the point. This function could be a quarter of its size, or not exist at all. It disappeared once the three const modifiers were removed (the changes are already available in Chrome Canary), and with it went about 364500 bytes of code and about 105000 bytes of the .reloc section, in both chrome.dll and chrome_child.dll. The array used to live in .bss (the zero-initialized part of the read/write segment), where it took up no disk space, and it moved to the read-only segment, where it takes up 53064 bytes, so the net saving in disk space was about 416000 bytes per DLL.
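The disk-space figures can be checked with simple arithmetic. The sketch below only restates the numbers quoted in the text; the constant and function names are made up for the illustration.

```cpp
// Figures quoted in the text, per DLL.
constexpr int kArrayBytes = 53064;         // size of unigram_table
constexpr int kCodeBytesRemoved = 364500;  // approximate size of the initializer code
constexpr int kRelocBytesRemoved = 105000; // approximate .reloc reduction

// The code and relocations disappear, but the array now occupies space in
// the file because it moved from .bss to the read-only segment.
constexpr int NetDiskSavingPerDll() {
  return kCodeBytesRemoved + kRelocBytesRemoved - kArrayBytes;
}

// How many bytes of initializer code were spent per byte of array data.
constexpr double CodeBytesPerDataByte() {
  return static_cast<double>(kCodeBytesRemoved) / kArrayBytes;
}

// 364500 + 105000 - 53064 = 416436, which rounds to the 416000 in the text.
static_assert(NetDiskSavingPerDll() == 416436, "net saving per DLL");
```

The ratio computed by `CodeBytesPerDataByte()` comes out slightly below 7, which matches a function made up almost entirely of 7-byte instructions that mostly store a single byte each.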
And, more importantly, most of the global variables affected by these changes moved from the private memory of each process to shared memory, which saved about 200 KB of RAM per process.
Examples of changes
I started with the largest and most frequently used objects and types in order to get a good, immediately visible result. I quickly reduced the size of the read/write segment by about 250 KB, moving about 1500 global variables to the read-only segment. You know, this kind of work is addictive (what? Who has obsessive-compulsive disorder here? Me? I have no idea what you are talking about). But I managed to stop at some point, although I know for sure that there are still hundreds of smaller global variables in the code that could be fixed the same way. Eventually it seemed to me that the effort I was spending was no longer worth the few bytes of memory gained and that it was time to move on. But if you have always dreamed of committing something to the Chrome code, feel free to follow the path described above. As examples, here are a few of the changes I made:
Changes removing const:
Changes adding const:
Try it yourself
If you want to take a look at the unigram_table initializer code in Chrome before it disappears with the next release, you don't have to be a Chrome developer. Start by running these two commands:
    > "%VS140COMNTOOLS%..\..\VC\vcvarsall.bat"
    > devenv /debugexe chrome.exe
Make sure that you add the Chrome symbol server path to the debugger settings (by following this instruction) and set a breakpoint on this symbol:
    `dynamic initializer for 'unigram_table''
Make sure that Chrome is not currently running, then launch it from Visual Studio. Visual Studio will load the Chrome symbols (the magic of symbol servers!) and set the breakpoint on the initializer (if it still exists). Nothing complicated. You can switch to the disassembly view (Ctrl + F11). If you want to see the source code, just enable source server support in the Visual Studio debugger settings.