Fighting Memory Leaks (C++ CRT)

A memory leak is a serious and dangerous problem. A user may never notice a single leak of some 32 KB (which is, after all, a full 5% of the 640 KB that is "enough for everyone"), but if we keep losing complex hierarchical structures or arrays larger than INT_MAX (which we love to create on 64-bit architectures), we doom the user to suffering and our product to failure.

Preventing this is not hard in principle: follow the rule "put everything back where it belongs". In practice it is greatly complicated by the human factor (plain carelessness), intricate architecture, and non-linear control flow, for example due to exceptions.

One could simply "surrender" to an automatic garbage collector at the cost of some performance (and not necessarily through Managed C++: garbage collection libraries exist for native C++ and C as well).
But we will talk about the situation when “everything is bad”.

Then the task is reduced to detecting and fixing possible leaks. The fix is usually simple (delete or delete[]). But how do you detect a leak? Google will gladly supply answers, which usually boil down to the following:

you need to use third-party leak analyzers
you will have to reinvent the wheel
it would be nice to write your own memory manager

But it can be done more simply, using the Debug CRT.

Step 1. Turn on Leak Accounting

To do this, include the Debug CRT header and enable the debug heap allocation map:

    #ifdef _DEBUG
    #define _CRTDBG_MAP_ALLOC   // must be defined before the headers are included
    #include <stdlib.h>
    #include <crtdbg.h>
    #endif

Now, when memory is allocated through new or malloc(), the data is wrapped in the following structure (I am cheating a little here: the fields that hold the data are not valid struct syntax, and the real structure is defined somewhere inside the CRT, so its definition is not exposed to the programmer):

    typedef struct _CrtMemBlockHeader
    {
        struct _CrtMemBlockHeader* pBlockHeaderNext;
        struct _CrtMemBlockHeader* pBlockHeaderPrev;
        char*         szFileName;
        int           nLine;
        size_t        nDataSize;
        int           nBlockUse;
        long          lRequest;
        unsigned char gap[nNoMansLandSize];
        unsigned char data[nDataSize];              /* pseudo-syntax: the user data */
        unsigned char anotherGap[nNoMansLandSize];  /* pseudo-syntax: trailing guard bytes */
    } _CrtMemBlockHeader;

It contains the name of the file (szFileName) and the line (nLine) where the allocation occurred, the requested size (nDataSize), and the data itself, wrapped on both sides in the so-called no man's land area. The block headers are organized into a doubly linked list, which makes it easy to enumerate them and, accordingly, to identify every allocation that has no matching free.
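To make the idea concrete, here is a minimal portable sketch of the same bookkeeping. It is not the CRT's implementation: BlockHeader, tracked_alloc, tracked_free, dump_live_blocks and the TRACKED_ALLOC macro are hypothetical names invented for illustration, and the guard areas are left out. Each allocation gets a header that records the file, line and size and is linked into a doubly linked list, so whatever is still in the list is a candidate leak.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Simplified stand-in for the CRT block header (hypothetical type, no guard bytes).
struct alignas(std::max_align_t) BlockHeader {
    BlockHeader* next;   // previously allocated live block
    BlockHeader* prev;   // more recently allocated live block
    const char*  file;   // where the allocation was requested
    int          line;
    std::size_t  size;   // bytes requested by the caller
};

static BlockHeader* g_head = nullptr;  // most recent live allocation

// Allocate `size` bytes with a header in front and link it into the list.
void* tracked_alloc(std::size_t size, const char* file, int line) {
    void* raw = std::malloc(sizeof(BlockHeader) + size);
    if (!raw) return nullptr;
    BlockHeader* h = new (raw) BlockHeader{g_head, nullptr, file, line, size};
    if (g_head) g_head->prev = h;
    g_head = h;
    return h + 1;  // user data starts right after the header
}

// Unlink the header from the list and release the whole block.
void tracked_free(void* p) {
    if (!p) return;
    assert(g_head != nullptr && "free without any live allocation");
    BlockHeader* h = static_cast<BlockHeader*>(p) - 1;
    if (h->prev) h->prev->next = h->next; else g_head = h->next;
    if (h->next) h->next->prev = h->prev;
    std::free(h);
}

// Walk the list, print every block that is still allocated, return the count.
std::size_t dump_live_blocks(std::FILE* out) {
    std::size_t count = 0;
    for (const BlockHeader* h = g_head; h; h = h->next, ++count)
        std::fprintf(out, "%s(%d): %zu bytes still allocated\n",
                     h->file, h->line, h->size);
    return count;
}

#define TRACKED_ALLOC(n) tracked_alloc((n), __FILE__, __LINE__)
```

Calling dump_live_blocks(stderr) just before exit plays the role that the leak dump plays in the CRT: anything it prints was allocated through TRACKED_ALLOC and never passed to tracked_free.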

Step 2. Leak enumeration

We need a function that walks the _CrtMemBlockHeader list and reports the problem areas:

    _CrtDumpMemoryLeaks();

Then in the Debug Output window we will see the following information:

    Detected memory leaks!
    Dumping objects ->
    {163} normal block at 0x00128788, 4 bytes long.
     Data: <    > 00 00 00 00
    {162} normal block at 0x00128748, 4 bytes long.
     Data: <    > 00 00 00 00
    Object dump complete.

This is almost great, but the result is not yet usable, for several reasons:

It contains no information about the file and line where the memory was allocated (even though the structure holds that information!)
We would very much like to send it to a log file (for at least some automation)
It contains superfluous information: not only memory that has actually "leaked" ...

... but also memory that global objects simply have not had time to release yet. Global objects may well be bad, but let us accept that they exist, which means we need to exclude them from the _CrtDumpMemoryLeaks() output somehow. This is solved by the following technique:

    int _tmain(int argc, _TCHAR* argv[])
    {
        _CrtMemState _ms;
        _CrtMemCheckpoint(&_ms);           // remember the heap state on entry

        // some logic goes here...

        _CrtMemDumpAllObjectsSince(&_ms);  // dump everything allocated since then
        return 0;
    }

We save the initial memory state (current at the moment main is entered) into a special structure with _CrtMemCheckpoint, and before the application exits we use _CrtMemDumpAllObjectsSince to dump all objects still in memory that were created after the _ms checkpoint: these are the "leaks". Now that the information is correct, let us make it convenient.
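Because exceptions and early returns make it easy to skip the final dump, the checkpoint and dump pair can be wrapped in a small RAII guard. The sketch below is one possible way to do that, not part of the CRT: LeakScope and run_logic are hypothetical names, and the guard additionally uses _CrtMemDifference and _CrtMemDumpStatistics so that it stays silent when nothing was left behind. Outside an MSVC debug build it compiles to a no-op.

```cpp
#include <cassert>

#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>

// Hypothetical helper: checkpoint on construction, dump on destruction.
class LeakScope {
public:
    LeakScope() { _CrtMemCheckpoint(&start_); }
    ~LeakScope() {
        _CrtMemState now, diff;
        _CrtMemCheckpoint(&now);
        // _CrtMemDifference returns nonzero when the two states differ.
        if (_CrtMemDifference(&diff, &start_, &now)) {
            _CrtMemDumpAllObjectsSince(&start_);
            _CrtMemDumpStatistics(&diff);
        }
    }
    LeakScope(const LeakScope&) = delete;
    LeakScope& operator=(const LeakScope&) = delete;
private:
    _CrtMemState start_;
};
#else
// Release builds and other compilers: the guard does nothing.
class LeakScope {
public:
    LeakScope() {}
    ~LeakScope() {}
};
#endif

// Example of guarded logic: everything allocated here is also freed here.
int run_logic() {
    LeakScope scope;
    int* p = new int(42);
    int value = *p;
    delete p;
    return value;
}
```

Declaring a LeakScope at the top of main (or of any suspicious function) gives the same report as the manual calls, including when the scope is left through an exception.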

Step 3. Presentation of the results

Redirecting the output is very easy; the _CrtSetReportMode and _CrtSetReportFile functions will help us here.

    _CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE);
    _CrtSetReportFile(_CRT_WARN, _CRTDBG_FILE_STDOUT);

Now all warnings (and any output of _CrtMemDumpAllObjectsSince is a warning) go straight to stdout. The second parameter of _CrtSetReportFile can also be a real file handle.
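As a rough illustration of the file-handle variant, the sketch below opens a log file with the Win32 CreateFileA call and passes the handle to _CrtSetReportFile. The helper name redirect_warnings_to_file and the idea of keeping the handle open for the rest of the process are assumptions made for this example; outside an MSVC debug build the helper simply reports failure.

```cpp
#include <cassert>

#if defined(_MSC_VER) && defined(_DEBUG)
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <crtdbg.h>

// Hypothetical helper: send CRT warnings (including leak dumps) to `path`.
// The handle is deliberately left open so later dumps can still be written.
bool redirect_warnings_to_file(const char* path) {
    HANDLE log = CreateFileA(path, GENERIC_WRITE, FILE_SHARE_READ, NULL,
                             CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (log == INVALID_HANDLE_VALUE) return false;
    _CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE);
    _CrtSetReportFile(_CRT_WARN, log);
    return true;
}
#else
// No Debug CRT available: nothing to redirect.
bool redirect_warnings_to_file(const char*) { return false; }
#endif
```

Calling redirect_warnings_to_file("leaks.log") once at startup (the file name is only an example) would then collect every later dump in that file.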

Why are the file names and line numbers of the allocations missing? As it happens, in Microsoft Visual C++ 6.0 the following redefinition of operator new in the crtdbg.h header was responsible for this information:

    #ifdef _CRTDBG_MAP_ALLOC
    inline void* __cdecl operator new(unsigned int s)
    {
        return ::operator new(s, _NORMAL_BLOCK, __FILE__, __LINE__);
    }
    #endif /* _CRTDBG_MAP_ALLOC */

As you might guess, it did not give the desired result: __FILE__ and __LINE__ always expanded to "crtdbg.h, line 512". Later the folks at Microsoft removed this "feature" altogether and left it to the programmer. No big deal, because the same functionality can be achieved with a single definition:

    #define new new(_NORMAL_BLOCK, __FILE__, __LINE__)

It is highly advisable to put this definition into some common header file (which must be included after crtdbg.h). Problems will arise if new has already been redefined. Then again, as I see it, any sensible redefinition of new will not use the CRT (otherwise a hook could have been used instead), in which case this scheme is not applicable at all. Oh well.
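Such a common header might look roughly like the sketch below. The header name debug_new.h, the DEBUG_NEW_H guard, the intermediate DEBUG_NEW macro and the make_answer function are choices made for this example, not something the CRT prescribes. One caveat worth stating as an assumption: because the macro rewrites every later use of new, standard headers are best included before it, since headers that use placement new (or that check for redefined keywords) may fail to compile after it.

```cpp
#include <cassert>   // standard headers go BEFORE the macro below

#ifndef DEBUG_NEW_H  // hypothetical header name: debug_new.h
#define DEBUG_NEW_H
#if defined(_MSC_VER) && defined(_DEBUG)
  #define _CRTDBG_MAP_ALLOC
  #include <stdlib.h>
  #include <crtdbg.h>
  // Debug-heap overload of operator new that records file and line.
  #define DEBUG_NEW new (_NORMAL_BLOCK, __FILE__, __LINE__)
  #define new DEBUG_NEW
#endif
#endif  // DEBUG_NEW_H

// Ordinary code keeps writing plain `new`; in an MSVC debug build the macro
// attaches this file name and line number to the allocation.
int* make_answer() {
    return new int(42);
}
```

In release builds and on other compilers the header expands to nothing, so the same source compiles unchanged.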

All in all, we now have what we wanted: each leaked block in the output is reported together with the file and line that allocated it, so I think it is obvious what the output should look like.

The price

Of course, organizing and maintaining the CRT's internal structures costs time and additional memory. How much?

UPD: Everything below is valid only for Win32 (tested on Vista SP1).

Creating 10 million ints with new (40 MB of memory in theory):

Build	Memory	Time
Debug CRT	~500 MB	3 s
Release	~160 MB	1 s

The figure of ~160 MB for the release build may be a little surprising. But this is normal: new allocates memory through the OS function HeapAlloc, which aligns data to addresses that are multiples of 16 (on Win32). When we allocate memory for a single character we get another 15 bytes, with which we could even do (but really should not do) something bad. For the debug build the result is quite predictable: add another sizeof(_CrtMemBlockHeader) multiplied by 10 million and we get just about 500 megabytes.
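The arithmetic can be checked with a few constants. This is only a rough model under stated assumptions: 4-byte pointers, int, long and size_t (Win32), a 4-byte no man's land area on each side, and every block rounded up to the 16-byte granularity mentioned above. The constant names and the round_up helper are invented for this example.

```cpp
#include <cassert>

// Round n up to the next multiple of step.
constexpr unsigned long long round_up(unsigned long long n,
                                      unsigned long long step) {
    return (n + step - 1) / step * step;
}

constexpr unsigned long long kCount       = 10000000ULL;  // 10 million ints
constexpr unsigned long long kIntBytes    = 4;
constexpr unsigned long long kGranularity = 16;   // per the text above (Win32)
// Assumed 32-bit header: 3 pointers + 4 integer fields + leading gap.
constexpr unsigned long long kHeaderBytes = 3 * 4 + 4 * 4 + 4;   // 32
constexpr unsigned long long kTrailingGap = 4;    // assumed guard size

// What the data alone would need.
constexpr unsigned long long kPayload = kCount * kIntBytes;
// Release: every 4-byte int occupies one 16-byte slot.
constexpr unsigned long long kRelease = kCount * round_up(kIntBytes, kGranularity);
// Debug: header + data + trailing gap, rounded up to the granularity.
constexpr unsigned long long kDebug =
    kCount * round_up(kHeaderBytes + kIntBytes + kTrailingGap, kGranularity);

static_assert(kPayload == 40000000ULL,  "40 MB of payload in theory");
static_assert(kRelease == 160000000ULL, "matches the ~160 MB measured in release");
static_assert(kDebug   == 480000000ULL, "close to the ~500 MB measured in debug");
```

Under these assumptions the model gives 160 MB for release and 480 MB for debug, which is in line with the measured figures.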

An interesting empirical result: in a release build, new int runs about 10% slower than a 4-byte HeapAlloc, is hardly distinguishable in speed from new int() (value initialization, that is, zero), and is 5-10% faster than HeapAlloc with the HEAP_ZERO_MEMORY flag.
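A comparison of this kind can be reproduced with a small timing harness. The sketch below is not the author's benchmark: time_allocations_ms is a hypothetical helper, it uses std::chrono rather than any Windows timer, and it covers only the two new variants (a HeapAlloc variant could be passed in the same way on Windows). The pointers are kept in a vector so the allocations stay live while they are being timed.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Time `count` single-int allocations made by `alloc`, in milliseconds.
// Only the allocation loop is timed; the blocks are freed afterwards.
template <class Alloc>
double time_allocations_ms(std::size_t count, Alloc alloc) {
    std::vector<int*> blocks(count);
    const auto start = std::chrono::steady_clock::now();
    for (int*& p : blocks) p = alloc();
    const auto stop = std::chrono::steady_clock::now();
    for (int* p : blocks) delete p;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, time_allocations_ms(10000000, [] { return new int; }) and time_allocations_ms(10000000, [] { return new int(); }) time the uninitialized and the zero-initialized variant respectively; the numbers will of course depend on the machine, the compiler and the build configuration.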

And now 128 thousand int[256] arrays via new int[256] (128 MB of memory in theory):

Build	Memory	Time
Debug CRT	~136 MB	172 ms
Release	~128.5 MB	60 ms

The results are predictable and quite satisfactory. The 1:3 speed ratio was confirmed on data of other sizes, including mixed data sizes and partial freeing of memory. But even without dynamic memory operations, Debug code runs several times slower than Release!

Conclusion

Memory leaks can be dealt with using bare hands. Of course, our "raw" output is not as useful as a leak tree or a list of code locations sorted by total leaked size in descending order (although all of that can easily be generated from our output). But for small projects or tasks it may do the trick. The method also needs almost no maintenance (not quite "write and forget" because of the redefinition of new, but close to it), and the entry barrier is much lower than for serious analyzers.
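As an illustration of how such a sorted list could be produced, here is a small sketch that aggregates a leak dump by allocation site. It assumes each leaked block is reported on one line of the form "file(line) : {request} normal block at 0xADDRESS, N bytes long."; LeakSite and summarize_leaks are hypothetical names, and blocks reported without a file are grouped under "(unknown location)".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct LeakSite {
    std::string location;      // "file(line)" as printed in the dump
    std::size_t bytes  = 0;    // total bytes leaked from this location
    std::size_t blocks = 0;    // number of leaked blocks
};

// Parse dump text and return allocation sites sorted by leaked bytes, descending.
std::vector<LeakSite> summarize_leaks(const std::string& dump) {
    std::map<std::string, LeakSite> by_location;
    std::istringstream in(dump);
    std::string line;
    while (std::getline(in, line)) {
        // Only block lines end with "..., N bytes long."
        const std::size_t tail = line.rfind(" bytes long");
        if (tail == std::string::npos || tail == 0) continue;
        const std::size_t comma = line.rfind(", ", tail);
        if (comma == std::string::npos) continue;
        const std::string digits = line.substr(comma + 2, tail - comma - 2);
        if (digits.empty() ||
            digits.find_first_not_of("0123456789") != std::string::npos)
            continue;
        // Everything before " : {" is the file(line) prefix, when present.
        const std::size_t sep = line.find(" : {");
        const std::string where = (sep == std::string::npos)
                                      ? std::string("(unknown location)")
                                      : line.substr(0, sep);
        LeakSite& site = by_location[where];
        site.location = where;
        site.bytes += std::stoul(digits);
        site.blocks += 1;
    }
    std::vector<LeakSite> result;
    for (const auto& entry : by_location) result.push_back(entry.second);
    std::sort(result.begin(), result.end(),
              [](const LeakSite& a, const LeakSite& b) {
                  return a.bytes != b.bytes ? a.bytes > b.bytes
                                            : a.location < b.location;
              });
    return result;
}
```

Feeding it the contents of the log file produced in Step 3 yields one entry per allocation site, with the heaviest leak first.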

Well, that is probably all, apart from the source code for recreating the complete picture.

UPD: The method is not meant to compete with external analyzers, because the goals are somewhat different, but mentions of worthwhile tools are very welcome (only, please, without repeats).

Source: https://habr.com/ru/post/82514/
