
delete, new[] in C++ and urban legends about combining them

If C++ code creates an array of objects with “new[]”, that array must be deleted with “delete[]” and never with “delete” (without brackets). A reasonable question: or else what?

This question attracts a wide range of unfounded answers. For example, “only the first object will be deleted, the rest will leak” or “only the first object’s destructor will be called”. The "explanations" that follow usually do not withstand any serious scrutiny.

According to the C++ Standard, the behavior in this situation is undefined. All the conjectures are nothing more than popular urban legends. Let us examine in detail why.
We need a cunning plan: an example that will confound the supporters of the urban legends. A harmless-looking one like this will do:

 #include <cstdio>

 class Class {
 public:
     ~Class() { printf( "Class::~Class()" ); }
 };

 int main() {
     delete new Class[1]; // mismatch on purpose: undefined behavior
     return 0;
 }

There is only one object in the array. If you believe either of the two legends above, “everything will be fine”: there is nothing to leak, and the destructor will be called exactly as many times as needed.

We go to codepad.org, paste the code into the form, and get this output:

  memory clobbered before allocated block
 Exited: ExitFailure 127
 42 75 67 20 61 73 73 61 73 73 69 6E 20 77
 61 6E 74 65 64 20 2D 20 77 77 77 2E 61 62
 62 79 79 2E 72 75 2F 76 61 63 61 6E 63 79

Memory WHAT??? What was that?

Second example:

 int main() { delete new char[1]; return 0; } 

Output:

  No errors or program output. 

Here, at least, everything looks fine. What is going on? Why does this happen? Why is the behavior different?

The reason lies in what happens under the hood.

When “new Type[count]” appears in the code, the program must allocate enough memory to store the specified number of objects. For this it uses the function “operator new[]()”. This function allocates memory; usually it is just a call to malloc() plus a check of the return value (calling the new_handler and throwing an exception if necessary). Then the objects are constructed in the allocated memory: the required number of constructors is called. The result of “new Type[count]” is the address of the first element of the array.
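The allocation step described above can be sketched roughly as follows. This is only an illustration of the typical malloc-plus-new_handler pattern, not the code of any particular runtime; the names sketch_array_alloc and sketch_array_free are hypothetical stand-ins for “operator new[]()” and “operator delete[]()”.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for what a default "operator new[]()" often does.
void* sketch_array_alloc(std::size_t size) {
    if (size == 0) size = 1;  // malloc(0) is allowed to return a null pointer
    for (;;) {
        if (void* p = std::malloc(size)) return p;  // success: raw, unconstructed memory
        // Allocation failed: give the installed new_handler a chance to help.
        std::new_handler handler = std::get_new_handler();
        if (!handler) throw std::bad_alloc();  // no handler installed: report failure
        handler();  // may release memory, throw, or terminate the program
    }
}

// Hypothetical stand-in for the matching "operator delete[]()".
void sketch_array_free(void* p) noexcept {
    std::free(p);
}
```

Note that a function like this only sees a byte count. It knows nothing about element types, constructors, or the number of elements; all of that is handled by code the compiler generates around the call.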

When “delete[] pointer” appears in the code, the program must destroy all the objects in the array by calling their destructors. For this (and only for this) it needs to know the number of elements.

An important point: in the expression “new Type[count]” the number of elements is given explicitly, whereas “delete[]” receives only the address of the first element.

How does the program find out the number of elements? Since it has only the address of the first element, it must determine the length of the array from that address alone. How this is done depends on the implementation, but the following method is commonly used.

When executing “new Type[count]”, the program allocates enough memory to hold not only the objects but also an unsigned integer (usually of type size_t) holding the number of objects. This number is written at the beginning of the allocated block, and the objects are placed after it. When compiling “new Type[count]”, the compiler inserts code into the program that implements these tricks.

So, when executing “new Type[count]”, the program allocates slightly more memory than the objects need, writes the number of elements at the beginning of the allocated block, calls the constructors, and returns the address of the first element to the calling code. The address of the first element differs from the address returned by the memory allocation function “operator new[]()”.

When executing “delete[]”, the program takes the address of the first element passed to “delete[]”, computes the address of the beginning of the block (by subtracting exactly the same offset that was added when executing “new[]”), reads the number of elements from the beginning of the block, calls the required number of destructors, and then calls the function “operator delete[]()”, passing it the address of the beginning of the block.

In both cases, the calling code works with an address that is not the one returned by the memory allocation function and later passed to the memory deallocation function.
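The scheme just described can be emulated by hand. The sketch below is a minimal illustration under stated assumptions, not what any particular compiler generates: emulate_new_array, emulate_delete_array, header_size and Tracked are hypothetical names, malloc()/free() stand in for “operator new[]()” and “operator delete[]()”, the element type is assumed not to need more alignment than malloc() provides, and exception safety during construction is left out.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type that counts live objects so the effect is observable.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Space reserved in front of the elements: big enough for the count and
// padded so that the first element stays suitably aligned (an assumption
// of this sketch; real implementations choose their own layout).
template <typename T>
constexpr std::size_t header_size() {
    return sizeof(std::size_t) > alignof(T) ? sizeof(std::size_t) : alignof(T);
}

// Roughly what "new T[count]" does under this scheme.
template <typename T>
T* emulate_new_array(std::size_t count) {
    void* block = std::malloc(header_size<T>() + count * sizeof(T));
    if (!block) throw std::bad_alloc();
    new (block) std::size_t(count);  // element count goes at the start of the block
    T* first = reinterpret_cast<T*>(static_cast<char*>(block) + header_size<T>());
    for (std::size_t i = 0; i < count; ++i) new (first + i) T();  // construct in place
    return first;  // NOT the address malloc() returned
}

// Roughly what "delete[] first" does under this scheme.
template <typename T>
void emulate_delete_array(T* first) {
    if (!first) return;
    char* block = reinterpret_cast<char*>(first) - header_size<T>();  // undo the offset
    std::size_t count = *reinterpret_cast<std::size_t*>(block);       // read the count back
    while (count > 0) first[--count].~T();  // destroy elements in reverse order
    std::free(block);                       // free the block by its true start address
}
```

Passing the pointer returned by emulate_new_array() straight to free() would hand the allocator an address it never returned, which is the kind of mismatch a mixed-up “delete” can produce.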

Now back to the first example. When “delete” (without brackets) is executed, the calling code has no idea that it needs to perform this dance with the address offset. Most likely, it calls the destructor of a single object and then passes to the function “operator delete()” an address that differs from the one previously returned by “operator new[]()”.

What should happen? On this implementation, the program crashes. Since the Standard says the behavior is undefined, this is permissible.

For comparison, a program built with Visual C++ 9 with default settings duly reports an error in the debug version, yet otherwise seems to run fine (at least, the _heapchk() function returns _HEAPOK and _CrtDumpMemoryLeaks() reports nothing). This is also permissible.

Why is the behavior different in the second example? Most likely, the compiler took into account that the type char has a trivial destructor: nothing needs to be done to destroy the objects, it is enough to free the memory. Therefore the number of elements does not need to be stored, which means the calling code can be given the very same address that the function “operator new[]()” returned. No address offsets, just as with “new” (without brackets). This compiler behavior is fully consistent with the Standard.
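The distinction between the two examples can be checked with a standard type trait. The sketch below only shows which types have a trivial destructor; whether a given compiler actually skips storing the element count for them is an implementation detail that the trait does not reveal. The names WithDestructor and array_needs_destructor_calls are hypothetical.

```cpp
#include <type_traits>

// Hypothetical class mirroring the first example: a user-provided destructor.
class WithDestructor {
public:
    ~WithDestructor() {}
};

// char has nothing to run on destruction; WithDestructor does.
static_assert(std::is_trivially_destructible<char>::value,
              "char has a trivial destructor");
static_assert(!std::is_trivially_destructible<WithDestructor>::value,
              "a user-provided destructor is not trivial");

// True when destroying an array of T requires destructor calls, i.e. when
// the implementation has to know the element count at deletion time.
template <typename T>
constexpr bool array_needs_destructor_calls() {
    return !std::is_trivially_destructible<T>::value;
}
```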

Something is missing…

Have you noticed that the memory allocation and deallocation functions above appear both with and without square brackets? These are not typos: they are two different pairs of functions, and they can be implemented in completely different ways. Even when the compiler tries to economize, it always calls the function “operator new[]()” when it sees “new Type[count]” in the code, and always calls the function “operator new()” when it sees “new Type”.

Typically, the implementations of the functions “operator new()” and “operator new[]()” are the same (both call malloc()), but they can be replaced: you can define your own, you can replace either pair or both, and you can also replace these functions separately for any chosen class. The Standard lets you do this as much as you like (of course, you must replace the matching deallocation function accordingly).
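Per-class replacement makes it easy to observe which pair of functions each form of “new” and “delete” uses. In the sketch below the class Widget and its counters are hypothetical and exist only to make the calls visible; each replacement simply forwards to the corresponding global function.

```cpp
#include <cstddef>
#include <new>

// Hypothetical class with its own allocation and deallocation functions.
struct Widget {
    static int single_allocs, array_allocs, single_frees, array_frees;

    // Used by "new Widget".
    static void* operator new(std::size_t size) {
        ++single_allocs;
        return ::operator new(size);
    }
    // Used by "delete pointer".
    static void operator delete(void* p) noexcept {
        ++single_frees;
        ::operator delete(p);
    }
    // Used by "new Widget[count]".
    static void* operator new[](std::size_t size) {
        ++array_allocs;
        return ::operator new[](size);
    }
    // Used by "delete[] pointer".
    static void operator delete[](void* p) noexcept {
        ++array_frees;
        ::operator delete[](p);
    }

    int payload = 0;
};

int Widget::single_allocs = 0;
int Widget::array_allocs = 0;
int Widget::single_frees = 0;
int Widget::array_frees = 0;
```

With correctly matched calls, “new Widget” and “delete” touch only the first pair of counters, while “new Widget[4]” and “delete[]” touch only the second. Nothing stops the two pairs from using entirely different memory sources.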

This opens up rich opportunities for undefined behavior. If your code frees memory with the “wrong” function, the consequences can be anything: heap corruption, memory corruption, or immediate program termination. In the first example, the implementation of the “operator delete()” function could not cope with the address given to it, and the program crashed.

The most delightful part of this story is that you can never claim that using “delete” instead of “delete[]” (or vice versa) leads to some specific result. The Standard says that the behavior is undefined. Even a fully Standard-compliant compiler is not required to give you a program with any sensible behavior. The behavior of the program that you will cite in comments and arguments is only what is observable; anything can be happening inside. You are merely stating the behavior you observe.

In the second example, everything looks fine... on this implementation. On another implementation, the functions “operator new()” and “operator new[]()” may, for example, be implemented on different heaps (Windows allows a process to create more than one heap). What happens when a block is returned to the “wrong” heap?

By the way, by relying on some specific behavior in this situation you automatically get non-portable code. Even if "everything works" on the current implementation, you may be extremely unpleasantly surprised when switching to another compiler, changing the compiler version, or even just updating the C++ runtime.

What to do? Accept it, do not mix up “delete” and “delete[]”, and most importantly, do not waste time on “plausible” explanations of what will allegedly happen if you do mix them up. While you argue, other developers will be doing something useful, and you will only improve your chances of earning a Darwin Award.

Dmitry Mescheryakov
Product Development Department

Source: https://habr.com/ru/post/117208/

