
Leaks do not violate memory safety

Memory access errors and memory leaks are the two categories of memory bugs that attract the most attention, and a great deal of effort goes into preventing them or at least reducing their number. Although the names suggest similarity, the two are in some ways diametrically opposed, and solving one problem does not rid us of the other. The wide adoption of managed languages confirms this: they prevent some memory access errors by taking over the job of freeing memory.

Simply put: a memory access error is an action performed on incorrect data, and a memory leak is the absence of a certain action on correct data. In tabular form:


              Correct data    Incorrect data
Used          OK              Memory access error
Not used      Memory leak     OK

The best programs perform only the actions in the OK cells: they use correct data and do not use incorrect data. Acceptable programs may also contain some correct but unused data (memory leaks), while bad programs try to use incorrect data.

When a language promises memory safety, as Rust does, it does not thereby guarantee the absence of memory leaks.


Consequences


In practice, the most important difference between memory access errors and leaks lies in their potential consequences: the former lead to very serious problems, while the latter are merely annoying.

Memory safety is a key element of every other form of safety and correctness. If a program makes memory errors, it is difficult to give any guarantees about its behavior, because memory corruption cannot be ruled out. Attackers can exploit memory access errors in such a program to read secret keys directly from server memory or to execute arbitrary code on your computer.

A memory leak, on the other hand, will at worst lead to a denial of service: a useful program crashes because it uses too much memory, or the computer stops responding because memory runs short. An attacker can trigger this situation too, but techniques for dealing with it are well established. Denial of service is certainly very annoying, and in some settings it is a critical problem, but potential memory access errors are no less of a problem, and probably a greater one. Moreover, because the effects of memory access errors are unpredictable, they too can end in a denial of service.

As a result, most programming languages prefer to tolerate memory leaks (that is, data that is not released or cleaned up after its last use) but not memory access errors. Most "safe" languages therefore guarantee that programs written in them contain no such errors unless you deliberately bypass the restrictions (for example, with the ctypes module in Python or the unsafe keyword in Rust). As a rule they try to combat leaks as well, but they give no guarantees about them.


Removing free


Memory access errors have several causes, but one category stands out when we discuss memory management. To quote Wikipedia:


Dynamic memory errors - incorrect use of dynamic memory and pointers:



In this list, only null pointer dereference is not caused by freeing memory incorrectly (that is, by a faulty call to the free function, which marks memory as unused and returns it to the operating system). All the other problems can therefore be avoided simply by never calling free: if memory is never released, no problem associated with releasing it can occur. In terms of the table above: by removing memory release we remove the "incorrect data" column, and all data is always correct.

Of course, simply banning calls to free has drawbacks (though it also has advantages beyond eliminating deallocation bugs: data lifetimes become trivial to reason about, which simplifies writing many parallel algorithms). In particular, it becomes hard to write programs that do not exhaust all available memory. However, computers, unlike people, do not make mistakes, so perhaps we can shift the job of calling free onto their shoulders...
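
The "never free" idea and its benefit for parallel code can be sketched in Rust, the language this article goes on to discuss. This is only an illustration under assumptions: the helper names leak_numbers and parallel_sum are hypothetical, and Box::leak stands in for "memory that is never released". Because the data is never freed, a reference to it is valid for the rest of the program and can be handed to threads without any lifetime bookkeeping.

```rust
use std::thread;

/// Hypothetical helper: build some data and deliberately never free it.
fn leak_numbers(n: u64) -> &'static [u64] {
    let data: Vec<u64> = (1..=n).collect();
    // Box::leak gives up ownership of the allocation. It is never freed,
    // so the returned reference can never dangle.
    Box::leak(data.into_boxed_slice())
}

/// Hypothetical helper: sum the slice in two threads.
/// No reference counting or scoped threads are needed,
/// because the data lives for the whole program ('static).
fn parallel_sum(data: &'static [u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    let a = thread::spawn(move || left.iter().sum::<u64>());
    let b = thread::spawn(move || right.iter().sum::<u64>());
    a.join().unwrap() + b.join().unwrap()
}

fn main() {
    let data = leak_numbers(100);
    println!("sum = {}", parallel_sum(data));
}
```

The price is exactly the drawback described above: every call to leak_numbers consumes memory that is never returned.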


Optimizing leaks


Much modern code is written in languages designed to ensure memory safety, such as Java, JavaScript, Python, or Ruby. They do without explicit free calls and manage memory automatically (hence the name "managed languages") using a garbage collector built into the language runtime.

In essence, garbage collection gives the programmer and the program the illusion of infinite memory and removes the need to track carefully when memory can be freed. You focus on the domain logic, and the garbage collector automatically frees memory that is guaranteed to be unused. Virtually all garbage collectors decide conservatively which data can be deleted: only data that is no longer referenced (so the garbage collector must track, or have access to, all memory allocations).

It is worth noting that high-quality garbage collector implementations bring additional benefits: with generational collection, allocation is usually just a pointer bump, and a moving collector can improve cache locality (which is especially useful given that most managed languages access data through pointers). These features, however, are outside the scope of this article.

In practice, programmers almost never have to think about the fact that memory is not infinite, and memory safety is ensured as intended. That said, in high-performance code you often have to resort to tricks (for example, object pools, which avoid allocation when objects are created and destroyed frequently) to work around the inefficiency of garbage collection. It is also possible for data to live longer than necessary because of forgotten references.

Nevertheless, the goal is achieved despite these practical problems: the absence of free calls guarantees the absence of (some) memory problems.


Fewer abstractions


I cannot fail to mention an alternative to garbage collection: instead of trying to eliminate the "incorrect data" column altogether, a language can guarantee only that incorrect data is never used, that is, the absence of memory safety problems. The Rust programming language does exactly that.

With this approach, resources are released automatically when their owner goes out of scope, so no manual release is needed, although the drop function lets you run a destructor early. Unlike C and C++, Rust rejects any further use of such released data at compile time, which prevents these errors.

However, this model does not guarantee the absence of leaks: the revised table for Rust (and for languages built on the same principle) still contains the "Memory leak" cell.
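
A minimal sketch of the early-destructor case, assuming a hypothetical Resource type whose destructor records that it ran (the names Resource and early_drop are illustrative and do not come from any library). The commented-out line is the kind of use of released data that the compiler rejects.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource that records when its destructor runs.
struct Resource {
    dropped: Rc<Cell<bool>>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Returns the flag as seen before and after the explicit drop.
fn early_drop() -> (bool, bool) {
    let flag = Rc::new(Cell::new(false));
    let res = Resource { dropped: Rc::clone(&flag) };
    let before = flag.get();

    drop(res); // the destructor runs here, before the end of the scope

    // Uncommenting the next line makes compilation fail,
    // because `res` was moved into `drop` and can no longer be used:
    // let _still_there = &res.dropped;

    (before, flag.get())
}

fn main() {
    let (before, after) = early_drop();
    println!("destructor ran before drop: {}, after drop: {}", before, after);
}
```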


              Correct data    Incorrect data
Used          OK              Ruled out by the compiler
Not used      Memory leak     OK

Many people do not see the difference between "memory leaks" and "memory safety". Having heard that Rust guarantees memory safety, they expect protection against leaks as well, and do not realize that Rust is simply a language that does for low-level programming what modern C++ does, and somewhat more. Rust rules out memory access errors, but it does not rule out leaks.
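
As an illustration of a leak in entirely safe Rust, here is a sketch that uses a reference-counted cycle. The Node type and the helpers make_pair and count_drops are hypothetical names introduced for this example. Each node counts its destructor call, so a count of zero means the nodes were never released.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical list node that counts destructor calls in a shared counter.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<u32>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Builds two nodes where b points to a; with `cycle`, a also points back to b.
fn make_pair(drops: &Rc<Cell<u32>>, cycle: bool) {
    let a = Rc::new(Node { next: RefCell::new(None), drops: Rc::clone(drops) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))), drops: Rc::clone(drops) });
    if cycle {
        // Closing the cycle: each node now keeps the other alive.
        *a.next.borrow_mut() = Some(Rc::clone(&b));
    }
    // `a` and `b` go out of scope here. Without the cycle both nodes are freed;
    // with it, both reference counts stay at 1 and nothing is freed.
}

/// Returns how many destructors ran after the pair went out of scope.
fn count_drops(cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    make_pair(&drops, cycle);
    drops.get()
}

fn main() {
    println!("without cycle: {} destructors ran", count_drops(false));
    println!("with cycle:    {} destructors ran", count_drops(true));
}
```

No unsafe code appears anywhere in the sketch, and no incorrect data is ever touched: the leaked nodes are valid memory that is simply never used again.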


std::mem::forget


Finally, Rust has the function forget, which takes ownership of a value, preventing further access to it, but does not run its destructor, which can lead to a memory leak. For a long time this function was marked unsafe; that is, Rust implied that leaking memory is something a programmer must choose deliberately, just like giving up memory safety. In practice, however, reference-counted pointers and thread deadlocks can also cause leaks in safe code. As a result, forget was made safe: Rust concentrates on preventing memory access errors and, like every other language, makes a best effort against leaks without guaranteeing their absence.

Like modern C++, Rust does a pretty good job here: RAII-based resource management, namely destructors, is a powerful tool for managing memory (and not only memory), especially in combination with the move semantics that Rust uses by default. The absence of leaks is not guaranteed for two reasons:
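
A short sketch of what std::mem::forget does, assuming a hypothetical Guard type that counts destructor calls (Guard and drops_after are illustrative names). The call to forget needs no unsafe block.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

/// Hypothetical guard that counts destructor calls in a shared counter.
struct Guard {
    drops: Rc<Cell<u32>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns how many destructors ran after the guard was
/// either forgotten or dropped normally.
fn drops_after(forget: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    let guard = Guard { drops: Rc::clone(&drops) };
    if forget {
        // Safe function: takes ownership, so `guard` cannot be used again,
        // but its destructor is never run.
        mem::forget(guard);
    } else {
        drop(guard);
    }
    drops.get()
}

fn main() {
    println!("after drop:   {} destructor call(s)", drops_after(false));
    println!("after forget: {} destructor call(s)", drops_after(true));
}
```

In the forgotten case the counter allocation itself is also leaked, because the Rc clone held by the guard is never released.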



The Rust standard library assumes that leaks are safe, even though they can lead to incorrect operation. In other words, you may get undesirable behavior if data is not released, but the consequences are less destructive than a segmentation fault or memory corruption.

Source: https://habr.com/ru/post/281370/

