
C and C++ have the keyword volatile, which tells the compiler that the value in the corresponding memory area can change at any moment, so accesses to that area must not be optimized away. Descriptions of the keyword usually follow up immediately with an example of data that can be changed at any time by another thread, by hardware, or by the operating system. Having read that example, most readers yawn deeply, decide they will never need any of this, and move on to the next section.
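For reference, the classic example usually looks something like this (a minimal sketch of my own, not from the original post): a flag set from a signal handler, which the compiler would otherwise be free to cache in a register.

#include <csignal>

volatile std::sig_atomic_t stop = 0; // may change "behind the compiler's back"

void onSignal( int ) { stop = 1; }

int main()
{
    std::signal( SIGINT, onSignal );
    while( !stop ) // without volatile, the check could be hoisted out of the loop
    {
        // do work
    }
}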
Today we will consider a less exotic scenario for using the volatile keyword.
The C++ Standard defines so-called observable behavior as the sequence of input/output operations and of reads and writes of data declared volatile (1.9/6). As long as observable behavior is preserved, the compiler is allowed to optimize the code in any way it likes.
For example... Your code allocates memory from the operating system, and you want the operating system to commit physical memory pages for the entire requested region. Many operating systems commit pages on first actual access, which can introduce delays later, and you might want to avoid those delays by moving them to an earlier point. You could write this code:
for( char* ptr = start; ptr < start + size; ptr += MemoryPageSize ) { *ptr; }
This code walks over the region and reads one byte from each memory page. One problem: the compiler will optimize this code away entirely. It has every right to, because this code does not affect observable behavior. Your concerns about page allocation by the operating system and the resulting delays have nothing to do with observable behavior.
What to do, what to do ... Oh, right! Let's ban the compiler from optimizing this code.
#pragma optimize( "", off ) for( char* ptr = start; ptr < start + size; ptr += MemoryPageSize ) { *ptr; } #pragma optimize( "", on )
Great, as a result ...
1. we used #pragma, which makes the code poorly portable, and...
2. optimization is turned off completely, which increases the amount of machine code about threefold; moreover, in Visual C++, for example, this #pragma can only be used outside a function, so you can no longer count on this code being inlined into the caller and optimized further.
The keyword volatile would help a lot here:
for( volatile char* ptr = start; ptr < start + size; ptr += MemoryPageSize ) { *ptr; }
This achieves exactly the desired effect: the code instructs the compiler to actually perform reads with the specified stride. The optimizer has no right to change this behavior, because the sequence of reads now belongs to observable behavior.
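Putting it all together, a self-contained sketch could look like the following (the function name prefaultPages and the hard-coded 4096-byte page size are my assumptions; real code should ask the operating system for the page size):

#include <cstddef>

const std::size_t MemoryPageSize = 4096; // assumption; query the OS in real code

// Touches one byte per page so the OS commits physical pages up front.
void prefaultPages( char* start, std::size_t size )
{
    for( volatile char* ptr = start; ptr < start + size; ptr += MemoryPageSize )
        *ptr; // a volatile read is observable behavior and cannot be removed
}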
Now let's try overwriting memory in the name of security and paranoia (this is not nonsense, this is how it happens in real life). That post mentions a certain magic function SecureZeroMemory() that supposedly guarantees overwriting the specified memory area with zeros. If instead you use memset() or an equivalent hand-written loop, for example like this:
for( size_t index = 0; index < size; index++ ) ptr[index] = 0;
on a local variable, there is a risk that the compiler will delete this loop, because the loop does not affect observable behavior (the arguments in that post, once again, have nothing to do with observable behavior).
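To make the risk concrete, here is a hypothetical function (the names are mine) where this loop is a prime candidate for removal: the buffer is never read again, so the stores are dead as far as observable behavior is concerned.

#include <cstddef>

void processPassword()
{
    char password[64];
    // ... obtain and use the password ...
    for( std::size_t index = 0; index < sizeof( password ); index++ )
        password[index] = 0; // dead stores: the compiler may delete the loop
} // password goes out of scope with its contents possibly intact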
What to do, what to do... Ah, let's "deceive" the compiler... Here is what a search for "prevent memset optimization" turns up:
1. replacing the local variable with a variable in dynamic memory, with all the attendant overhead and the risk of a leak (message in the linux-kernel mailing list archive);
2. a macro with assembler magic (message in the linux-kernel mailing list archive); a sketch of the idea follows after this list;
3. a suggestion to use a special preprocessor symbol that prevents inlining of memset() and makes it harder for the compiler to optimize (of course, the library version in use has to support this symbol; besides, Visual C++ 10, for example, can optimize even the code of functions marked as non-inlinable);
4. various read-write sequences involving global variables (the code grows noticeably, and such code is not thread-safe);
5. reading the data back and reporting an error if the wrong values were written (the compiler is entitled to notice that there can be no "wrong values" and delete this code).
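To illustrate item 2, here is a sketch of the barrier idea as it appears in gcc-compatible compilers (my rendering of the technique, not the exact code from the mailing list; the macro name is made up): an empty asm statement with a "memory" clobber makes the compiler assume the buffer was observed, so the preceding memset() cannot be discarded.

#include <cstring>

// The empty asm takes p as an input and claims to touch memory, so the
// compiler must assume the zeroed bytes are observed and keep the stores.
#define SECURE_MEMZERO( p, n )                                \
    do {                                                      \
        std::memset( ( p ), 0, ( n ) );                       \
        __asm__ __volatile__( "" : : "r"( p ) : "memory" );   \
    } while( 0 )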
All these methods share common traits: they are poorly portable and hard to verify. For example, you "deceive" one version of the compiler, while the next version has a smarter analyzer that figures out the code is meaningless and removes it, and not even everywhere, but only in some places.
You can move the overwriting function into a separate translation unit so that the compiler does not "see" what it does. Then, after the next toolchain change, link-time code generation enters the game (LTCG in Visual C++, LTO in gcc, or whatever it is called in the compiler you use), the compiler once again sees the whole picture, decides that overwriting the memory "makes no sense", and deletes it.
No wonder there is a saying: you can't lie to the compiler.
But what if we look at a typical implementation of SecureZeroMemory()? It is essentially as follows:
volatile char* volatilePtr = static_cast<volatile char*>(ptr);
for( size_t index = 0; index < size; index++ )
    volatilePtr[index] = 0;
And that's all - the compiler no longer has the right to delete these writes... or so it would seem. EXTREMELY UNEXPECTEDLY, and despite all superstitions, the statement above is incorrect.
In fact, it does have that right. The Standard says the sequence of reads and writes must be preserved only for data that itself carries the volatile qualifier. Like this:
volatile char buffer[size];
If the data itself does not have the volatile qualifier, and volatile is added only to a pointer to that data, then reads and writes of that data no longer belong to observable behavior:
char buffer[size];
SecureZeroMemory( buffer, sizeof( buffer ) );
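A hypothetical side-by-side illustration of the difference (the function names are mine): in the first function the writes belong to observable behavior and must be performed; in the second, the Standard does not require them to survive.

#include <cstddef>

void clearVolatileData()
{
    volatile char buffer[64];
    // ... use buffer ...
    for( std::size_t index = 0; index < sizeof( buffer ); index++ )
        buffer[index] = 0; // the data is volatile: these writes are
                           // observable behavior and must be performed
}

void clearPlainData()
{
    char buffer[64];
    // ... use buffer ...
    volatile char* volatilePtr = buffer;
    for( std::size_t index = 0; index < sizeof( buffer ); index++ )
        volatilePtr[index] = 0; // volatile only on the pointer: the Standard
                                // does not require these writes to be kept
}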
All hope rests with the compiler developers: at the moment, neither Visual C++ nor gcc optimizes away memory accesses through volatile-qualified pointers, not least because this is one of the important use cases for such pointers.
There is no standard-guaranteed way to overwrite data with a function equivalent to SecureZeroMemory() if the variable holding the data does not have the volatile qualifier. Likewise, there is no guaranteed way to read memory as in the code at the very beginning of this post. All the available solutions are less than fully portable.
The reason for this is trivial - it is “not necessary.”
A situation where a variable goes out of scope, the memory it occupied is reused for another variable, and the new variable is then read without prior initialization is undefined behavior. The Standard explicitly says that any behavior is permissible in such cases. Usually you simply read whatever "garbage" was stored in that memory before.
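Here is a sketch of that scenario (hypothetical code; the read is undefined behavior by design, so any concrete outcome is luck rather than a guarantee):

#include <cstdio>
#include <cstring>

void writeSecret()
{
    char secret[16];
    std::strcpy( secret, "hunter2" ); // secret data in a stack variable
} // secret goes out of scope; nothing clears its bytes

void readGarbage()
{
    char leftover[16]; // deliberately uninitialized
    // Undefined behavior: often prints whatever the previous call left
    // on the stack, but the Standard permits absolutely anything here.
    for( unsigned index = 0; index < sizeof( leftover ); index++ )
        std::printf( "%02x ", static_cast<unsigned char>( leftover[index] ) );
    std::printf( "\n" );
}

int main()
{
    writeSecret();
    readGarbage();
}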
Therefore, from the Standard's point of view, a guaranteed overwrite of such variables before they go out of scope makes no sense. Likewise, reading memory purely for the sake of reading memory makes no sense.
Using volatile pointers is probably the most practical way to solve these problems. First, compiler developers usually deliberately disable optimization of such memory accesses. Second, the overhead is minimal. Third, it is relatively easy to check whether the method works on a particular implementation: just look at the machine code generated for the trivial examples from this post.
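For example, assuming gcc, you could put the overwrite loop into a file of its own, compile it with "g++ -O2 -S", and inspect the listing (the file and function names are arbitrary):

// check.cpp: compile with  g++ -O2 -S check.cpp  and look inside check.s.
// If volatile is respected, the output contains a loop of byte stores
// (movb $0, ... on x86) instead of a memset call or nothing at all.
void wipe( char* ptr, unsigned size )
{
    volatile char* volatilePtr = ptr;
    for( unsigned index = 0; index < size; index++ )
        volatilePtr[index] = 0;
}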
volatile is not only for drivers and operating systems.
Dmitry Mescheryakov,
product department for developers