
Safe Clearing of Private Data


Programs often have to store private data such as passwords, keys, and values derived from them. Once this data has been used, every trace of it usually has to be cleared from memory so that an attacker cannot get at it. This post explains why memset() cannot be used for this purpose.

memset()


You may have already read an article describing the vulnerability of programs that use memset() to wipe memory. That article, however, does not cover all the ways memset() can be misused. Problems arise not only when clearing buffers created on the stack, but also when clearing buffers allocated in dynamic memory.

Stack


First, consider the case from the article mentioned above, with a variable created on the stack.

Let's write some code that works with a password:

    #include <string>
    #include <functional>
    #include <iostream>
    #include <cstring>   // memset

    // Private data
    struct PrivateData
    {
        size_t m_hash;
        char m_pswd[100];
    };

    // Does something with the password
    void doSmth(PrivateData& data)
    {
        std::string s(data.m_pswd);
        std::hash<std::string> hash_fn;
        data.m_hash = hash_fn(s);
    }

    // Reads and processes the password
    int funcPswd()
    {
        PrivateData data;
        std::cin >> data.m_pswd;
        doSmth(data);
        memset(&data, 0, sizeof(PrivateData));
        return 1;
    }

    int main()
    {
        funcPswd();
        return 0;
    }

The example is rather contrived and completely synthetic.
If we build a debug version and run this code under the debugger (I used Visual Studio 2015), we will see that everything is in order: the password and the computed hash are erased after use.

Let's look at the assembly code in the Visual Studio debugger:

    ....
    doSmth(data);
    000000013F3072BF  lea   rcx,[data]
    000000013F3072C3  call  doSmth (013F30153Ch)
    memset(&data, 0, sizeof(PrivateData));
    000000013F3072C8  mov   r8d,70h
    000000013F3072CE  xor   edx,edx
    000000013F3072D0  lea   rcx,[data]
    000000013F3072D4  call  memset (013F301352h)
    return 1;
    000000013F3072D9  mov   eax,1
    ....

We can see the call to memset(), which clears the private data after use.

It would seem we could stop here, but let's build a release version with optimization enabled and see what the debugger shows:

    ....
    000000013F7A1035  call  std::operator>><char,std::char_traits<char> > (013F7A18B0h)
    000000013F7A103A  lea   rcx,[rsp+20h]
    000000013F7A103F  call  doSmth (013F7A1170h)
    return 0;
    000000013F7A1044  xor   eax,eax
    ....

As you can see, all the instructions corresponding to the memset() call have been removed. The compiler decided there is no point in calling a function that clears data which is never used again. This is not a bug but a legitimate optimization: from the point of view of the language, the memset() call is unnecessary because the buffer is not read afterwards, so removing the call does not affect the program's observable behavior. As a result, our private data is not removed from memory, which is very bad.
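To make this reasoning concrete, here is a minimal sketch. The function names wipeThenDiscard and wipeThenRead are hypothetical and are not part of the program above. In the first function nothing reads the buffer after memset(), so the stores are dead and an optimizer is allowed to drop them. In the second, the return value depends on the zeroed bytes, so the compiler has to preserve the result, although it remains free to compute that result without actually touching memory.

```cpp
#include <cassert>
#include <cstring>

// Nothing reads buf after memset(), so an optimizing compiler may
// treat the call as a dead store and remove it.
int wipeThenDiscard(const char* secret)
{
    assert(secret != nullptr);
    char buf[32];
    std::strncpy(buf, secret, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    int len = static_cast<int>(std::strlen(buf));
    std::memset(buf, 0, sizeof(buf));   // dead store: buf is never read again
    return len;
}

// The result depends on the zeroed contents, so the effect of memset()
// is observable and the returned value must reflect it.
int wipeThenRead(const char* secret)
{
    assert(secret != nullptr);
    char buf[32];
    std::strncpy(buf, secret, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    std::memset(buf, 0, sizeof(buf));
    int sum = 0;
    for (char c : buf)
        sum += c;                       // reads every zeroed byte
    return sum;
}
```

Both functions behave identically from the caller's point of view whether or not the stores are emitted, which is exactly why the compiler is entitled to skip them.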

Heap


Now let's dig deeper and check what happens to data placed in dynamic memory with the malloc function or the new operator.

Let's modify our code to work with malloc:

    #include <string>
    #include <functional>
    #include <iostream>
    #include <cstdlib>   // malloc, free
    #include <cstring>   // memset

    struct PrivateData
    {
        size_t m_hash;
        char m_pswd[100];
    };

    void doSmth(PrivateData& data)
    {
        std::string s(data.m_pswd);
        std::hash<std::string> hash_fn;
        data.m_hash = hash_fn(s);
    }

    int funcPswd()
    {
        PrivateData* data = (PrivateData*)malloc(sizeof(PrivateData));
        std::cin >> data->m_pswd;
        doSmth(*data);
        memset(data, 0, sizeof(PrivateData));
        free(data);
        return 1;
    }

    int main()
    {
        funcPswd();
        return 0;
    }

We will check the release version, since in the debug version all the calls are in place. After compiling in Visual Studio 2015, we get the following assembly code:

    ....
    000000013FBB1021  mov   rcx,qword ptr [__imp_std::cin (013FBB30D8h)]
    000000013FBB1028  mov   rbx,rax
    000000013FBB102B  lea   rdx,[rax+8]
    000000013FBB102F  call  std::operator>><char,std::char_traits<char> > (013FBB18B0h)
    000000013FBB1034  mov   rcx,rbx
    000000013FBB1037  call  doSmth (013FBB1170h)
    000000013FBB103C  xor   edx,edx
    000000013FBB103E  mov   rcx,rbx
    000000013FBB1041  lea   r8d,[rdx+70h]
    000000013FBB1045  call  memset (013FBB2A2Eh)
    000000013FBB104A  mov   rcx,rbx
    000000013FBB104D  call  qword ptr [__imp_free (013FBB3170h)]
    return 0;
    000000013FBB1053  xor   eax,eax
    ....

As you can see, Visual Studio is fine in this case: our data cleaning works. But let's see what other compilers do. Let's try gcc 5.2.1 and clang 3.7.0.

For gcc and clang, I modified the source code slightly: I added code that prints the contents of the allocated memory before and after it is cleared. Note that the second printout reads through the pointer after the memory has been freed. This must not be done in real programs, because the behavior in that case is undefined. For the sake of the experiment, however, I allowed myself this liberty.

    ....
    #include "string.h"
    ....
    size_t len = strlen(data->m_pswd);
    for (int i = 0; i < len; ++i)
        printf("%c", data->m_pswd[i]);
    printf("| %zu \n", data->m_hash);
    memset(data, 0, sizeof(PrivateData));
    free(data);
    for (int i = 0; i < len; ++i)
        printf("%c", data->m_pswd[i]);
    printf("| %zu \n", data->m_hash);
    ....

Here is the assembly fragment generated by gcc:

    movq  (%r12), %rsi
    movl  $.LC2, %edi
    xorl  %eax, %eax
    call  printf
    movq  %r12, %rdi
    call  free

Immediately after the printout (printf) we see the call to free(), while the call to memset() has been removed. If we run the code and enter an arbitrary password (for example, “MyTopSecret”), we get the following output:

MyTopSecret | 7882334103340833743

MyTopSecret | 0

The hash has changed; apparently this is a side effect of the memory manager. But our secret password, "MyTopSecret", remained in memory intact.

Now let's check clang:

    movq   (%r14), %rsi
    movl   $.L.str.1, %edi
    xorl   %eax, %eax
    callq  printf
    movq   %r14, %rdi
    callq  free

We see a similar picture: the memset() call has been removed. The output looks the same:

MyTopSecret | 7882334103340833743

MyTopSecret | 0

In this case both gcc and clang decided to optimize the code. Since the memory is freed right after the memset() call, the compilers consider the call unnecessary and remove it.

As it turns out, optimizing compilers remove the memset() call both for the application's stack and for dynamic memory.

Finally, let's check how the compilers behave when memory is allocated with new.

Once again, we modify the code:

    #include <string>
    #include <functional>
    #include <iostream>
    #include "string.h"

    struct PrivateData
    {
        size_t m_hash;
        char m_pswd[100];
    };

    void doSmth(PrivateData& data)
    {
        std::string s(data.m_pswd);
        std::hash<std::string> hash_fn;
        data.m_hash = hash_fn(s);
    }

    int funcPswd()
    {
        PrivateData* data = new PrivateData();
        std::cin >> data->m_pswd;
        doSmth(*data);
        memset(data, 0, sizeof(PrivateData));
        delete data;
        return 1;
    }

    int main()
    {
        funcPswd();
        return 0;
    }

Visual Studio conscientiously clears the memory:

    000000013FEB1044  call  doSmth (013FEB1180h)
    000000013FEB1049  xor   edx,edx
    000000013FEB104B  mov   rcx,rbx
    000000013FEB104E  lea   r8d,[rdx+70h]
    000000013FEB1052  call  memset (013FEB2A3Eh)
    000000013FEB1057  mov   edx,70h
    000000013FEB105C  mov   rcx,rbx
    000000013FEB105F  call  operator delete (013FEB1BA8h)
    return 0;
    000000013FEB1064  xor   eax,eax

In this case gcc also decided to keep the code that clears the memory:

    call  printf
    movq  %r13, %rdi
    movq  %rbp, %rcx
    xorl  %eax, %eax
    andq  $-8, %rdi
    movq  $0, 0(%rbp)
    movq  $0, 104(%rbp)
    subq  %rdi, %rcx
    addl  $112, %ecx
    shrl  $3, %ecx
    rep stosq
    movq  %rbp, %rdi
    call  _ZdlPv

The output has changed accordingly; our data has been erased:

MyTopSecret | 7882334103340833743

| 0

But clang decided to optimize our code again and cut out the “unnecessary” function call:

    movq   (%r14), %rsi
    movl   $.L.str.1, %edi
    xorl   %eax, %eax
    callq  printf
    movq   %r14, %rdi
    callq  _ZdlPv

Let's print the contents of the memory:

MyTopSecret | 7882334103340833743

MyTopSecret | 0

The password remained in memory, waiting to be stolen.

Let's sum up. Our experiment showed that an optimizing compiler can remove the memset() call for any kind of memory, both stack and dynamic. Although Visual Studio did not remove the memset() calls for dynamic memory, you cannot count on this: the effect may well show up with other compilation flags. It follows from our small study that you cannot rely on memset() to clear private data.

How to clear private data correctly?

You should use specialized memory-clearing functions that the compiler cannot remove during optimization.

In Visual Studio, for example, you can use RtlSecureZeroMemory. Starting with C11, there is the memset_s function, although it belongs to the optional Annex K, so not every standard library provides it. If necessary, you can write your own safe function; there are many examples on the Internet. Here are a couple of options.

Option 1.

    errno_t memset_s(void *v, rsize_t smax, int c, rsize_t n)
    {
        if (v == NULL) return EINVAL;
        if (smax > RSIZE_MAX) return EINVAL;
        if (n > smax) return EINVAL;
        volatile unsigned char *p = v;
        while (smax-- && n--) {
            *p++ = c;
        }
        return 0;
    }

Option 2.

    void secure_zero(void *s, size_t n)
    {
        volatile char *p = s;
        while (n--) *p++ = 0;
    }
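As an illustration of how the second option might be wired into the password example, here is a hedged C++ sketch. The ScopedWipe class and the usePassword function are hypothetical names introduced only for this example. The password is passed in as a string instead of being read from std::cin, and secure_zero gains an explicit cast because C++ does not convert void* implicitly. The guard wipes the structure in its destructor, so the cleanup runs on every exit path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The second option above, adapted to C++ (explicit cast from void*).
inline void secure_zero(void* s, std::size_t n)
{
    volatile char* p = static_cast<volatile char*>(s);
    while (n--) *p++ = 0;
}

struct PrivateData
{
    std::size_t m_hash;
    char m_pswd[100];
};

// Hypothetical RAII guard: wipes the referenced structure on scope exit.
class ScopedWipe
{
public:
    explicit ScopedWipe(PrivateData& d) : m_data(d) {}
    ~ScopedWipe() { secure_zero(&m_data, sizeof(PrivateData)); }
    ScopedWipe(const ScopedWipe&) = delete;
    ScopedWipe& operator=(const ScopedWipe&) = delete;
private:
    PrivateData& m_data;
};

// Hypothetical stand-in for funcPswd(): copies the password in, uses it,
// and relies on the guard to clear it before returning.
std::size_t usePassword(PrivateData& data, const char* input)
{
    assert(input != nullptr);
    ScopedWipe guard(data);
    std::strncpy(data.m_pswd, input, sizeof(data.m_pswd) - 1);
    data.m_pswd[sizeof(data.m_pswd) - 1] = '\0';
    return std::strlen(data.m_pswd);   // evaluated before the guard runs
}
```

Tying the wipe to a destructor is a design choice rather than a requirement; a plain call to secure_zero before each return would work as well, but it is easier to forget on an early exit.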

Some go further and write functions that fill the array with pseudo-random values and take a varying amount of time to run, in order to hinder timing-based attacks. Implementations of these can also be found on the Internet.
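Where the platform already offers a wipe that the optimizer is not supposed to remove, a thin wrapper can select it at compile time. The sketch below is one possible arrangement, not a definitive implementation: the wipeMemory name and the WIPE_USE_EXPLICIT_BZERO macro are made up for this example. It uses RtlSecureZeroMemory on Windows and explicit_bzero on glibc 2.25 or later, and falls back to writes through a volatile pointer everywhere else.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32)
#  include <windows.h>   // RtlSecureZeroMemory
#elif defined(__GLIBC__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 25))
#  include <string.h>    // explicit_bzero (available since glibc 2.25)
#  define WIPE_USE_EXPLICIT_BZERO 1
#endif

// Hypothetical helper: zeroes n bytes at p using the best primitive
// this sketch knows about for the current platform.
inline void wipeMemory(void* p, std::size_t n)
{
    assert(p != nullptr || n == 0);
    if (n == 0)
        return;
#if defined(_WIN32)
    RtlSecureZeroMemory(p, n);
#elif defined(WIPE_USE_EXPLICIT_BZERO)
    explicit_bzero(p, n);
#else
    // Fallback: stores through a volatile pointer.
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    while (n--) *bytes++ = 0;
#endif
}
```

A call such as wipeMemory(&data, sizeof(PrivateData)) would then replace the memset() call in the examples above.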

Conclusion


The PVS-Studio static analyzer can find such errors; it reports them with the V597 diagnostic. This article was written as an extended explanation of why this diagnostic matters. Unfortunately, many programmers believe that the analyzer is nitpicking their code and that there is no real problem. After all, the programmer sees the memset() call in the debugger, forgetting that this is a debug build.


If you want to share this article with an English-speaking audience, then please use the link to the translation: Roman Fomichev. Safe Clearing of Private Data .

Source: https://habr.com/ru/post/281072/

