
Without knowing the ford, do not go into the water. Part two

Horrible printf
This time I want to talk about the printf function. Everyone has heard about software vulnerabilities and that functions like printf are frowned upon. But it is one thing to know that these functions are best avoided, and quite another to understand why. In this article I will describe two classic vulnerabilities related to printf. You will not become a hacker after reading it, but you may take a fresh look at your code. You might be creating similar vulnerabilities yourself without even knowing it.

STOP. Reader, wait, do not pass by. I know you saw the word printf and are sure the author is about to tell the banal story that the function does not check the types of the arguments passed to it. No! The article is not about that, but about vulnerabilities. Read on.

The previous post is here: Part One.

Introduction


Let's take a look at this line:
  printf(name);

It seems simple and harmless. Meanwhile, it hides at least two ways to attack the program.
We start the article with a demo that contains this line. The code may look strange to you, and it is. It turned out that writing a program in order to attack it later is not that easy. The reason is compiler optimization. If you write too simple a program, the compiler generates code in which there is nothing to break: it keeps data in registers rather than on the stack, inlines functions, and so on. You can write code with extra actions and loops so that the compiler runs out of free registers and starts putting data on the stack, but such an example turns out too large and confusing. A whole detective story could be written about all this, but we will not do that.

The example presented here is a compromise between complexity and the need to keep the compiler from collapsing overly simple code into nothing. I admit I still helped myself a little by disabling some optimizations in Visual Studio 2010. First, I turned off the /GL (Whole Program Optimization) switch. Second, I used the __declspec(noinline) attribute.

I apologize for such a long introduction. I wanted to explain why the code is clumsy and to head off any discussion about how it could be written better. I know it could. But I could not make the code both short and suitable for demonstrating the vulnerability.

Demo


The full code and the Visual Studio 2010 project are available here.
  const size_t MAX_NAME_LEN = 60;
  enum ErrorStatus {
    E_ToShortName, E_ToShortPass, E_BigName, E_OK
  };

  void PrintNormalizedName(const char *raw_name)
  {
    // The copy fits only because the caller has checked the name length.
    char name[MAX_NAME_LEN + 1];
    strcpy(name, raw_name);

    for (size_t i = 0; name[i] != '\0'; ++i)
      name[i] = tolower(name[i]);
    name[0] = toupper(name[0]);

    printf(name);  // the line under discussion: user input is the format string
  }

  ErrorStatus IsCorrectPassword(
    const char *universalPassword,
    BOOL &retIsOkPass)
  {
    string name, password;
    printf("Name: "); cin >> name;
    printf("Password: "); cin >> password;
    if (name.length() < 1) return E_ToShortName;
    if (name.length() > MAX_NAME_LEN) return E_BigName;
    if (password.length() < 1) return E_ToShortPass;

    retIsOkPass =
      universalPassword != NULL &&
      strcmp(password.c_str(), universalPassword) == 0;
    if (!retIsOkPass)
      retIsOkPass = name[0] == password[0];

    printf("Hello, ");
    PrintNormalizedName(name.c_str());

    return E_OK;
  }

  int _tmain(int, char *[])
  {
    _set_printf_count_output(1);  // enables "%n", needed for the second attack
    char universal[] = "_Universal_Pass_!";
    BOOL isOkPassword = FALSE;
    ErrorStatus status =
      IsCorrectPassword(universal, isOkPassword);
    if (status == E_OK && isOkPassword)
      printf("\nPassword: OK\n");
    else
      printf("\nPassword: ERROR\n");
    return 0;
  }

The _tmain() function calls IsCorrectPassword(). If the password is correct or matches the magic word "_Universal_Pass_!", the program prints the line "Password: OK". The goal of the attacks is to make the program print exactly this line.

The IsCorrectPassword() function asks the user for a name and a password. The password is considered correct if it matches the magic word passed to the function. It is also considered correct if its first letter matches the first letter of the name.

Whether or not the password is correct, the program greets the user by calling the PrintNormalizedName() function.

The PrintNormalizedName() function is the most interesting part. It contains the "printf(name);" line under discussion. Think about how this line can be used to trick the program. If you already know how, you need not read further.

What does PrintNormalizedName() do? It prints the name with the first letter capitalized and the rest in lowercase. For example, if you enter the name "andREy2008", it prints "Andrey2008".

First attack


Suppose we do not know the correct password, but we know that a magic password exists somewhere. Let's try to find it using printf(). If the address of this password is somewhere on the stack, we have a chance of success. Any idea how to get this password onto the screen?

Here is a hint. printf() belongs to the family of functions with a variable number of arguments. Such functions work as follows. An arbitrary amount of data is pushed onto the stack. printf() does not know how much data was pushed or what types it has; it is guided solely by the format string. If the string says "%d%s", one value of type int and one pointer are to be fetched from the stack. Since printf() does not know how many arguments were actually passed, it can reach deeper into the stack and print data that has nothing to do with it. Usually this leads to an access violation or to garbage being printed. However, this garbage can be put to use.
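To make the mechanism concrete, here is a minimal sketch of a variadic function. MiniFormat is a hypothetical helper invented for this illustration, not part of the demo. It understands only "%d" and "%s" and, like printf(), learns how many arguments to fetch and of what type solely from the format string.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical mini-formatter for illustration only.
// It decides what to read from the argument list by looking
// at the format string and nothing else.
std::string MiniFormat(const char *format, ...)
{
  assert(format != nullptr);
  std::string out;
  va_list args;
  va_start(args, format);
  for (const char *p = format; *p != '\0'; ++p) {
    if (p[0] == '%' && p[1] == 'd') {
      out += std::to_string(va_arg(args, int));  // trusts that an int was passed
      ++p;
    } else if (p[0] == '%' && p[1] == 's') {
      out += va_arg(args, const char *);         // trusts that a string was passed
      ++p;
    } else {
      out += *p;                                 // ordinary character
    }
  }
  va_end(args);
  return out;
}
```

A call such as MiniFormat("%d%s", 37, "abc") yields "37abc". A call such as MiniFormat("%d%s") with no further arguments compiles just as well, but the function then reads whatever happens to lie where the arguments should have been. That is undefined behavior, and it is exactly the situation the attack exploits.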

Consider what the stack might look like when we call the printf() function:
Figure 1. Schematic data layout on the stack.

The call "printf(name);" has only one argument, the format string. This means that if we enter "%d" instead of a name, the program prints the data that lies on the stack before the return address into PrintNormalizedName(). Let's try:

Name: %d
Password: 1
Hello, 37
Password: ERROR

So far this is of little use. At the very least, we first have to print the return addresses and the entire contents of the buffer char name[MAX_NAME_LEN + 1], which is also located on the stack. Only then might we get to something interesting.

If an attacker cannot disassemble or debug the program, it is hard for them to know whether anything useful is on the stack. However, they can proceed as follows.

First enter "%s". Then enter "%x%s". Then "%x%x%s", and so on. This way the attacker walks through the data on the stack one item at a time and tries to print each as a string. It helps that all data on the stack is aligned on at least a 4-byte boundary.

To be honest, this approach fails here: we exceed the 60-character limit without printing anything useful. "%f" comes to the rescue. It is meant for printing values of type double, so with it we can move along the stack 8 bytes at a time.
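The arithmetic behind this can be sketched as follows. MakeProbe and MaxSkippedBytes are hypothetical helper names introduced only for this illustration. The sizes of 4 bytes per "%x" and 8 bytes per "%f" assume the 32-bit build described in the article.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a probe such as "%x%x%s": `skips` copies of `skipSpec`
// followed by "%s".
std::string MakeProbe(std::size_t skips, const std::string &skipSpec)
{
  std::string probe;
  for (std::size_t i = 0; i < skips; ++i)
    probe += skipSpec;
  return probe + "%s";
}

// How many stack bytes can be skipped before "%s" if the whole probe
// must fit into maxLen characters and each skip specifier consumes
// bytesPerSkip bytes of the stack.
std::size_t MaxSkippedBytes(std::size_t maxLen,
                            const std::string &skipSpec,
                            std::size_t bytesPerSkip)
{
  assert(maxLen >= 2 && !skipSpec.empty());
  const std::size_t skips = (maxLen - 2) / skipSpec.size();  // 2 == strlen("%s")
  return skips * bytesPerSkip;
}
```

Under these assumptions a 60-character name can skip at most 29 specifiers: 116 bytes with "%x" but 232 bytes with "%f". That is why switching to "%f" brings deeper stack data within reach.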

And here it is, the long-awaited line:

% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% x (% s)

Result:
Figure 2. Password printout.

Let's try the string we found as the magic password:

Name: Aaa
Password: _Universal_Pass_!
Hello, Aaa
Password: OK

Hooray! We managed to find and print private data that the program was never meant to give us access to. Note also that this did not require access to the program's binary code. Diligence and persistence were enough.

Conclusions on the first attack


This way of obtaining private data should be viewed more broadly. When developing software that contains functions with a variable number of arguments, consider whether there are situations in which data can leak through them to the outside world. The channel can be a log file, a packet sent over the network, and so on.

In our case the attack is possible because printf() receives as its format a string that may contain control specifiers. To avoid this, it was enough to write:
  printf("%s", name);
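The effect of the fix can be checked in isolation. In the sketch below, FormatGreeting is a hypothetical function written for this illustration. It formats into a buffer with snprintf so that the result can be inspected, whereas the demo prints directly.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: the format string is a literal, and the
// user-supplied text is passed only as an argument to "%s".
std::string FormatGreeting(const std::string &name)
{
  char buffer[128];
  const int written =
    std::snprintf(buffer, sizeof(buffer), "Hello, %s", name.c_str());
  assert(written >= 0);  // a negative value would signal an encoding error
  return buffer;         // snprintf always null-terminates a non-empty buffer
}
```

With this version, a name such as "%x%s" comes out as the literal text "Hello, %x%s". The specifiers inside the name are never interpreted.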


Second attack


Did you know that printf() can modify memory? Most likely you have read about it but forgotten. This is done with the "%n" specifier. It writes the number of characters printf() has printed so far to the specified address.

To be honest, an attack based on the "%n" specifier is of purely historical interest. Starting with Visual Studio 2005, "%n" is disabled by default. To carry out this attack I had to enable it explicitly. Here is the magic call:
  _set_printf_count_output(1);

To make it clearer, here is an example of using "%n":
  int i;
  printf("12345%n6789\n", &i);
  printf("i = %d\n", i);

The output of the program:

123456789
i = 5

We have already learned how to reach the required pointer on the stack. Now we have a tool that lets us modify memory through that pointer.

Of course, it is not very convenient to use. For one thing, we can only write 4 bytes at a time (the size of the int type). If we need a large number, printf() first has to output a lot of characters. The "%00u" specifier can help to avoid this: it affects the current count of output bytes. We will not go deeper into these subtleties.
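How a specifier can inflate the character count is easy to see without "%n". The sketch below assumes that the trick meant here is a field width, and uses the return value of snprintf as a stand-in for the count that "%n" would store. PaddedLength is a hypothetical name introduced for this illustration.

```cpp
#include <cassert>
#include <cstdio>

// Returns how many characters printf-style formatting would emit
// for one unsigned value zero-padded to at least `width` characters.
int PaddedLength(int width, unsigned value)
{
  assert(width >= 0);
  // With a null buffer and size 0, snprintf writes nothing and only
  // reports the length the output would have (C99 / C++11 behavior).
  return std::snprintf(nullptr, 0, "%0*u", width, value);
}
```

A few characters of format text are thus enough to advance the count by an arbitrary amount. PaddedLength(100, 1) reports 100 characters, although the value itself has a single digit.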

In our case everything is simpler. It is enough to write any nonzero value to the isOkPassword variable. The address of this variable is passed to IsCorrectPassword(), which means it is somewhere on the stack. Do not be confused by the fact that the variable is passed by reference: at the low level, a reference is an ordinary pointer.

Here is the line that lets us modify the isOkPassword variable:

% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% f% n

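Strictly speaking, "%n" stores the total number of characters printed so far, including those produced by specifiers such as "%f", so the value written is larger than 1. That does not matter here: any nonzero value in isOkPassword is enough. The space before "%n" simply separates it from the preceding specifiers.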

Let's try:
Figure 3. Write to memory.

Impressive? But that is not all. We can write to almost any address. If the string being printed is on the stack, we can reach the desired characters and use them as an address.

For example, we can write a string containing consecutive characters with the codes '\xF8', '\x32', '\x01', '\x7F'. The string then contains a hard-coded number equal to 0x7F0132F8. At the end we put the "%n" specifier. Using "%x" or other specifiers, we can reach the encoded number 0x7F0132F8 and write the number of printed characters to that address. This method has limitations, but it is still quite curious.
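Why these four characters amount to the number 0x7F0132F8 follows from the little-endian byte order of x86: the first byte in memory is the least significant one. The following sketch spells this out. ReadLittleEndian32 is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Interprets the first four bytes of `bytes` as a 32-bit number in
// little-endian order, the way a 32-bit x86 program would read them
// from the stack.
std::uint32_t ReadLittleEndian32(const std::string &bytes)
{
  assert(bytes.size() >= 4);
  std::uint32_t value = 0;
  for (int i = 3; i >= 0; --i)  // start from the most significant byte
    value = (value << 8) | static_cast<unsigned char>(bytes[i]);
  return value;
}
```

For the string made of '\xF8', '\x32', '\x01', '\x7F' the function returns 0x7F0132F8. This is the value that printf() would take for a pointer once the specifiers in front of "%n" have walked up to these bytes.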

Conclusions on the second attack


One could say that an attack of the second kind is hardly possible today. As you have seen, the Visual C++ runtime now disables support for the "%n" specifier by default. However, you can build your own homemade mechanism that is prone to this kind of vulnerability. Be careful when data comes from outside, and control what is written to memory and where.

In this particular case, the problem does not arise if you write:
  printf("%s", name);


General conclusions


Only two simple examples of vulnerabilities are considered here. There are, of course, many more. This article does not attempt to describe or even list them; its aim was to show that even a construct as simple as printf(name) can be dangerous.

This leads to an important conclusion. If you are not a security expert, it is best to follow all the recommendations you come across. Their rationale can be too subtle for you to assess the whole range of threats on your own. You have surely read that printf() is a dangerous function, but I am sure that many readers have only now learned how deep the rabbit hole goes.

If you are writing an application that could become a target of attack, be as careful as possible. Code that looks completely innocent to you may contain a vulnerability. If you do not see a catch in the code, that does not mean there is none.

Follow all compiler recommendations to use the updated versions of string functions, that is, sprintf_s instead of sprintf and so on.

Better still, give up low-level string handling altogether. These functions are a legacy of the C language. Now there is std::string, and there are safe ways to build strings, such as boost::format or std::stringstream.
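As an illustration of this advice, here is one way the greeting from the demo might look with std::string and a string stream. This is a sketch, not the author's code: NormalizeName and MakeGreeting are hypothetical names, and the result is returned as a string rather than printed.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Lowercases the name and capitalizes its first letter.
// No fixed-size buffer is involved, so there is no length limit to check.
std::string NormalizeName(std::string name)
{
  for (char &c : name)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  if (!name.empty())
    name[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(name[0])));
  return name;
}

// Builds the greeting with a stream: the name is inserted as data,
// and no format string is interpreted anywhere.
std::string MakeGreeting(const std::string &rawName)
{
  std::ostringstream out;
  out << "Hello, " << NormalizeName(rawName);
  assert(out.good());  // sanity check: formatting into memory should not fail
  return out.str();
}
```

Here the input "andREy2008" gives "Hello, Andrey2008", and an input such as "%f%n" is simply echoed back as text.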

P.S. Some readers, having read the conclusions, will say "that was obvious anyway". But be honest: before reading this article, did you know and remember that printf() can write to memory? And that is a big vulnerability. At least, it used to be. Now there are others, no less insidious.

Source: https://habr.com/ru/post/137411/

