Introduction to reverse engineering with Radare2

Radare2 is a framework for analyzing binary files that includes a large number of utilities. It was originally developed as a hex editor for searching for and recovering data, then gained more functionality and has grown into a powerful analysis framework. In this article I will show how to use Radare2 to analyze the logic of a program, and describe the main elements of assembly language needed for reverse engineering.

Radare2 is a set of several utilities:

radare2 (r2) - A hex editor, disassembler and debugger with an advanced command-line interface. It can work with various input/output sources, such as disks, remote devices and debugged processes, and treat them as if they were plain files.
rabin2 - Used to get information about executable binary files.
rasm2 - An assembler and disassembler: converts assembly mnemonics to machine code and vice versa. Supports a large number of architectures.
rahash2 - A utility for calculating checksums. It supports many algorithms and can compute the checksum of a whole file, part of a file or an arbitrary string.
radiff2 - A utility for comparing binary files. It supports many algorithms and can compare blocks of code in executable files.
rafind2 - A utility for searching for byte sequences.
ragg2 - A utility for compiling small programs.
rarun2 - A utility that can run the analyzed program with different environment settings.
rax2 - A small calculator that allows you to perform simple calculations in various number systems.
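
To give a feel for the kind of conversion a calculator like rax2 is used for, here is a rough Python sketch that writes one integer in several number systems. It is only an illustration in Python and does not reproduce rax2's output format; the helper name show_bases is made up for this example.

```python
def show_bases(value: int) -> dict:
    """Return the same integer written in several number systems."""
    return {
        "dec": str(value),
        "hex": hex(value),
        "oct": oct(value),
        "bin": bin(value),
    }


if __name__ == "__main__":
    # 0x77 is the ASCII code of the letter "w"
    for base, text in show_bases(0x77).items():
        print(f"{base}: {text}")
    print("chr:", chr(0x77))
```

Conversions like this come up constantly during analysis, for example when a byte from a memory dump has to be read as a character.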

The main drawback holding back wider adoption of the framework is the lack of a good GUI. Third-party implementations exist, but unfortunately they are not very convenient. It is also worth noting that a built-in web interface is available.

Radare2 is most often used as a reverse engineering tool, as an advanced disassembler. We will treat Radare2 purely as a disassembler and analyze a simple crackme.

Introduction to the assembler

Before starting the analysis of the program, it is worth covering the main points needed to understand assembly code. A description of the basic assembly instructions deserves a separate article, so only the main groups of instructions are listed here.

Copy instructions (mov, movsx, movzx)
Logical operations instructions (and, or, xor, test)
Instructions for arithmetic operations (add, sub)
Instructions for managing the sequence of program execution (jmp, jne, ret)
Interrupt instructions (int)
I/O instructions (in, out)

By default, Radare2 uses Intel syntax. Basic instructions can have one or two operands. With two operands, the notation takes the following form:

   instruction operand1, operand2 ; comment

Many instructions, such as and, sub, add, save the result of the calculation in the first operand.
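
The "result goes into the first operand" rule can be illustrated with a toy model in Python. This is a minimal sketch, not an emulator: the execute helper and the register dictionary are invented for the example, only a handful of instructions are handled, and flags are ignored.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a 32-bit register


def execute(regs: dict, mnemonic: str, dst: str, src) -> None:
    """Apply a two-operand instruction; the result lands in dst (the first operand)."""
    value = regs[src] if isinstance(src, str) else src  # register name or immediate
    if mnemonic == "mov":
        result = value
    elif mnemonic == "add":
        result = regs[dst] + value
    elif mnemonic == "sub":
        result = regs[dst] - value
    elif mnemonic == "and":
        result = regs[dst] & value
    elif mnemonic == "xor":
        result = regs[dst] ^ value
    else:
        raise ValueError(f"unsupported instruction: {mnemonic}")
    regs[dst] = result & MASK32


regs = {"eax": 0, "edx": 0}
execute(regs, "mov", "eax", 5)      # mov eax, 5
execute(regs, "mov", "edx", 3)      # mov edx, 3
execute(regs, "add", "eax", "edx")  # add eax, edx ; eax = eax + edx, edx unchanged
execute(regs, "sub", "eax", 10)     # sub eax, 10  ; wraps around within 32 bits
print({name: hex(value) for name, value in regs.items()})
```

Note that the second operand is never modified; only the first one receives the result.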

Most x86 instructions cannot take both operands from memory. One or both values therefore have to be placed in registers first, and the registers are then used as operands. This brings us to registers.

Registers are very fast storage cells located inside the processor. They are much faster than RAM or cache, but they hold very little data. Processors of the x86 (x86-32) architecture have 8 general-purpose registers of 32 bits each. Processors of the amd64 (x86-64) architecture have 16 general-purpose registers of 64 bits each. More detailed information is presented in the table below.
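
One detail that matters later in the analysis is that smaller register names refer to parts of a larger register: on x86-64, eax is the low 32 bits of rax, ax the low 16 bits, and al and ah the two lowest bytes (on 32-bit x86 the same holds for eax, ax, ah and al). The following Python sketch shows this relationship with bit masks; the function name sub_registers and the sample value are arbitrary choices for the illustration.

```python
def sub_registers(rax: int) -> dict:
    """Show which bits of a 64-bit rax value each narrower register name refers to."""
    rax &= 0xFFFFFFFFFFFFFFFF
    return {
        "rax": rax,                # all 64 bits
        "eax": rax & 0xFFFFFFFF,   # low 32 bits
        "ax": rax & 0xFFFF,        # low 16 bits
        "ah": (rax >> 8) & 0xFF,   # bits 8..15
        "al": rax & 0xFF,          # low 8 bits
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```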

Explore crackme

Let us walk through the analysis of an executable file using a very simple crackme as an example, taken from https://github.com/geyslan/crackmes . Run the program and observe its behavior. It immediately prompts for a password; try entering 123456.

The guess was wrong: the program asks us to try again and exits. To start the analysis, launch radare2 with the command “r2 -A crackme”. The -A argument makes radare2 analyze the functions right away, which is equivalent to running an analysis command such as aa by hand after opening the file (current releases run aaa). Using the izz command, we display the text strings contained in the program.

Here we see several strings, two of which we already met when running the program. There is also a string that is presumably displayed when the correct password is entered. This string is stored at 0x08048888; remember this address.
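
Conceptually, a string listing like this comes from scanning the file for runs of printable characters. The Python sketch below shows the idea in its simplest form. It is not radare2's implementation; the function name find_strings, the minimum length of 4 and the sample bytes are all assumptions made for the example and are not taken from the crackme.

```python
import re


def find_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Made-up sample data: two strings surrounded by non-printable bytes
sample = b"\x7fELF\x01\x00\x00Password: \x00\x90\x90Try again!\x00\x01\x02"

for offset, text in find_strings(sample):
    print(f"{offset:#06x}  {text!r}")
```

Short runs such as "ELF" are skipped here because they fall below the minimum length, which is how such tools keep random byte sequences out of the listing.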

Run the afl command to get the list of functions.

Here we see, in addition to library functions, the function entry0, which as the name implies is the entry point of the program, and the function main. The main function is where execution of a program written in C/C++ starts. The names of the remaining functions say little about their role in the program.

Let's look at the code of the main function by running pdf @ main. Here we see several function calls. The first is a call to fwrite, which prints the prompt string. The second, fgets, reads from the input device and places the entered data in memory. Next come calls to two functions of unknown purpose, followed by two more calls to fwrite. We are interested in the section of code that accesses the address of the string we noted earlier.

Here we see that the string is printed if the conditional jump “jne 0x804875e” is not taken. For that, the eax register must be 0 at the moment “test eax, eax” executes. We can assume that the function fcn.08048675, called just before, checks the password and writes 0 to eax if the password is correct. Therefore, if the conditional jump is removed, the program will treat any entered password as correct. This can be done in several ways: force eax to 0 before the check, change the jump target, or simply remove the jump by replacing it with nop opcodes.
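
The interaction between test and jne can be modeled in a few lines of Python. This is a simplified sketch that tracks only the zero flag (ZF); the function names are invented for the illustration.

```python
def zf_after_test(eax: int) -> bool:
    """test eax, eax computes eax AND eax and sets ZF when the result is zero."""
    return ((eax & eax) & 0xFFFFFFFF) == 0


def jne_taken(zf: bool) -> bool:
    """jne (jump if not equal / not zero) is taken when ZF is clear."""
    return not zf


def success_branch_reached(check_result: int) -> bool:
    """The success string is reached only when the jne is NOT taken."""
    return not jne_taken(zf_after_test(check_result))


print(success_branch_reached(0))           # the check returned 0
print(success_branch_reached(0xFFFFFFFF))  # the check returned -1
```

With the jump removed, the result of the check no longer matters and execution always falls through to the success branch.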

We will try the last option. To do this, reopen the file in write mode with the oo+ command. Then seek to address 0x08048735 and execute the command "wa nop; nop". As a result, the conditional jump is replaced with two nop opcodes.
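
At the byte level, the patch amounts to overwriting a two-byte instruction with two one-byte nop instructions. The Python sketch below reproduces this on a small in-memory buffer rather than on the real file. It assumes the jne uses the two-byte short encoding (opcode 0x75 followed by a one-byte displacement), which fits the fact that exactly two nops replace it; the helper names are made up for the example.

```python
NOP = 0x90        # one-byte x86 nop
JNE_SHORT = 0x75  # opcode of the short form of jne (assumed encoding)


def short_jump_rel8(insn_addr: int, target: int) -> int:
    """Displacement byte of a 2-byte short jump located at insn_addr."""
    rel = target - (insn_addr + 2)  # relative to the address of the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return rel & 0xFF


def nop_out(code: bytearray, offset: int, length: int) -> None:
    """Overwrite length bytes starting at offset with nop."""
    code[offset:offset + length] = bytes([NOP]) * length


rel8 = short_jump_rel8(0x08048735, 0x0804875E)
# test eax, eax (85 c0) followed by the short jne
code = bytearray([0x85, 0xC0, JNE_SHORT, rel8])
print("before:", code.hex(" "))
nop_out(code, 2, 2)
print("after: ", code.hex(" "))
```

Using as many nop bytes as the original instruction occupied keeps every following instruction at its original address.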

Run the program and try to enter the password.

Great, we have successfully patched the program. With a more complex program, such a fix may not work correctly, and the program may end up behaving in unexpected ways. The harder route is to find out the correct password, which requires analyzing the functions fcn.08048675 and fcn.08048642. Let's start with fcn.08048642 by running pdf @ fcn.08048642.

After analyzing the code, we see that the function takes two arguments, although one of them is not used. The function body is a loop with a counter. The instruction mov dword [local_4h], 0 initializes the counter to 0. Next, an unconditional jump to address 0x0804866d is performed, where the counter is compared with 5. If the counter does not exceed 5, execution jumps to address 0x08048651. There, the counter value is written to the edx register, and then the value of the second argument, most likely a pointer to the string we entered, is written to the eax register. These two registers are then added, giving the address of the byte at the counter's offset from the start of our string.

The result of the addition is stored in the edx register. The same calculation is then repeated, only with the result stored in eax. On the next line, the movzx instruction loads the byte at the address held in eax into eax with zero extension, so the byte ends up in the low part of the register, al. An exclusive OR (xor) is then performed between the byte in eax and 0x6c, and the result is written to the address stored in edx. Then 1 is added to the counter. If the counter still does not exceed 5, the loop repeats.

Once the counter exceeds 5, the loop is exited and the function returns. Thus the string we entered is modified: each processed character is changed. Since the counter runs from 0 to 5, we conclude that the password consists of 6 bytes.
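
Put together, the loop described above behaves roughly like the following Python function. This is a reconstruction from the prose, not decompiler output. It assumes the counter covers the values 0 through 5, which is what a six-byte password implies, and the name transform is invented.

```python
XOR_KEY = 0x6C


def transform(buf: bytearray) -> None:
    """XOR the first six bytes of buf in place, mirroring the loop in fcn.08048642."""
    counter = 0                    # mov dword [local_4h], 0
    while counter <= 5:            # comparison with 5 at 0x0804866d
        buf[counter] ^= XOR_KEY    # read a byte, xor with 0x6c, write it back
        counter += 1               # add 1 to the counter


entered = bytearray(b"123456\n")   # fgets keeps the trailing newline
transform(entered)
print(entered.hex(" "))
```

Because XOR is its own inverse, applying the same function a second time restores the original bytes, which is what makes the password recoverable.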

Next, the function fcn.08048675 is called. It takes two arguments: the address of the transformed password we entered and the address 0x8049b60. Let us call them string 1 and string 2, and the addresses of the characters inside them pointer 1 and pointer 2 respectively. The function consists of a loop in which several checks are made. At the beginning of each iteration, pointer 1 is loaded into eax, and then the value it points to is loaded into edx. The same is repeated for string 2, only the value is loaded into eax. Then the low bytes of these registers are compared.

If the values differ, the loop is exited; otherwise execution jumps to address 0x0804867a, where the bytes referenced by both pointers are checked for zero. If both bytes are non-zero, the pointers are incremented by 1. If the bytes are not equal or one of them is 0, the code at address 0x080486b0 is executed, which checks the value referenced by pointer 2. If that value is 0, then 0 is written to the eax register; otherwise 0xffffffff, that is, -1. The function then returns.

As you can see, this function simply compares two strings and returns 0 if they are the same and -1 otherwise. We can also conclude that the correct password is stored at 0x8049b60. As we learned earlier, it is 6 bytes long, so let us read it.
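
The comparison can be sketched in Python as follows. This follows the prose description of fcn.08048675 rather than the actual disassembly, so details may differ; the function name compare is invented, and both arguments are assumed to be NUL-terminated byte strings.

```python
def compare(s1: bytes, s2: bytes) -> int:
    """Return 0 if s2 is matched up to its terminating NUL, otherwise -1."""
    i = 0
    # Advance while both current bytes are non-zero and equal
    while s1[i] != 0 and s2[i] != 0 and s1[i] == s2[i]:
        i += 1
    # As described for 0x080486b0: only the byte at pointer 2 is examined
    return 0 if s2[i] == 0 else -1


print(compare(b"abc\x00", b"abc\x00"))
print(compare(b"abd\x00", b"abc\x00"))
```

One consequence of checking only pointer 2 at the end is that string 1 may contain extra bytes after the match, which would be consistent with fgets leaving the trailing newline in the input buffer.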

Let's try to reverse the transformation for the first character by running the command "? 0x1b ^ 0x6c", which gives the first character, "w".

Repeating this for the remaining bytes gives the string whyn0t. To check it, replace the patched version with the original one and enter this password.
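
The same inverse transformation can be applied to all six bytes at once. In the Python sketch below, the stored bytes are not read from the binary: they are derived by XOR-ing the known result whyn0t with 0x6c, so only the first byte, 0x1b, is directly confirmed by the text above. The function name decode is invented.

```python
XOR_KEY = 0x6C


def decode(stored: bytes) -> str:
    """Undo the XOR transformation byte by byte."""
    return bytes(b ^ XOR_KEY for b in stored).decode("ascii")


# Derived from the known answer, not dumped from 0x8049b60
stored = bytes(c ^ XOR_KEY for c in b"whyn0t")
print(stored.hex(" "))
print(decode(stored))
```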

The password is correct; we have successfully solved this crackme.

Source: https://habr.com/ru/post/339264/
