Understanding C more deeply with the help of assembler

This article was inspired by another one, "Understanding C by studying assembler". A continuation of it never appeared, although the topic is interesting. Many people would like to write code and also understand how it works. So I am starting a series of articles on what C code turns into in assembler, analyzing the basic code constructs along the way.

The reader will need at least a basic knowledge of the following:

processor registers
stack
representation of numbers in the computer
assembler and C syntax

If you lack this knowledge but the topic interests you, all of it can be looked up quickly while reading. The article is not really aimed at beginners, but I have carefully spelled out many simple things so that newcomers have something to start from.

What will we use?

We need a C compiler that supports the modern standard. You can use the online compiler at ideone.com.
We also need a tool that shows the assembly generated from our code (loosely called a decompiler here); again, you can use the online one at godbolt.org.
You can also use an assembler, which is available on ideone at the link above.

Why is everything online? Because it avoids arguments caused by different versions and operating systems. There are many compilers and plenty of decompilers, and I would rather not account for the quirks of each one in the discussion.


For a more thorough study it is better to use offline tools: for example, a combination of a current gcc, OllyDbg and NASM. The differences should be minimal.

Simplest program

This article does not aim to repeat the one I mentioned at the very beginning. But we have to start from scratch, so some of the material will inevitably overlap. I hope for your understanding.

The first thing to learn is that the compiler, even at optimization level zero (-O0), can cut out code written by the programmer. So the following code:

int main(void) { 5 + 3; return 0; }

is no different from:

 int main(void) { return 0; }

Therefore we have to write code in such a way that, after decompilation, we still see it turned into something meaningful. This is why the examples may look strange, to say the least.

Second, we need compilation flags. Two are enough: -O0 and -m32. They select optimization level zero and 32-bit mode. The reason for disabling optimization should be obvious: we want to see a direct translation of our code into asm, not an optimized one. The reason for the mode should be obvious too: fewer registers means more attention to the essentials. I will, however, change these flags from time to time as we go deeper into the material.

Thus, if you are using gcc, the compilation may look like this:

gcc source.c -O0 -m32 -o source

Accordingly, if you are using godbolt, specify these flags in the input field next to the compiler selection. (I show the first examples with gcc 4.4.7 and switch to a later version further on.)

Now, you can see the first example:

 int main(void) {
     register int a = 1; // store 1 in a register variable
     return a;           // return the value of the register variable
 }

So, the following code corresponds to this:

 push ebp
 mov ebp, esp
 push ebx
 mov ebx, 1
 mov eax, ebx
 pop ebx
 pop ebp
 ret

The first two lines are the function prologue (strictly speaking three, but the third one I will explain right away); we will analyze the prologue in the article on functions. For now just ignore them, and the same goes for the last 3 lines. If you don't know asm, let's see what these instructions mean.

Assembly instructions have the form:

mnemonic dst, src

i.e.

instruction destination, source

A caveat is needed here: AT&T syntax uses the opposite operand order. We will come back to it later; for now we are interested in the NASM-like syntax.

Let's start with the mov instruction. It copies a value into a register or a memory location; the source can be a register, a memory location or a constant. In our case, it puts the number 1 into the ebx register.

A quick word about registers: the x86 architecture has eight 32-bit general-purpose registers, meaning registers that the programmer (in our case, the compiler) can use when writing programs. The compiler uses ebp, esp, esi and edi in special cases, which we will discuss later, and uses eax, ebx, ecx and edx for everything else.

Thus, mov ebx, 1 corresponds directly to the line register int a = 1;

It means that the value 1 has been placed in the ebx register.

And the line mov eax, ebx means that the value in the ebx register is copied to the eax register.

Two more lines remain: push ebx and pop ebx. If you are familiar with the concept of a stack, you can guess that the compiler first pushed ebx onto the stack, thereby saving the old value of the register, and at the end of the program restored that value from the stack into ebx.

Why does the compiler copy the value 1 from ebx into eax? Because of the C calling convention. It has several rules, and not all of them interest us right now. What matters is that the result is returned in eax when possible. This explains why the 1 ends up in eax.

Now the logical question: why was ebx needed at all? Why not write mov eax, 1 right away? It is all about the optimization level. As I said, the compiler should not cut our code, and we did not write return 1; we used a register variable. So the compiler first placed the value in a register and then, following the convention, returned the result. Change the optimization level to any other and you will see that ebx is indeed not needed.

By the way, if you use godbolt, you can hover over a line of C and the asm corresponding to it will be highlighted, provided the line has a colored background.

Stack

Let's make the example more complex and stop using register variables (you rarely use them anyway, right?). Let's see what this code turns into:

 int main(void) {
     int a = 1;     // store 1 in the variable a
     int b = a + 5; // add 5 to 'a' and store the result in 'b'
     return b;      // return the value of the variable b
 }

ASM:

 push ebp
 mov ebp, esp
 sub esp, 16
 mov DWORD PTR [ebp-8], 1
 mov eax, DWORD PTR [ebp-8]
 add eax, 5
 mov DWORD PTR [ebp-4], eax
 mov eax, DWORD PTR [ebp-4]
 leave
 ret

Again, let's skip the top 3 lines and the bottom 2. We now have a local variable, so memory for it is allocated on the stack. That is why we see the following magic: DWORD PTR [ebp-8]. What does it mean? DWORD PTR says that the operand is a double word. A word is 16 bits. The term became widespread in the era of 16-bit processors, when a register held exactly 16 bits, and that amount of information came to be called a word. So in our case a dword (double word) is 2 * 16 = 32 bits = 4 bytes (an ordinary int).

The ebp register holds the base address of the current function's stack frame (we will come back to this later). The value stored at that address must not be overwritten, so local variables are placed below it, starting 4 bytes down. In our case the variable a sits at an offset of 8 bytes, and if you look further down the code you will see that the variable b sits at an offset of 4 bytes. Square brackets denote a memory address. So this line works as follows: taking the address stored in ebp as a base, it writes the 4-byte value 1 at the address ebp-8. Why minus eight and not plus? Because positive offsets would correspond to the parameters passed to this function, but again, we will discuss this later.

The next line copies the value 1 into the eax register. I think it needs no detailed explanation.

Next comes a new instruction, add, which performs addition. Here 5 is added to the value in eax (1), so eax now holds 6.

After that, the value 6 must be stored in the variable b, which the next line does (b is on the stack at offset 4).

Finally, we need to return the value of the variable b, so we have to move that value into the eax register ( mov eax, DWORD PTR [ebp-4] ).

If everything so far is clear, we can move on to something more complex.

Interesting and not so obvious things.

What happens if we write the following: int var = 2.5;

I think each of you will correctly answer that var will hold the value 2. But what happens to the fractional part? Is it discarded or ignored, and will there be a type conversion? Let's take a look:

ASM:

 mov DWORD PTR [ebp-4], 2

The compiler itself dropped the fractional part as superfluous.

What happens if you write like this: int var = 2 + 3;

ASM:

 mov DWORD PTR [ebp-4], 5

We learn that the compiler can evaluate constant expressions itself. In this case, since 2 and 3 are constants, their sum can be computed at compile time. So there is no need to compute such constants by hand; the compiler can do the work for you. For example, converting hours to seconds can be written as hours * 60 * 60, although operations on constants declared in the code would make a better example.

What happens if we write this code:

 int a = 1; int b = a * 2;

 mov DWORD PTR [ebp-8], 1
 mov eax, DWORD PTR [ebp-8]
 add eax, eax
 mov DWORD PTR [ebp-4], eax

Interesting, isn't it? The compiler decided not to use a multiplication instruction and simply added the number to itself, which is the same as multiplying by 2. (I will not describe these lines in detail; you should be able to understand them from the previous material.)

You may have heard that multiplication takes longer than addition. This is why the compiler optimizes such simple things.

But let's complicate the task and write it like this:

 int a = 1; int b = a * 3;

ASM

 mov DWORD PTR [ebp-8], 1
 mov edx, DWORD PTR [ebp-8]
 mov eax, edx
 add eax, eax
 add eax, edx
 mov DWORD PTR [ebp-4], eax

Do not be put off by the new register edx; it is no worse than eax or ebx. It may take a moment, but you should see that the value 1 is placed in edx, then copied to eax, after which eax is added to itself and then the value from edx is added once more. So we get 1 + 1 + 1.

Of course, the compiler will not keep doing this forever. Already at * 4 it produces the following:

 mov DWORD PTR [ebp-8], 1
 mov eax, DWORD PTR [ebp-8]
 sal eax, 2
 mov DWORD PTR [ebp-4], eax
 mov eax, 0

So we have a new instruction, sal. What does it do? It is a bitwise left shift, equivalent to the following C code:

 int a = 1; int b = a << 2;

For those who are not sure how this operator works:

0001 is shifted to the left by two positions (in other words, two zeros are appended on the right): 0100 (i.e. 4 in decimal). In essence, a left shift by 2 bits is a multiplication by 4.

Amusingly, if you multiply by 5, the compiler emits one sal and one add. You can try different numbers yourself.

At 22 the compiler on godbolt.org gives up and uses a multiplication, but below that number it wriggles out in a variety of ways. It even uses subtraction and some other instructions that we have not discussed yet.

Well, that was just the warm-up. What do you think about the following code:

 int a = 2; int b = a / 2;

If you expect subtraction, then alas, no. The compiler uses more sophisticated methods. Division is even slower than multiplication, so the compiler wriggles out here too:

 mov DWORD PTR [ebp-4], 2
 mov eax, DWORD PTR [ebp-4]
 mov edx, eax
 shr edx, 31
 add eax, edx
 sar eax
 mov DWORD PTR [ebp-8], eax

Note that for this code I chose a much later compiler version (gcc 7.2); until now I used gcc 4.4.7. For the earlier examples there were no significant differences, but for this one the two versions use different instructions in the fifth line of the code. And the code generated by 7.2 is easier for me to explain right now.

Note that the variable a is now on the stack at offset 4 rather than 8, and then forget about this minor difference at once. The key part starts with mov edx, eax, but let's skip the meaning of that line for now. The shr instruction performs a bitwise right shift (which would be a division by 2 if it were shr edx, 1). Here some readers may wonder: why not simply write shr edx, 1, since that is what the C code does? But it is not that simple.

Let's make a small change and see what it affects. Our code performs integer division: since the variable a has an integer type and 2 is a constant of type int, by the rules of C the result cannot be fractional. That is good, because integer division is faster and simpler. But our numbers are signed, and for a negative number a plain right shift does not give the correct answer: shr does not preserve the sign at all, and even a sign-preserving shift rounds down, so for an odd negative number the result is off by one from what C requires (C rounds toward zero). If we replace signed division with unsigned:

 unsigned int a = 2; unsigned int b = a / 2;

Then we get what we expected: a single shr. Note that godbolt omits the operand 1 in the shr instruction; NASM will not assemble that form, but the 1 is implied. Change 2 to 4 and you will see 2 as the second operand.

Now look at the previous code. In it we see sar eax, which is the same as shr but for signed numbers: it preserves the sign bit. The rest of the code accounts for that off-by-one when we divide a negative number (or divide by a negative number, although the code then changes a little). If you know how negative numbers are represented in a computer, it will not be hard to guess why we shift right by 31 bits and add the result to the original number.

Division by larger numbers is, if anything, simpler. There the division is replaced by a multiplication, with a precomputed constant as the second operand. If you are curious how, you can puzzle over it yourself; there is nothing difficult in it. You just need to understand how real numbers are represented in memory.

Conclusion

There is more than enough material for a first article. It is time to wrap up and summarize. We got acquainted with basic assembler syntax, found out that the compiler can perform the simplest optimizations in calculations, and saw the difference between register and stack variables, among other things. This was an introductory article, so I had to spend a lot of time on obvious things, but they are not obvious to everyone. In future articles we will explore more subtleties of the C language.

Part 2

Source: https://habr.com/ru/post/344896/
