A beginner's guide to x86 assembly

Nowadays it is rarely necessary to write pure assembly, but I definitely recommend trying it to anyone interested in programming. You will see things from a different angle, and the skills come in handy when debugging code in other languages.

In this article we will write a reverse Polish notation (RPN) calculator from scratch in pure x86 assembly. When we are done, we can use it like this:

 $ ./calc "32+6*" # "(3+2)*6"
 30

All the code for this article is here. It is heavily commented and can serve as study material for those who already know assembly.
Let's start by writing a basic Hello world! program to check the environment setup. Then we move on to system calls, the call stack, stack frames, and the x86 calling convention. After that, for practice, we write some basic functions in x86 assembly, and then start on the RPN calculator.

It is assumed that the reader has some C programming experience and basic knowledge of computer architecture (for example, what a processor register is). Since we will use Linux, you should also be comfortable with the Linux command line.

Environment setup

As already mentioned, we use Linux (64- or 32-bit). The code in this article does not work on Windows or Mac OS X.

You only need two tools: the GNU ld linker from binutils, which is preinstalled on most distributions, and the NASM assembler. On Ubuntu and Debian you can install both with one command:

 $ sudo apt-get install binutils nasm

I would also recommend keeping the ASCII table handy.

Hello, world!

To test your environment, save the following code in the calc.asm file:

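 ; Make the _start symbol visible to the linker: it is the entry point
 ; of the program.
 global _start

 ; The .rodata section holds constants (read-only data). We never
 ; modify the message, so it goes here.
 section .rodata
     ; Declare some bytes at the label hello_world. The NASM db directive
     ; accepts single byte values, string literals, or a mix of both.
     ; 0xA = newline, 0x0 = string-terminating null byte
     hello_world: db "Hello world!", 0xA, 0x0

 ; The .text section holds the program code.
 section .text
 _start:
     mov eax, 0x04           ; store the number 4 in eax (0x04 = write())
     mov ebx, 0x1            ; file descriptor (1 = standard output, 2 = standard error)
     mov ecx, hello_world    ; pointer to the string
     mov edx, 14             ; length of the string
     int 0x80                ; raise interrupt 0x80, which the kernel
                             ; interprets as a system call
     mov eax, 0x01           ; 0x01 = exit()
     mov ebx, 0              ; 0 = no errors
     int 0x80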

The comments explain the general structure. For a list of registers and common instructions, see the University of Virginia's x86 Assembly Guide. It will be even more useful once we get to system calls.

The following commands assemble the source file into an object file and then link it into an executable:

 $ nasm -f elf32 calc.asm -o calc.o
 $ ld -m elf_i386 calc.o -o calc

After launch, you should see:

 $ ./calc
 Hello world!

Makefile

This part is optional, but to simplify assembling and linking later on, you can create a Makefile. Save it in the same directory as calc.asm:

 CFLAGS= -f elf32
 LFLAGS= -m elf_i386

 all: calc

 calc: calc.o
 	ld $(LFLAGS) calc.o -o calc

 calc.o: calc.asm
 	nasm $(CFLAGS) calc.asm -o calc.o

 clean:
 	rm -f calc.o calc

 .INTERMEDIATE: calc.o

Then, instead of the above instructions, just run make.

System calls

Linux system calls tell the OS to do something for us. In this article, we use only two system calls: write() to write a string to a file or stream (in our case, this is standard output and standard error) and exit() to exit the program:

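 syscall 0x01: exit(int error_code)
   error_code - 0 to exit without errors, any other value (such as 1) to signal an error

 syscall 0x04: write(int fd, char *string, int length)
   fd - 1 for standard output, 2 for standard error
   string - pointer to the first character of the string
   length - length of the string in bytes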

A system call is set up by storing the system call number in the eax register and its arguments in ebx, ecx and edx, in that order. Note that exit() takes only one argument, so in that case the values of ecx and edx do not matter.

eax	ebx	ecx	edx
System call number	arg1	arg2	arg3
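
For comparison, here is a minimal C sketch of the same write system call, issued through the libc syscall() wrapper instead of int 0x80. The helper name write_string is invented for this illustration. The constant SYS_write is used rather than a literal number because system call numbers depend on the architecture: write is number 4 in the 32-bit x86 table used in this article, but number 1 on x86-64.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated string to a file descriptor
 * with the raw write system call. Returns the number of bytes written,
 * or -1 on error. */
long write_string(int fd, const char *s) {
    assert(s != NULL);
    /* Arguments go in the same order as ebx, ecx, edx in the table above:
     * file descriptor, pointer to the string, length. */
    return syscall(SYS_write, fd, s, strlen(s));
}
```

Calling write_string(1, "Hello world!\n") should print the greeting to standard output, much like the assembly version does.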

Call stack

The call stack is a data structure that stores information about each function call. Each call has its own section of the stack, called a frame. It stores information about the current call: the local variables of the function and the return address (where the program should continue after the function finishes).

One non-obvious point up front: the stack grows down through memory. When you add something to the top of the stack, it is placed at a lower memory address than the previous item. In other words, as the stack grows, the address of the top of the stack decreases. To avoid confusion, I will keep reminding you of this.

The push instruction puts something on top of the stack, and pop takes data off it. For example, push eax allocates space at the top of the stack and stores the value of the eax register there, while pop eax moves the data at the top of the stack into eax and frees that area of memory.

The purpose of the esp register is to point to the top of the stack. Any data past the top of the stack (at addresses lower than esp) is considered not to be on the stack: it is garbage. Executing a push or pop instruction moves esp. You can also manipulate esp directly, as long as you know what you are doing.
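
The effect of push and pop on esp can be modelled in a few lines of C. This is only a sketch under simplifying assumptions: the byte array memory, the variable sim_esp and the helpers sim_push and sim_pop are names invented for this illustration, the simulated stack is just 64 bytes, and every value is 4 bytes wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

static uint8_t memory[STACK_BYTES];     /* simulated stack memory */
static uint32_t sim_esp = STACK_BYTES;  /* empty stack: esp is just past the end */

/* push: move esp DOWN by 4 bytes, then store the value at the new top */
void sim_push(uint32_t value) {
    assert(sim_esp >= 4);               /* no room left otherwise */
    sim_esp -= 4;
    memcpy(&memory[sim_esp], &value, 4);
}

/* pop: read the value at the top, then move esp UP by 4 bytes */
uint32_t sim_pop(void) {
    uint32_t value;
    assert(sim_esp + 4 <= STACK_BYTES); /* the stack must not be empty */
    memcpy(&value, &memory[sim_esp], 4);
    sim_esp += 4;
    return value;
}
```

After sim_push(10) the simulated esp drops from 64 to 60, and a matching sim_pop() returns 10 and brings it back to 64: growing the stack lowers the address of its top.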

The ebp register is similar to esp, except that it always points roughly to the middle of the current stack frame, just before the local variables of the current function (more on this later). However, calling another function does not move ebp automatically; you have to do that manually each time.

x86 calling convention

x86 has no built-in notion of a function as found in high-level languages. The call instruction is essentially a jmp (goto) to another memory address that first saves a return address on the stack. To use subroutines the way functions are used in other languages (taking arguments and returning data), you need to follow a calling convention. There are many conventions, but we use CDECL, the most popular convention for x86 among C compilers and assembly programmers. It also ensures that a subroutine's registers are not clobbered when it calls another function.

Caller rules

Before calling a function, the caller must:

1. Save the caller-saved registers on the stack. The called function is allowed to change these registers, so to avoid losing their contents the caller must push them onto the stack first. They are eax, ecx and edx. Registers you are not using do not need to be saved.
2. Push the function arguments onto the stack in reverse order (the last argument first, the first argument last). This order ensures that the called function finds its arguments on the stack in the correct order.
3. Call the subroutine.

The function stores its return value, if any, in eax. Immediately after the call, the caller must:

1. Remove the function arguments from the stack. This is usually done by simply adding the number of bytes to esp. Remember that the stack grows down, so removing items from the stack means adding to esp.
2. Restore the saved registers by popping them off the stack in reverse order. The called function will not change any other registers.

The following example demonstrates how these rules are applied. Suppose that the _subtract function takes two integer (4-byte) arguments and returns the first argument minus the second. In the _mysubroutine subroutine we call _subtract with the arguments 10 and 2:

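 _mysubroutine:
     ; ...
     ; some code here
     ; ...
     push ecx        ; save the caller-saved registers (eax is not saved here)
     push edx
     push 2          ; rule 2: push the arguments in reverse order
     push 10
     call _subtract  ; eax is now 10-2=8
     add esp, 8      ; remove 8 bytes from the stack (two 4-byte arguments)
     pop edx         ; restore the saved registers
     pop ecx
     ; ...
     ; some code that uses the result in eax
     ; ...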

Callee rules

Before doing its work, the called subroutine must:

1. Save the base pointer ebp of the previous frame by pushing it onto the stack.
2. Move ebp from the previous frame to the current frame (the current value of esp).
3. Allocate stack space for local variables, if needed, by moving the esp pointer. Since the stack grows down, this means subtracting the required number of bytes from esp.
4. Save the callee-saved registers on the stack. These are ebx, edi and esi. Registers that the subroutine does not change do not need to be saved.

Call stack after step 1:

Call stack after step 2:

Call stack after step 4:

In these diagrams, each stack frame contains a return address. It is pushed onto the stack automatically by the call instruction, and the ret instruction pops the address from the top of the stack and jumps to it. We never touch the return address ourselves; it is shown only to explain why the local variables of a function start 4 bytes above ebp while the function arguments start 8 bytes below ebp.

The last diagram also shows that the local variables of a function always start 4 bytes above ebp, at address ebp-4 (subtraction, because we move up the stack), and the function arguments always start 8 bytes below ebp, at address ebp+8 (addition, because we move down the stack). If you follow the rules of this convention, this holds for the variables and arguments of any function.
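
The ebp+8 and ebp+12 offsets can be checked with a small C model of a CDECL call. This is a sketch, not real machine behaviour: the array mem, the variables sim_esp and sim_ebp, the helper functions and the fake return address 0xDEADBEEF are all assumptions made for this illustration. Each helper mirrors one instruction, named in the comment beside it.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 32                        /* simulated stack: 32 four-byte slots */

static uint32_t mem[WORDS];
static uint32_t sim_esp = WORDS * 4;    /* byte offset of the top of the stack */
static uint32_t sim_ebp = 0;

static void push(uint32_t v) { assert(sim_esp >= 4); sim_esp -= 4; mem[sim_esp / 4] = v; }
static uint32_t pop(void) { assert(sim_esp + 4 <= WORDS * 4); sim_esp += 4; return mem[sim_esp / 4 - 1]; }
static uint32_t load(uint32_t addr) { assert(addr % 4 == 0 && addr < WORDS * 4); return mem[addr / 4]; }

/* Callee: prologue, body and epilogue of a function that follows the convention */
static uint32_t sim_subtract(void) {
    push(sim_ebp);                      /* push ebp          */
    sim_ebp = sim_esp;                  /* mov ebp, esp      */
    uint32_t eax = load(sim_ebp + 8);   /* mov eax, [ebp+8]  */
    eax -= load(sim_ebp + 12);          /* sub eax, [ebp+12] */
    sim_ebp = pop();                    /* pop ebp           */
    (void)pop();                        /* ret: pops the return address */
    return eax;
}

/* Caller: push the arguments in reverse order, "call", then clean up */
uint32_t sim_call_subtract(uint32_t a, uint32_t b) {
    push(b);                            /* the last argument goes first */
    push(a);
    push(0xDEADBEEF);                   /* call: pushes a (fake) return address */
    uint32_t eax = sim_subtract();
    sim_esp += 8;                       /* add esp, 8 */
    return eax;
}
```

In this model sim_call_subtract(10, 2) yields 8, and the simulated esp ends up where it started. The saved ebp sits at [ebp] and the return address at [ebp+4], which is why the first argument is found at ebp+8.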

When the function is finished and you want to return, first set eax to the function's return value, if there is one. Then you need to:

1. Restore the saved registers by popping them off the stack in reverse order.
2. Free the stack space allocated for local variables in step 3, if any. This is done by simply setting esp to ebp.
3. Restore the previous frame's base pointer ebp by popping it off the stack.
4. Return with ret.

Now let's implement the _subtract function from our example:

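 _subtract:
     push ebp            ; save the previous frame's base pointer
     mov ebp, esp        ; set ebp for the current frame
     ; Here we would allocate space for local variables, if we had any,
     ; and save the callee-saved registers, if we changed any.
     ; This function needs neither.

     mov eax, [ebp+8]    ; copy the first argument into eax;
                         ; the first argument is always at ebp+8
     sub eax, [ebp+12]   ; subtract the second argument, at ebp+12,
                         ; from the first

     ; The return value is already in eax.
     ; There are no callee-saved registers to restore
     ; and no local variables to free.
     pop ebp             ; restore the previous frame's base pointer
     ret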

Entering and exiting

In the example above you may notice that a function always starts the same way: push ebp, then mov ebp, esp, then allocating memory for local variables. The x86 instruction set has a convenient instruction that does all of this: enter a, b, where a is the number of bytes to allocate for local variables and b is the “nesting level”, which we will always set to 0. Likewise, a function always ends with the instructions mov esp, ebp and pop ebp (the first is strictly needed only when memory was allocated for local variables, but it never hurts). These can also be replaced with a single instruction: leave. Let's make the changes:

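 _subtract:
     enter 0, 0          ; save the previous frame's base pointer and set ebp
     ; No local variables to allocate and no callee-saved registers to save.

     mov eax, [ebp+8]    ; copy the first argument into eax;
                         ; the first argument is always at ebp+8
     sub eax, [ebp+12]   ; subtract the second argument, at ebp+12,
                         ; from the first

     ; The return value is already in eax.
     ; There are no callee-saved registers to restore.
     leave               ; restore the previous frame's base pointer
     ret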

Writing some basic functions

Having mastered the calling convention, we can start writing some subroutines. Why not generalize the code that prints "Hello world!" so that it can print any string? That gives us the _print_msg function.

Here we need another function _strlen to calculate the length of the string. In C, it may look like this:

 size_t strlen(char *s) {
     size_t length = 0;
     while (*s != 0) {   // start of the loop
         length++;
         s++;
     }                   // end of the loop
     return length;
 }

In other words, starting from the beginning of the string, we add 1 to the return value for every character that is not zero. As soon as we see the null character, we return the accumulated value. In assembly this is also quite simple, and we can use the _subtract function written earlier as a base:

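 _strlen:
     enter 0, 0              ; save the previous frame's base pointer and set ebp
     ; No local variables to allocate and no callee-saved registers to save.

     mov eax, 0              ; length = 0
     mov ecx, [ebp+8]        ; copy the first argument (the pointer to the
                             ; string) into ecx; ecx is caller-saved, so we
                             ; do not need to preserve it
 _strlen_loop_start:         ; label marking the start of the loop
     cmp byte [ecx], 0       ; dereference the pointer and compare with zero.
                             ; The operand size must be stated explicitly:
                             ; byte makes the comparison read one character
                             ; (a single byte) rather than 32 bits (4 bytes)
     je _strlen_loop_end     ; leave the loop at the null byte
     inc eax                 ; otherwise add 1 to the return value
     add ecx, 1              ; advance to the next character
     jmp _strlen_loop_start  ; jump back to the top of the loop
 _strlen_loop_end:
     ; The return value is already in eax.
     ; There are no callee-saved registers to restore.
     leave                   ; restore the previous frame's base pointer
     ret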

Not bad, right? Writing the C code first can help, because most of it translates directly into assembly. Now we can use this function in _print_msg, where we apply everything we have learned:

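 _print_msg:
     enter 0, 0
     ; no local variables
     mov eax, 0x04       ; 0x04 = the write() system call
     mov ebx, 0x1        ; 0x1 = standard output
     mov ecx, [ebp+8]    ; the string to print is the first argument
     ; edx must hold the length of the string, so call _strlen
     push eax            ; save the caller-saved registers (edx is not saved)
     push ecx
     push dword [ebp+8]  ; pass the argument of _print_msg on to _strlen.
                         ; NASM cannot infer the size of a memory operand
                         ; here, so we state it: dword (4 bytes, 32 bits)
     call _strlen        ; eax now holds the length of the string
     mov edx, eax        ; move the length into edx, where write() expects it
     add esp, 4          ; remove 4 bytes from the stack (one 4-byte char*)
     pop ecx             ; restore the caller-saved registers
     pop eax
     ; _strlen is done and all registers are set up: make the system call
     int 0x80
     leave
     ret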

And let's see the fruits of our hard work by using this function in a complete “Hello, world!” program.

 _start:
     enter 0, 0
     ; save the caller-saved registers (we choose not to save any here)
     push hello_world    ; add the argument for _print_msg
     call _print_msg
     mov eax, 0x01       ; 0x01 = exit()
     mov ebx, 0          ; 0 = no errors
     int 0x80

Believe it or not, we have covered all the main topics needed to write basic programs in x86 assembly! Now that the introductory material and theory are behind us, we will concentrate entirely on the code and apply this knowledge to writing our RPN calculator. The functions will be much longer and will even use some local variables. If you want to see the finished program right away, here it is.

For those who are not familiar with reverse Polish notation (sometimes called postfix notation), expressions in it are evaluated using a stack. So we need to create a stack, along with _pop and _push functions for manipulating it. We will also need a _print_answer function, which prints a string representation of the numerical result at the end of the calculation.

Stack creation

First we reserve space in memory for our stack, along with the global variable stack_size. We need to modify these variables, so they go in the .data section rather than .rodata.

 section .data
     stack_size: dd 0        ; a dword (4-byte) variable with the initial value 0
     stack: times 256 dd 0   ; the stack itself: 256 dwords, all zero

Now we can implement _push and _pop:

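 _push:
     enter 0, 0
     ; save the registers that we are going to use
     push eax
     push edx
     mov eax, [stack_size]
     mov edx, [ebp+8]
     mov [stack + 4*eax], edx    ; store the argument on the stack. The index
                                 ; is scaled by 4 because each item is a dword
     inc dword [stack_size]      ; add 1 to stack_size
     ; restore the registers we used
     pop edx
     pop eax
     leave
     ret

 _pop:
     enter 0, 0
     ; no registers to save
     dec dword [stack_size]      ; first subtract 1 from stack_size
     mov eax, [stack_size]
     mov eax, [stack + 4*eax]    ; load the item at the top of the stack into eax
     ; the return value is in eax and there is nothing to restore
     leave
     ret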
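
For reference, a C sketch of roughly what the two routines above do. The names rpn_push, rpn_pop and STACK_CAPACITY are chosen for this illustration, and the assert bounds checks are an addition: the assembly version does not check for overflow or underflow.

```c
#include <assert.h>

#define STACK_CAPACITY 256

int stack[STACK_CAPACITY];  /* same capacity as the assembly version */
int stack_size = 0;

/* Store the value in the first free slot, then grow the stack by one */
void rpn_push(int value) {
    assert(stack_size < STACK_CAPACITY);    /* not checked in the assembly */
    stack[stack_size] = value;              /* mov [stack + 4*eax], edx */
    stack_size++;                           /* inc dword [stack_size]   */
}

/* Shrink the stack by one, then return the item that was on top */
int rpn_pop(void) {
    assert(stack_size > 0);                 /* not checked in the assembly */
    stack_size--;                           /* dec dword [stack_size]   */
    return stack[stack_size];               /* mov eax, [stack + 4*eax] */
}
```

Note that stack_size always indexes the first free slot, which is why the push stores before incrementing and the pop decrements before loading.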

Printing numbers

_print_answer is much harder: we have to convert numbers to strings, and that takes several other functions. We need a _putc function, which prints a single character, a _mod function to compute the remainder of dividing one argument by the other, and _pow_10 to raise 10 to a power. You will see later why they are needed. They are pretty simple, here is the code:

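 _pow_10:
     enter 0, 0
     mov ecx, [ebp+8]    ; set ecx (caller-saved) to the function argument
     mov eax, 1          ; the first power of 10 (10**0 = 1)
 _pow_10_loop_start:     ; multiply eax by 10 until ecx reaches 0
     cmp ecx, 0
     je _pow_10_loop_end
     imul eax, 10
     sub ecx, 1
     jmp _pow_10_loop_start
 _pow_10_loop_end:
     leave
     ret

 _mod:
     enter 0, 0
     push ebx
     mov edx, 0          ; explained below
     mov eax, [ebp+8]
     mov ebx, [ebp+12]
     idiv ebx            ; divide the 64-bit integer [edx:eax] by ebx. We only
                         ; want to divide the 32-bit eax, so edx is set to
                         ; zero. The quotient is stored in eax and the
                         ; remainder in edx.
     mov eax, edx        ; return the remainder
     pop ebx
     leave
     ret

 _putc:
     enter 0, 0
     mov eax, 0x04       ; write()
     mov ebx, 1          ; standard output
     lea ecx, [ebp+8]    ; address of the character to print
     mov edx, 1          ; print only 1 character
     int 0x80
     leave
     ret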

So how do we print the individual digits of a number? First, note that the last digit of a number is the remainder of dividing it by 10 (for example, 123 % 10 = 3), and the next digit is the remainder of dividing by 100, divided by 10 (for example, (123 % 100)/10 = 2). In general, a specific digit of a number (counting from right to left) can be found as (number % 10**n) / 10**(n-1), where the units digit is n = 1, the tens digit is n = 2, and so on.

With this we can find all the digits of a number from n = 1 to n = 10 (the maximum number of digits in a signed 4-byte integer). But it is much easier to go from left to right: that way we can print each character as soon as we find it and skip the zeros on the left. So we go through the digits from n = 10 down to n = 1.
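
The digit formula can be tried out in isolation with a short C sketch. The names pow_10_64 and digit_at are invented for this illustration. One assumption differs from the 32-bit code in this article: the powers of ten are computed in 64 bits, because 10**10 does not fit in a 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* 10 raised to the power n, in 64 bits so that 10**10 fits */
int64_t pow_10_64(int n) {
    assert(n >= 0 && n <= 18);      /* 10**18 is the largest power below 2**63 */
    int64_t result = 1;
    while (n-- > 0) {
        result *= 10;
    }
    return result;
}

/* n-th decimal digit of a non-negative number, counted from the right
 * (n = 1 is the units digit, n = 2 the tens digit, and so on) */
int digit_at(int32_t a, int n) {
    assert(a >= 0 && n >= 1 && n <= 18);
    return (int)((a % pow_10_64(n)) / pow_10_64(n - 1));
}
```

For 123 this gives 3 for n = 1, 2 for n = 2, 1 for n = 3 and 0 for every larger n, which is where the zeros on the left come from.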

In C, the program looks something like this:

 #define MAX_DIGITS 10
 void print_answer(int a) {
     if (a < 0) {        // if the number is negative
         putc('-');      // print a minus sign
         a = -a;         // and make the number positive
     }
     int started = 0;
     for (int i = MAX_DIGITS; i > 0; i--) {
         int digit = (a % pow_10(i)) / pow_10(i-1);
         if (digit == 0 && started == 0) continue; // skip leading zeros
         started = 1;
         putc(digit + '0');
     }
 }

Now it is clear why we need those three functions. In assembly:

 %define MAX_DIGITS 10

 _print_answer:
     enter 4, 0          ; reserve 4 bytes for the local variable "started"
                         ; from the C code, so that [ebp-4] does not overlap
                         ; the registers pushed below
     push ebx
     push edi
     push esi
     mov eax, [ebp+8]    ; our argument "a"
     cmp eax, 0          ; if the number is not negative, skip this
                         ; conditional block
     jge _print_answer_negate_end
     ; call putc for '-'
     push eax
     push 0x2d           ; the '-' character
     call _putc
     add esp, 4
     pop eax
     neg eax             ; make the number positive
 _print_answer_negate_end:
     mov byte [ebp-4], 0 ; started = 0
     mov ecx, MAX_DIGITS ; the loop variable i
 _print_answer_loop_start:
     cmp ecx, 0
     je _print_answer_loop_end
     ; Call pow_10 for ecx. ebx will become the "digit" variable from the
     ; C code. To begin with, edx = pow_10(i-1) and ebx = pow_10(i).
     push eax
     push ecx
     dec ecx             ; i-1
     push ecx            ; first argument for _pow_10
     call _pow_10
     mov edx, eax        ; edx = pow_10(i-1)
     add esp, 4
     pop ecx             ; restore i in ecx
     pop eax
     ; end pow_10 call
     mov ebx, edx        ; digit = ebx = pow_10(i-1)
     imul ebx, 10        ; digit = ebx = pow_10(i)
                         ; note: for i = 10 this product overflows 32 bits,
                         ; so values of 1410065408 and above print incorrectly
     ; call _mod for (a % pow_10(i)), that is (eax mod ebx)
     push eax
     push ecx
     push edx
     push ebx            ; arg2, ebx = digit = pow_10(i)
     push eax            ; arg1, eax = a
     call _mod
     mov ebx, eax        ; digit = ebx = a % pow_10(i), almost there
     add esp, 8
     pop edx
     pop ecx
     pop eax
     ; end mod call
     ; Divide ebx (the "digit" variable) by pow_10(i-1) (edx). A few
     ; registers must be saved first, because idiv uses both edx and eax.
     ; Since edx holds the divisor, it has to move to another register.
     push esi
     mov esi, edx
     push eax
     mov eax, ebx
     mov edx, 0
     idiv esi            ; eax holds the quotient (the digit)
     mov ebx, eax        ; ebx = (a % pow_10(i)) / pow_10(i-1),
                         ; the "digit" variable from the C code
     pop eax
     pop esi
     ; end division
     cmp ebx, 0          ; if digit == 0
     jne _print_answer_trailing_zeroes_check_end
     cmp byte [ebp-4], 0 ; if started == 0
     jne _print_answer_trailing_zeroes_check_end
     jmp _print_answer_loop_continue     ; continue
 _print_answer_trailing_zeroes_check_end:
     mov byte [ebp-4], 1 ; started = 1
     add ebx, 0x30       ; digit + '0'
     ; call putc
     push eax
     push ecx
     push edx
     push ebx
     call _putc
     add esp, 4
     pop edx
     pop ecx
     pop eax
     ; end putc call
 _print_answer_loop_continue:
     sub ecx, 1
     jmp _print_answer_loop_start
 _print_answer_loop_end:
     pop esi
     pop edi
     pop ebx
     leave
     ret

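That was the hardest part! If you are now wondering “why not just call printf("%d")?”, replacing _print_answer with printf is one of the exercises at the end of the article.

All that remains is _start, and the calculator is done!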

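Reverse Polish notation is evaluated with a stack: each digit that is read is pushed onto the stack, and each operator is applied to the two values on top of the stack, with the result pushed back.

For example, the expression 84/3+6* (which can also be written as 6384/+*) is evaluated as follows: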

Step	Symbol	Stack before	Stack after
1	8	[]	[8]
2	4	[8]	[8, 4]
3	/	[8, 4]	[2]
4	3	[2]	[2, 3]
5	+	[2, 3]	[5]
6	6	[5]	[5, 6]
7	*	[5, 6]	[30]

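If the input is a valid expression, exactly one item remains on the stack at the end, and that item is the answer. In this case it is 30.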

In C, the program looks something like this:

 int stack[256];         // 256 items should be enough for our purposes
 int stack_size = 0;

 int main(int argc, char *argv[]) {
     char *input = argv[1];
     size_t input_length = strlen(input);

     for (int i = 0; i < input_length; i++) {
         char c = input[i];
         if (c >= '0' && c <= '9') {     // if the character is a digit
             push(c - '0');              // convert it to a number and push it
         } else {
             int b = pop();
             int a = pop();
             if (c == '+') {
                 push(a+b);
             } else if (c == '-') {
                 push(a-b);
             } else if (c == '*') {
                 push(a*b);
             } else if (c == '/') {
                 push(a/b);
             } else {
                 error("Invalid input\n");
                 exit(1);
             }
         }
     }

     if (stack_size != 1) {
         error("Invalid input\n");
         exit(1);
     }

     print_answer(stack[0]);
     exit(0);
 }

Now we have everything needed to implement this in assembly:

 _start:
     ; _start is not called according to the calling convention, so its
     ; arguments are laid out differently: esp points directly at argc (the
     ; number of arguments) and esp+4 at argv. So esp+4 holds the pointer
     ; to the program name, and esp+8 the pointer to the first argument.
     mov esi, [esp+8]    ; esi = "input" = argv[1]
     ; call _strlen to get the length of the input
     push esi
     call _strlen
     mov ebx, eax        ; ebx = input_length
     add esp, 4
     ; end _strlen call
     mov ecx, 0          ; ecx = "i"
 _main_loop_start:
     cmp ecx, ebx        ; if (i >= input_length)
     jge _main_loop_end
     mov edx, 0
     mov dl, [esi + ecx] ; load one byte from memory into the low byte of
                         ; edx. The rest of edx was zeroed above.
                         ; edx = c = input[i]
     cmp edx, '0'
     jl _check_operator
     cmp edx, '9'
     jg _print_error
     sub edx, '0'
     mov eax, edx        ; eax = c - '0' (the value of the digit, not the character)
     jmp _push_eax_and_continue
 _check_operator:
     ; call _pop twice to get b into edi and a into eax
     push ecx
     push ebx
     call _pop
     mov edi, eax        ; edi = b
     call _pop           ; eax = a
     pop ebx
     pop ecx
     ; end call _pop
     cmp edx, '+'
     jne _subtract
     add eax, edi        ; eax = a+b
     jmp _push_eax_and_continue
 _subtract:
     cmp edx, '-'
     jne _multiply
     sub eax, edi        ; eax = a-b
     jmp _push_eax_and_continue
 _multiply:
     cmp edx, '*'
     jne _divide
     imul eax, edi       ; eax = a*b
     jmp _push_eax_and_continue
 _divide:
     cmp edx, '/'
     jne _print_error
     push edx            ; save edx, because idiv uses it
     cdq                 ; sign-extend eax into edx:eax, so that a
                         ; negative a is divided correctly
     idiv edi            ; eax = a/b
     pop edx
     ; eax now holds the result, so fall through and push it
 _push_eax_and_continue:
     ; call _push
     push eax
     push ecx
     push edx
     push eax            ; first argument
     call _push
     add esp, 4
     pop edx
     pop ecx
     pop eax
     ; end call _push
     inc ecx
     jmp _main_loop_start
 _main_loop_end:
     cmp dword [stack_size], 1   ; if (stack_size != 1), print an error
     jne _print_error
     mov eax, [stack]
     push eax
     call _print_answer
     ; print a final newline
     push 0xA
     call _putc
     ; exit successfully
     mov eax, 0x01       ; 0x01 = exit()
     mov ebx, 0          ; 0 = no errors
     int 0x80            ; execution ends here
 _print_error:
     push error_msg
     call _print_msg
     mov eax, 0x01
     mov ebx, 1
     int 0x80

We also need to add error_msg to the .rodata section:

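 section .rodata
     ; Declare some bytes at the label error_msg. The NASM db directive
     ; accepts single byte values, string literals, or a mix of both.
     ; 0xA = newline, 0x0 = string-terminating null byte
     error_msg: db "Invalid input", 0xA, 0x0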

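And we are done! After all this, you may have more appreciation for high-level languages, especially considering that entire programs used to be written in assembly, for example the original RollerCoaster Tycoon!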

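If you want to practice further, here are some exercises:

Print an error message instead of a segfault when the program is run without an argument.
Replace _strlen with the C standard library function, and _print_answer with a call to printf.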

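Further reading:

x86 Assembly Guide from the University of Virginia, the reference for registers and common instructions mentioned earlier.
NASM: Intel x86 Instruction Reference, a reference for x86 instructions.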

Source: https://habr.com/ru/post/423077/
