Learning MIPS assembler

As Wikipedia says, MIPS is a microprocessor architecture developed by MIPS Computer Systems (now MIPS Technologies) and first implemented in 1985. Many variants of this architecture exist, created specifically for 3D modeling, fast floating-point processing, and multi-threaded computation. Processors from this family are used in Cisco and Mikrotik routers, smartphones, tablets and game consoles.

MIPS instructions are simple to understand, which makes MIPS a good place to start learning assembler. So that is exactly what we will do now.

Program structure in MIPS assembler

Below is a classic MIPS assembler program.
Everything that starts with a dot is a directive. The .data directive marks the beginning of the data segment, and .text marks the beginning of the code segment.
Everything followed by a colon is a label (v:, main:, loop: and endloop:).
All text following the # sign is a comment.
What remains is instructions and pseudoinstructions (macros).

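    .data
    v:  .word -1, -2, -3, -4, -5, -6, -7, -8, -9, -10

    .text
    .globl main
    main:
        li    $t0, 0                # $t0 = 0 (variable a)
        li    $t1, 0                # $t1 = 0 (counter i)
        li    $t2, 10               # $t2 = 10 (count limit l)
    loop:
        slt   $t3, $t1, $t2         # $t3 = 1 if i < l, else 0
        beq   $t3, $zero, endloop   # leave the loop when i >= l
        la    $t3, v                # $t3 = address of the array v
        sll   $t4, $t1, 2           # $t4 = i * 4 (byte offset of v[i])
        addu  $t3, $t3, $t4         # $t3 = address of v[i]
        lw    $t3, 0($t3)           # $t3 = v[i]
        addu  $t0, $t0, $t3         # a = a + v[i]
        addiu $t1, $t1, 1           # i = i + 1
        b     loop
    endloop: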

Types in MIPS-assembler

Here is a comparison table of the main types in C++ and in MIPS:
Comparative table of types in C++ and MIPS

As you can see in the table, the choice of type for a variable in MIPS depends only on the amount of memory the variable will occupy. Note that in this respect MIPS does not distinguish between signed and unsigned variables.
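Because a data directive records only the size, signedness comes from the instruction that reads the value. The following is a minimal sketch, assuming the MARS simulator and a hypothetical label `val`. The `lb` and `lbu` load instructions are not covered in this article and appear here only for illustration:

```mips
        .data
val:    .byte 0xFF              # one byte; the directive says nothing about sign

        .text
        .globl main
main:
        la   $t0, val           # $t0 = address of val
        lb   $t1, 0($t0)        # sign-extending load: $t1 = 0xFFFFFFFF (-1)
        lbu  $t2, 0($t0)        # zero-extending load: $t2 = 0x000000FF (255)
        li   $v0, 10            # MARS service 10: exit
        syscall
```

The same byte in memory is read as -1 or as 255 depending only on which load instruction is used.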

Labels (symbols)

In the code above, we used several labels.
Labels (also called symbols) are used to give "names" to addresses in memory. They fall into two large classes: data labels (addresses of global variables in the .data section, in this case v:) and instruction labels (addresses of instructions in the .text section, for example main: and loop:).
The data in the .data section is usually stored in memory starting at address 0x10010000. Instructions are stored starting at address 0x00400000. Since each MIPS machine instruction occupies exactly 32 bits, the following label-address table holds for our program:
Table "label-address"

Labels make it very convenient to work with global variables and other data from .data, but more on that later.
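The label addresses for the example program can be worked out by hand. The listing below is a sketch that assumes the MARS default memory layout, and that `li` and `b` each assemble to one machine instruction while `la` expands to two. The addresses in the comments are derived under those assumptions and are not taken from the original table:

```mips
        .data
v:      .word -1, -2, -3, -4, -5, -6, -7, -8, -9, -10   # v = 0x10010000

        .text
        .globl main
main:                               # main = 0x00400000
        li    $t0, 0                # 0x00400000
        li    $t1, 0                # 0x00400004
        li    $t2, 10               # 0x00400008
loop:                               # loop = 0x0040000c
        slt   $t3, $t1, $t2         # 0x0040000c
        beq   $t3, $zero, endloop   # 0x00400010
        la    $t3, v                # 0x00400014 and 0x00400018 (two instructions)
        sll   $t4, $t1, 2           # 0x0040001c
        addu  $t3, $t3, $t4         # 0x00400020
        lw    $t3, 0($t3)           # 0x00400024
        addu  $t0, $t0, $t3         # 0x00400028
        addiu $t1, $t1, 1           # 0x0040002c
        b     loop                  # 0x00400030
endloop:                            # endloop = 0x00400034
```

In MARS, the Labels window and the Text Segment pane can be used to check these values after assembling.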

Basic Directives

We have already met several directives, namely .data and .text, and we know that the first is for storing data and declaring global variables, and the second is for the program code itself. Let's look at the rest of the MIPS directives:

```
 .globl sym 
```
declares the sym symbol global and allows you to access it from other files;
```
 .extern sym size 
```
declares that the data stored at sym is size bytes long, and makes sym a global label (see the previous directive);
```
 .ascii str 
```
stores the string str in memory without appending a null character (\0);
```
 .asciiz str 
```
stores the string str in memory and appends a null character (\0);
```
 .byte b1, b2, ..., bn 
```
successively stores bytes b1, b2, ..., bn ;
```
 .half h1, h2, ..., hn 
```
stores the 16-bit values h1, h2, ..., hn in memory consecutively;
```
 .word w1, w2, ..., wn 
```
stores the 32-bit values w1, w2, ..., wn in memory consecutively;
```
 .dword dw1, dw2, ..., dwn 
```
stores the 64-bit values dw1, dw2, ..., dwn in memory consecutively;
```
 .float f1, f2, ..., fn 
```
stores the single-precision floating-point numbers f1, f2, ..., fn in memory;
```
 .double d1, d2, ..., dn 
```
stores the double-precision floating-point numbers d1, d2, ..., dn in memory;
```
 .space n 
```
reserves n bytes in the current data segment;
```
 .align n 
```
aligns the following data on a 2^n-byte boundary.

Regarding the last directive: suppose we wrote .align 1 in .data. Then even if we store a 1-byte value at address 0x10010000, the next byte is left empty, and the next byte we write goes to address 0x10010002. In MIPS, automatic data alignment is enabled by default, so a 16-bit value (.half) can only be written to an even address (0x10010000 or 0x10010002, but not 0x10010003), a 32-bit value only to an address that is a multiple of 4, and a 64-bit value only to an address that is a multiple of 8.
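The effect of automatic and explicit alignment can be sketched with a small data segment. The labels are hypothetical, and the addresses in the comments are what one would expect under the MARS default layout with .data starting at 0x10010000:

```mips
        .data
b1:     .byte 1         # 0x10010000
h1:     .half 2         # auto-aligned to 0x10010002 (0x10010001 is padding)
b2:     .byte 3         # 0x10010004
w1:     .word 4         # auto-aligned to 0x10010008 (three padding bytes before it)
        .align 3        # move to the next multiple of 2^3 = 8 bytes
b3:     .byte 5         # 0x10010010
```

After assembling, the Data Segment pane or the Labels window in MARS shows where each label actually ended up.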

Data Format in `.data`

Data in .data is written in a fairly free form. You only need to specify the label, the data type and the value. The following code shows several examples of correct data declarations:

    .data
    var1:   .byte   'A', 0xF3, 127, -1, '\n'
    var2:   .half   -10, 0xffff
    var3:   .word   0x12345678
    var4:   .float  12.3, -0.1
    var5:   .double 1.5e-10
    var6:   .dword  0x1234567812345678
    str1:   .ascii  "i love mips\n"
    str2:   .asciiz "zero-finished string"
    array:  .space  100

We will look at data types in more depth as they come up in the code.

Registers

One of the main parts of a MIPS processor is its registers. A standard MIPS processor has 32 main registers and another 32 in coprocessor 1, the unit used for floating-point calculations. Each register is 32 bits wide, so a value of type int fits into one register entirely. To store a 64-bit variable of type long you must use two registers at once. Each register can be referred to by its number or by its conventional name; the name is a little more human-readable. The following registers are available:

$zero ($0) - always contains the value 0 and is read-only;
$at ($1) - temporary register reserved for the assembler;
$v0-$v1 ($2-$3) - for the results returned by functions;
$a0-$a3 ($4-$7) - for function arguments;
$t0-$t9 ($8-$15, $24-$25) - for temporary data, free to use as you like;
$s0-$s8 ($16-$23, $30) - for long-lived (saved) data, free to use as you like;
$k0-$k1 ($26-$27) - reserved for the operating system kernel;
$gp ($28) - global pointer, rarely used in practice;
$sp ($29) - stack pointer, always holds the address of the top of the stack;
$ra ($31) - return address, the point to which execution returns after the called function finishes;
$f0 - for floating-point results;
$f4, $f6, $f8, $f10, $f16, $f18 - for temporary floating-point data;
$f12, $f14 - for floating-point function parameters.
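The register conventions above can be seen together in a small function call. This is a sketch only: the function name `sum` is hypothetical, and the `jal`, `jr` and `move` instructions have not been introduced in this article yet:

```mips
        .text
        .globl main
main:
        li   $a0, 5             # first argument
        li   $a1, 7             # second argument
        jal  sum                # call sum; $ra receives the return address
        move $s0, $v0           # keep the result (12) in a saved register
        li   $v0, 10            # MARS service 10: exit
        syscall

sum:                            # hypothetical helper: returns $a0 + $a1
        addu $v0, $a0, $a1      # the return value goes in $v0
        jr   $ra                # jump back to the caller
```

Arguments travel in $a0-$a3, the result comes back in $v0, and $ra tells the function where to return.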

MIPS instructions

Note. From now on, we will study the MIPS processor, its instructions and extensions using MARS, an excellent MIPS simulator that is available for download. The implementation of MIPS in this simulator fully complies with the standards.

In the code at the beginning of the article, we identified all the functional parts of a program and defined instructions and pseudoinstructions as whatever is not a comment, a symbol (label), or a directive. Pseudoinstructions are also called macros; the assembler translates each of them into one or more real instructions when the code is assembled. Here is an example of a macro:

    la rdest, addr

expands to the instruction sequence:

    lui $at, hi(addr)
    ori rdest, $at, lo(addr)

As you can see, MIPS programs are always written one instruction per line.
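The same kind of expansion applies to other pseudoinstructions. As a sketch, loading a 32-bit constant with `li` can be written out by hand as a `lui`/`ori` pair. The assembler would use $at as the scratch register. This hand-written version reuses the destination register instead, an illustrative choice that avoids touching $at:

```mips
        .text
        .globl main
main:
        li   $t0, 0x12345678    # pseudoinstruction: the assembler expands it
        lui  $t1, 0x1234        # upper 16 bits: $t1 = 0x12340000
        ori  $t1, $t1, 0x5678   # lower 16 bits: $t1 = 0x12345678
                                # $t0 and $t1 now hold the same value
        li   $v0, 10            # MARS service 10: exit
        syscall
```

In MARS, the Text Segment pane shows the source line next to the basic instructions it was expanded into.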

Types of instructions

There are three main types of MIPS assembler instructions:

Type R (register). Three registers are used as operands: the destination register ($rd), the first argument ($rs), and the second argument ($rt). An example of such an instruction is the addition of two registers:
```
 add $t2, $t0, $t1 
```
Here the sum of the values in $t0 and $t1 is placed in $t2.
Type I (immediate). The operands are two registers and a number. An example of a type I instruction:
```
 addi $t3, $t2, 12 
```
After execution, the sum of $t2 and the number 12 is placed in register $t3.
Type J (jump). The only operand is the 26-bit address to jump to. The instruction
```
 j 128 
```
jumps to address 128 in .text.

There are also instructions for coprocessors, but we will look at them later.
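To make the R and I formats more concrete, the two example instructions can be encoded by hand. The field layout in the comments is the standard MIPS32 one, which this article does not spell out, and the register numbers come from the register list ($t0 = 8, $t1 = 9, $t2 = 10, $t3 = 11). Treat it as an illustrative sketch:

```mips
        .text
        # R-type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
        # add $t2, $t0, $t1
        #   000000 01000 01001 01010 00000 100000 = 0x01095020
        add  $t2, $t0, $t1

        # I-type: opcode(6) rs(5) rt(5) imm16(16)
        # addi $t3, $t2, 12
        #   001000 01010 01011 0000000000001100 = 0x214b000c
        addi $t3, $t2, 12
```

Both encodings are exactly 32 bits wide. The formats differ only in how the bits after the first two register fields are used.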

Syscall instruction

syscall is one of the simplest and at the same time one of the most important instructions in MIPS assembler. It is a service instruction, so it is treated separately from the rest. syscall is used to ask the operating system to perform actions that the processor cannot perform by itself. Before executing it, you put the service code, a natural number from 1 to 12, into register $v0. Depending on the code, the operating system performs one action or another. Here is the list of service codes supported by MARS and the corresponding operating system actions:

Syscall table

All input and output goes through the MARS console.
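A minimal sketch of syscall in use, assuming the usual MARS service codes 4 (print string, address in $a0), 1 (print integer, value in $a0) and 10 (exit). The label `msg` and the printed value are made up for the example:

```mips
        .data
msg:    .asciiz "The answer is "

        .text
        .globl main
main:
        li   $v0, 4             # service 4: print the string at the address in $a0
        la   $a0, msg
        syscall

        li   $v0, 1             # service 1: print the integer in $a0
        li   $a0, 42
        syscall

        li   $v0, 10            # service 10: exit the program
        syscall
```

The pattern is always the same: put the service code in $v0, put the arguments in $a0 and onward, then execute syscall.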

Arithmetic instructions

Let's consider some basic arithmetic instructions. The following abbreviations are used: rd is the register where the result is written, rs is the first argument, and rt is the second argument. You will also see imm16, a 16-bit integer, and imm5, a 5-bit natural number.

```
 add rd, rs, rt 
```
the sum of rs and rt is written to the rd register. The operation is signed and may cause an overflow exception.
```
 sub rd, rs, rt 
```
rd = rs - rt. This can also cause an overflow exception.
```
 addu rd, rs, rt 
```
the same as add, but an overflow does not raise an exception: the result simply wraps around. For ordinary arithmetic it is preferable to use this instruction.
```
 subu rd, rs, rt 
```
rd = rs - rt, again without an overflow exception, and therefore the recommended choice.
```
 addi rd, rs, imm16 
```
rd = rs + imm16 (a 16-bit integer). Like add, it can cause an overflow exception.
```
 addiu rd, rs, imm16 
```
the same, but without an overflow exception. Use this one.
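The difference between the trapping and non-trapping forms shows up at the edge of the signed 32-bit range. A minimal sketch, assuming MARS, where the final add is expected to stop the program with an arithmetic overflow error:

```mips
        .text
        .globl main
main:
        li   $t0, 0x7FFFFFFF    # largest positive 32-bit signed value
        li   $t1, 1
        addu $t2, $t0, $t1      # no exception: $t2 wraps around to 0x80000000
        add  $t3, $t0, $t1      # signed overflow: raises an exception here
```

The bit pattern computed by the two instructions is the same. Only the reaction to signed overflow differs.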

By the way, imm16 is interpreted as positive by default. For example:

    addiu $s1, $zero, 0xFFFF     # $s1 = 0x0000FFFF

If you need to add a negative value, you must indicate this explicitly:

    addiu $s1, $zero, -0xFFFF    # $s1 = 0xFFFF0001 (two's complement)

Let's look at a real calculation using these instructions. Take, for example, the following code (in C++):
    int f = (g + h) - (i - j);
and translate it into MIPS instructions. First we need to compute the value to the right of the '=' sign, and then assign it to the variable f. Suppose that f lives in register $s0, g in $s1, h in $s2, i in $s3, and j in $s4. This is what we get:

    addu $t0, $s1, $s2    # t0 = s1 + s2 = g + h
    subu $t1, $s3, $s4    # t1 = s3 - s4 = i - j
    subu $s0, $t0, $t1    # s0 = f = t0 - t1 = (g + h) - (i - j)

Now you can test the resulting code in MARS. Download the template and open it in MARS:

    java -jar Mars_4_2.jar

Insert the code in place of the comment. Now you can run it. First select Run -> Assemble:

MARS Assemble operation

Now uncheck Hexadecimal Values to see decimal values in registers and select Run -> Go:

Mars run operation

After the program runs, the value in $s0 should be 12.

Registers after execution
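For readers without the template at hand, here is a self-contained sketch of the same calculation. The input values are hypothetical, chosen only so that the result is 12, and are not taken from the original template:

```mips
        .text
        .globl main
main:
        li   $s1, 10            # g (hypothetical value)
        li   $s2, 5             # h (hypothetical value)
        li   $s3, 7             # i (hypothetical value)
        li   $s4, 4             # j (hypothetical value)

        addu $t0, $s1, $s2      # t0 = g + h = 15
        subu $t1, $s3, $s4      # t1 = i - j = 3
        subu $s0, $t0, $t1      # s0 = f = 15 - 3 = 12

        li   $v0, 10            # MARS service 10: exit
        syscall
```

After Run -> Go, the Registers pane should show 12 in $s0 when decimal display is selected.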

To be continued

In the next article we will look at logical instructions as well as integer multiplication and division, and we will try working with memory and the stack. In the meantime, I suggest you try rewriting the following code in MIPS assembler:

    int a = b + c;
    int d = e + f;
    int g = a + b;
    int h = g + d;

Thanks for your attention!

Source: https://habr.com/ru/post/147685/
