
Writing a simple processor and environment for it

Hello! In this article I will walk through the steps needed to create a simple processor and an environment for it.


Instruction Set Architecture (ISA)


First you need to decide what the processor will look like. Important parameters include:



By instruction size, processor architectures can be divided into two types, RISC and CISC (in fact there are more, but the other options are less popular).



The main difference is that RISC processors have instructions of the same size. Their instructions are simple and execute relatively quickly, whereas CISC processors can have instructions of different sizes, some of which can take quite a long time to execute.


I decided to make a RISC processor, much like MIPS.


I did this for a variety of reasons:



Here are the main characteristics of my processor:



The Register type looks like this:


[Figure: Register type instruction format]


The distinguishing feature of such instructions is that they operate on three registers.


The Immediate type:


[Figure: Immediate type instruction format]


Instructions of this type operate on two registers and a number.


OP is the number of the instruction to be executed (or an indication that this is a Register type instruction).


R0, R1, R2 are the numbers of the registers that serve as operands for the instruction.


Func is an additional field that specifies which Register type instruction is meant.


Imm is the field holding the value that we want to pass to the instruction explicitly as an operand.



A complete list of instructions can be found in the GitHub repository.


Here are just a couple of them:


nor r0, r1, r2 

NOR is a Register type instruction that performs a logical NOR (NOT OR) on registers r1 and r2 and writes the result to register r0.


To use this instruction, set the OP field to 0000 and the Func field to 0000000111 (both in binary).


 lw r0, n(r1) 

LW is an Immediate type instruction that loads the value stored in memory at address r1 + n into register r0.


To use this instruction, in turn, set the OP field to 0111 and write the number n in the Imm field.
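The two encodings above can be sketched in C. The field widths follow from the text (a 4-bit OP, a 10-bit Func, and 6-bit register numbers for 64 registers). The 32-bit instruction width, the order of the fields, and the resulting 16-bit Imm are assumptions, because the format diagrams are not reproduced here; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, most significant bits first (not confirmed by the article):
 *   Register type:  OP(4) | R0(6) | R1(6) | R2(6) | Func(10)
 *   Immediate type: OP(4) | R0(6) | R1(6) | Imm(16)
 */
enum { OP_RTYPE = 0x0, OP_LW = 0x7, FUNC_NOR = 0x007 };

/* Pack a Register type instruction; OP is always 0000 for this type. */
static uint32_t encode_rtype(uint32_t func, uint32_t r0, uint32_t r1, uint32_t r2)
{
    assert(r0 < 64 && r1 < 64 && r2 < 64 && func < 1024);
    return ((uint32_t)OP_RTYPE << 28) | (r0 << 22) | (r1 << 16) | (r2 << 10) | func;
}

/* Pack an Immediate type instruction; only the low 16 bits of imm are kept. */
static uint32_t encode_itype(uint32_t op, uint32_t r0, uint32_t r1, int32_t imm)
{
    assert(op < 16 && r0 < 64 && r1 < 64);
    return (op << 28) | (r0 << 22) | (r1 << 16) | ((uint32_t)imm & 0xFFFFu);
}
```

Under these assumptions, `encode_rtype(FUNC_NOR, 1, 2, 3)` packs a nor on registers 1, 2 and 3, and `encode_itype(OP_LW, 1, 2, 5)` packs a load from offset 5 relative to register 2.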


Writing processor code


Once the ISA is defined, you can begin writing the processor.


To do this, you need to know a hardware description language. Here are some of them:



I chose Verilog because programming in it was part of my university course.


To write a processor, you need to understand the logic of its operation:


  1. Fetch the instruction at the address held in the program counter (PC)
  2. Decode the instruction
  3. Execute the instruction
  4. Add the size of the executed instruction to the program counter

And so on, indefinitely.
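The four steps can be modelled in software before writing any hardware. The following is a minimal C sketch, not the author's Verilog: it assumes 32-bit instructions with the fields packed as OP(4) | R0(6) | R1(6) | R2(6) | Func(10) or OP(4) | R0(6) | R1(6) | Imm(16), a byte-addressed memory read in aligned words, and an unsigned immediate. Only the two instructions described in this article (nor and lw) are handled; anything else is skipped. The names `Cpu` and `cpu_run` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS   64
#define MEM_WORDS  256
#define INSTR_SIZE 4u  /* bytes per instruction (assumed 32-bit encoding) */

typedef struct {
    uint32_t pc;               /* program counter, a byte address */
    uint32_t regs[NUM_REGS];
    uint32_t mem[MEM_WORDS];   /* shared instruction and data memory */
} Cpu;

/* Run `steps` iterations of the fetch / decode / execute / advance cycle. */
static void cpu_run(Cpu *cpu, size_t steps)
{
    for (size_t i = 0; i < steps; i++) {
        /* 1. Fetch the instruction at the address in the PC. */
        uint32_t instr = cpu->mem[(cpu->pc / 4) % MEM_WORDS];

        /* 2. Decode: split the word into its fields. */
        uint32_t op   = instr >> 28;
        uint32_t r0   = (instr >> 22) & 0x3Fu;
        uint32_t r1   = (instr >> 16) & 0x3Fu;
        uint32_t r2   = (instr >> 10) & 0x3Fu;
        uint32_t func = instr & 0x3FFu;
        uint32_t imm  = instr & 0xFFFFu;

        /* 3. Execute. */
        if (op == 0x0 && func == 0x007) {            /* nor r0, r1, r2 */
            cpu->regs[r0] = ~(cpu->regs[r1] | cpu->regs[r2]);
        } else if (op == 0x7) {                      /* lw r0, imm(r1) */
            uint32_t addr = cpu->regs[r1] + imm;
            cpu->regs[r0] = cpu->mem[(addr / 4) % MEM_WORDS];
        }

        /* 4. Add the instruction size to the PC. */
        cpu->pc += INSTR_SIZE;
    }
}
```

A real processor loops forever; the step limit here only exists so that the model can be called from a test.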


It follows that several modules are needed:



Let's look at each module separately.


Register file


The register file provides access to the registers. It is used to read the values of some registers or to change them.


In my case, there are 64 registers. An instruction writes the result of an operation on two registers into a third one, so the register file has to allow changing one register and reading the values of two others.
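As an illustration, a register file with two read ports and one write port can be modelled in C as follows. This is a behavioural sketch rather than the author's Verilog module. The 32-bit register width, the write-enable flag, and the choice of register number 0 as an always-zero register (in the MIPS style) are assumptions, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 64
#define REG_ZERO 0u  /* assumption: register 0 always reads as zero */

typedef struct {
    uint32_t regs[NUM_REGS];
} RegFile;

/* Two read ports: fetch the values of registers ra and rb together. */
static void regfile_read(const RegFile *rf, unsigned ra, unsigned rb,
                         uint32_t *a, uint32_t *b)
{
    assert(ra < NUM_REGS && rb < NUM_REGS);
    *a = rf->regs[ra];
    *b = rf->regs[rb];
}

/* One write port: update register rd only when write_enable is set.
 * Writes to the zero register are silently dropped. */
static void regfile_write(RegFile *rf, int write_enable, unsigned rd, uint32_t value)
{
    assert(rd < NUM_REGS);
    if (write_enable && rd != REG_ZERO)
        rf->regs[rd] = value;
}
```

Dropping writes to the zero register keeps the read path trivial, which mirrors how such a register is typically hard-wired in hardware.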


Decoder


The decoder is the unit responsible for decoding instructions. It indicates which operations the ALU and the other units need to perform.


For example, the addi instruction must add the value of the $zero register (it always stores 0) and 20, and put the result in the $t0 register.


 addi $t0, $zero, 20 

At this stage, the decoder determines that this instruction:



And passes this information to the following blocks.
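What the decoder hands to the following blocks can be pictured as a small set of control signals. The C sketch below is only an illustration: the signal names, the `Decoded` structure, and the opcode value used for addi are hypothetical (the real opcode is in the author's repository), and the bit layout OP(4) | R0(6) | R1(6) | R2(6) | Func(10) or OP(4) | R0(6) | R1(6) | Imm(16) is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* OP_ADDI_HYPOTHETICAL is a placeholder value, not the real opcode. */
enum { OP_RTYPE = 0x0, OP_ADDI_HYPOTHETICAL = 0x1, OP_LW = 0x7, FUNC_NOR = 0x007 };

typedef enum { DO_NOTHING, DO_ADD, DO_NOR } Operation;

typedef struct {
    Operation operation;  /* what the ALU should compute */
    bool use_imm;         /* second ALU operand is Imm instead of R2 */
    bool reg_write;       /* result is written back to R0 */
    bool mem_read;        /* ALU result is used as a load address */
    unsigned r0, r1, r2;
    int32_t imm;          /* sign-extended 16-bit immediate */
} Decoded;

static Decoded decode(uint32_t instr)
{
    Decoded d = { DO_NOTHING, false, false, false, 0, 0, 0, 0 };
    uint32_t raw_imm = instr & 0xFFFFu;

    d.r0 = (instr >> 22) & 0x3Fu;
    d.r1 = (instr >> 16) & 0x3Fu;
    d.r2 = (instr >> 10) & 0x3Fu;
    /* Sign-extend: flip the sign bit, then subtract its weight. */
    d.imm = (int32_t)(raw_imm ^ 0x8000u) - 0x8000;

    switch (instr >> 28) {
    case OP_RTYPE:
        if ((instr & 0x3FFu) == FUNC_NOR) {
            d.operation = DO_NOR;
            d.reg_write = true;
        }
        break;
    case OP_ADDI_HYPOTHETICAL:
        d.operation = DO_ADD;
        d.use_imm = true;
        d.reg_write = true;
        break;
    case OP_LW:
        d.operation = DO_ADD;   /* address = R1 + Imm */
        d.use_imm = true;
        d.mem_read = true;
        d.reg_write = true;
        break;
    default:
        break;                  /* unknown instruction: no signals set */
    }
    return d;
}
```

For an addi, the sketch reports an addition that takes its second operand from Imm and writes the result back to a register.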


ALU


After that, control passes to the ALU. It usually performs all arithmetic and logical operations, as well as comparisons of numbers.


That is, for the same addi instruction, this is the stage at which 0 and 20 are added.
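An ALU of this kind is essentially a selector over a handful of operations. The C sketch below models that idea; the set of operations and the names `AluOp` and `alu` are hypothetical and do not reflect the author's actual instruction list.

```c
#include <stdint.h>

typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_NOR, ALU_SLT } AluOp;

/* Combinational ALU model: the result depends only on the inputs. */
static uint32_t alu(AluOp op, uint32_t a, uint32_t b)
{
    switch (op) {
    case ALU_ADD: return a + b;
    case ALU_SUB: return a - b;
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    case ALU_NOR: return ~(a | b);
    case ALU_SLT: return (int32_t)a < (int32_t)b ? 1u : 0u;  /* signed compare */
    }
    return 0;  /* unreachable for valid AluOp values */
}
```

With this model, the addi example reduces to `alu(ALU_ADD, 0, 20)`.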


Other


In addition to the above blocks, the processor should be able to:



Here and there you can see how it looks in code.


Assembler


Once the processor is written, we need a program that converts textual commands into machine code, so that we don't have to do it by hand. In other words, we need to write an assembler.


I decided to implement it in the C programming language.


Since my processor has a RISC architecture, to make my life easier I designed the assembler so that pseudoinstructions (combinations of several basic instructions or other pseudoinstructions) can easily be added to it.


This can be done with a data structure that stores the instruction's name, its type, its format, and a pointer to a function that returns its machine code.
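One possible shape for such a table in C is shown below. It is a sketch under assumptions, not the author's actual structure: all type and function names are hypothetical, the encoding assumes the layout OP(4) | R0(6) | R1(6) | R2(6) | Func(10), and the `not` pseudoinstruction is invented purely to show how an entry can expand into a basic instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { INSTR_BASIC, INSTR_PSEUDO } InstrType;
typedef enum { FMT_REG3, FMT_REG2 } InstrFormat;   /* expected operand shapes */

typedef struct { unsigned r0, r1, r2; int32_t imm; } Operands;

/* An emitter writes machine words to `out` and returns how many it wrote. */
typedef size_t (*EmitFn)(const Operands *ops, uint32_t *out);

typedef struct {
    const char *name;
    InstrType   type;
    InstrFormat format;
    EmitFn      emit;
} InstrDesc;

/* Basic instruction: nor r0, r1, r2 (OP = 0000, Func = 0000000111). */
static size_t emit_nor(const Operands *o, uint32_t *out)
{
    out[0] = ((uint32_t)o->r0 << 22) | ((uint32_t)o->r1 << 16) |
             ((uint32_t)o->r2 << 10) | 0x007u;
    return 1;
}

/* Hypothetical pseudoinstruction: not r0, r1 expands to nor r0, r1, r1. */
static size_t emit_not(const Operands *o, uint32_t *out)
{
    Operands expanded = { o->r0, o->r1, o->r1, 0 };
    return emit_nor(&expanded, out);
}

static const InstrDesc INSTR_TABLE[] = {
    { "nor", INSTR_BASIC,  FMT_REG3, emit_nor },
    { "not", INSTR_PSEUDO, FMT_REG2, emit_not },
};

/* Look an instruction up by name; returns NULL if it is unknown. */
static const InstrDesc *find_instr(const char *name)
{
    for (size_t i = 0; i < sizeof INSTR_TABLE / sizeof INSTR_TABLE[0]; i++) {
        if (strcmp(INSTR_TABLE[i].name, name) == 0)
            return &INSTR_TABLE[i];
    }
    return NULL;
}
```

Adding a new pseudoinstruction then means writing one emitter function and appending one row to the table.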


A typical program begins with a segment declaration.


Two segments are enough for us: .text, which holds the source code of our programs, and .data, which holds our data and constants.


Instructions may look like this:


 .text
 jie $zero, $zero, $zero
 addi $t1, $zero, 2       # $t1 = $zero + 2
 lw $t1, 5($t2)           # $t1 = *($t2 + 5)
 syscall 0, $zero, $zero  # syscall(0, 0, 0)
 la $t1, label            # $t1 = label

First comes the name of the instruction, then the operands.


Data declarations go in the .data segment.


 .data
 .byte 23                # 1 byte
 .half 1337              # 2 bytes
 .word 69000, 25000      # 4-byte values
 .asciiz "Hello World!"
 .ascii "12312009"
 .space 45               # 45 bytes

A declaration must begin with a dot and the name of the data type, followed by constants or arguments.


It is convenient to parse (scan) the assembler file as follows:


  1. First we scan the segment declaration
  2. If it is a .data segment, we parse data declarations until we meet a .text segment
  3. If it is a .text segment, we parse instructions until we meet a .data segment
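This scanning scheme amounts to a small state machine whose state is the current segment. A minimal C sketch follows; it assumes that each line has already been trimmed of whitespace and comments, ignores labels, and uses hypothetical names throughout.

```c
#include <string.h>

typedef enum { SEG_NONE, SEG_TEXT, SEG_DATA } Segment;
typedef enum { LINE_SEGMENT, LINE_DATA, LINE_INSTRUCTION, LINE_ERROR } LineKind;

/* Classify one trimmed source line and update the current segment. */
static LineKind scan_line(const char *line, Segment *seg)
{
    /* A segment declaration switches the parser state. */
    if (strcmp(line, ".text") == 0) { *seg = SEG_TEXT; return LINE_SEGMENT; }
    if (strcmp(line, ".data") == 0) { *seg = SEG_DATA; return LINE_SEGMENT; }

    switch (*seg) {
    case SEG_DATA:
        /* Data declarations start with a dot, e.g. ".byte 23". */
        return line[0] == '.' ? LINE_DATA : LINE_ERROR;
    case SEG_TEXT:
        return LINE_INSTRUCTION;
    default:
        return LINE_ERROR;  /* content before any segment declaration */
    }
}
```

Feeding the lines of a source file through `scan_line` one by one tells the caller whether to hand each line to the data parser or to the instruction parser.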

For the assembler to work, it has to go through the source file twice. On the first pass, it computes the offsets at which the labels are located. Labels usually look like this:


  la $s4, loop          # s4 = loop
  loop:
  mul $s2, $s2, $s1     # s2 = s2 * s1
  addi $s1, $s1, -1     # s1 = s1 - 1
  jil $s3, $s1, $s4     # if s3 < s1, jump to s4

On the second pass, the output file can be generated.
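The first pass can be sketched in C as a walk over the lines that keeps a running offset and records where each label lands; the second pass then looks labels up while emitting code. This sketch assumes that every instruction occupies exactly 4 bytes (a pseudoinstruction that expands into several instructions would need more) and that lines arrive already trimmed. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LABELS 32
#define INSTR_SIZE 4u  /* assumed size of every instruction, in bytes */

typedef struct { char name[32]; uint32_t offset; } Label;
typedef struct { Label items[MAX_LABELS]; size_t count; } SymbolTable;

/* A label line is a name followed by a colon, e.g. "loop:". */
static int is_label(const char *line)
{
    size_t n = strlen(line);
    return n > 1 && line[n - 1] == ':';
}

/* Pass 1: record the offset of every label; instructions advance the offset. */
static void collect_labels(const char *const *lines, size_t n, SymbolTable *tab)
{
    uint32_t offset = 0;
    tab->count = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_label(lines[i])) {
            size_t len = strlen(lines[i]) - 1;  /* drop the trailing ':' */
            if (tab->count < MAX_LABELS && len < sizeof tab->items[0].name) {
                Label *label = &tab->items[tab->count++];
                memcpy(label->name, lines[i], len);
                label->name[len] = '\0';
                label->offset = offset;
            }
        } else {
            offset += INSTR_SIZE;
        }
    }
}

/* Pass 2 helper: find a label's offset; returns 1 if found, 0 otherwise. */
static int resolve_label(const SymbolTable *tab, const char *name, uint32_t *offset)
{
    for (size_t i = 0; i < tab->count; i++) {
        if (strcmp(tab->items[i].name, name) == 0) {
            *offset = tab->items[i].offset;
            return 1;
        }
    }
    return 0;
}
```

Two passes are needed because an instruction such as `la $s4, loop` may refer to a label that is only defined further down in the file.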


Conclusion


From here, you can run the assembler's output file on the processor and evaluate the result.


The finished assembler can also be used in a C compiler, but that comes later.



Source: https://habr.com/ru/post/430680/

