📜 ⬆️ ⬇️

Creating an emulator arcade machine. Part 3

image

Parts one and two .

8080 processor emulator


Emulator shell


Now you should have all the necessary knowledge to start creating an 8080 emulator.

I will try to make my code as clear as possible, each opcode is implemented separately. When you get comfortable with it, you may want to rewrite it to optimize performance or reuse code.
')
To begin with, I will create a memory structure that will contain fields for everything that I thought was necessary when writing a disassembler. Also there will be a place for a memory buffer, which will be a RAM.

typedef struct ConditionCodes { uint8_t z:1; uint8_t s:1; uint8_t p:1; uint8_t cy:1; uint8_t ac:1; uint8_t pad:3; } ConditionCodes; typedef struct State8080 { uint8_t a; uint8_t b; uint8_t c; uint8_t d; uint8_t e; uint8_t h; uint8_t l; uint16_t sp; uint16_t pc; uint8_t *memory; struct ConditionCodes cc; uint8_t int_enable; } State8080; 

Now we will create a procedure with an error call that will terminate the program with an error. It will look something like this:

  void UnimplementedInstruction(State8080* state) { // pc    ,     printf ("Error: Unimplemented instruction\n"); exit(1); } int Emulate8080Op(State8080* state) { unsigned char *opcode = &state->memory[state->pc]; switch(*opcode) { case 0x00: UnimplementedInstruction(state); break; case 0x01: UnimplementedInstruction(state); break; case 0x02: UnimplementedInstruction(state); break; case 0x03: UnimplementedInstruction(state); break; case 0x04: UnimplementedInstruction(state); break; /*....*/ case 0xfe: UnimplementedInstruction(state); break; case 0xff: UnimplementedInstruction(state); break; } state->pc+=1; //  } 

Let's implement several opcodes.

  void Emulate8080Op(State8080* state) { unsigned char *opcode = &state->memory[state->pc]; switch(*opcode) { case 0x00: break; //NOP -  ! case 0x01: //LXI B, state->c = opcode[1]; state->b = opcode[2]; state->pc += 2; //   2  break; /*....*/ case 0x41: state->b = state->c; break; //MOV B,C case 0x42: state->b = state->d; break; //MOV B,D case 0x43: state->b = state->e; break; //MOV B,E } state->pc+=1; } 

Like this. For each opcode, we change state and memory, as the command running on the present 8080 would do.

The 8080 has about 7 types, depending on how they are classified:


Let's look at each one separately.

Arithmetic group


Arithmetic commands are many of the 256 opcodes of the 8080 processor, which include various types of summation and subtraction. Most arithmetic instructions work with register A and store the result in A. (Register A is also called a accumulator).

It is interesting to note that these commands affect the condition codes. State codes (also called flags) are set depending on the result of the executed command. Not all commands affect flags, and not all commands that affect flags affect all flags at once.

Flags 8080


In the 8080 processor, the flags are called Z, S, P, CY, and AC.


State codes are used in conditional branching commands, for example, JZ performs branching only if the Z flag is set.

Most instructions have three forms: for registers, for immediate values ​​and for memory. Let's implement a few instructions to understand their forms and see what working with state codes is like. (Note that I do not implement the auxiliary carry flag, because it is not used. If I implemented it, I would not be able to test it.)

Register Form


Here is an example of the implementation of two instructions with a form for the register; in the first one, I deployed the code to make it clearer, while the second presents a more compact view that does the same.

  case 0x80: //ADD B { //      , //      uint16_t answer = (uint16_t) state->a + (uint16_t) state->b; //  :    , //    , //      if ((answer & 0xff) == 0) state->cc.z = 1; else state->cc.z = 0; //  :   7 , //    , //      if (answer & 0x80) state->cc.s = 1; else state->cc.s = 0; //   if (answer > 0xff) state->cc.cy = 1; else state->cc.cy = 0; //    state->cc.p = Parity( answer & 0xff); state->a = answer & 0xff; } //  ADD     case 0x81: //ADD C { uint16_t answer = (uint16_t) state->a + (uint16_t) state->c; state->cc.z = ((answer & 0xff) == 0); state->cc.s = ((answer & 0x80) != 0); state->cc.cy = (answer > 0xff); state->cc.p = Parity(answer&0xff); state->a = answer & 0xff; } 

I emulate 8-bit math commands with a 16-bit number. This makes it easier to keep track of when calculations generate a transfer.

Form for immediate values


The form for immediate values ​​is almost the same, except that the source of the added is a byte after the command. Since the "opcode" is a pointer to the current instruction in memory, opcode [1] will be the next byte directly.

  case 0xC6: //ADI  { uint16_t answer = (uint16_t) state->a + (uint16_t) opcode[1]; state->cc.z = ((answer & 0xff) == 0); state->cc.s = ((answer & 0x80) != 0); state->cc.cy = (answer > 0xff); state->cc.p = Parity(answer&0xff); state->a = answer & 0xff; } 

Memory form


In the form for the memory will be added bytes, which indicates the address stored in a pair of registers HL.

  case 0x86: //ADD M { uint16_t offset = (state->h<<8) | (state->l); uint16_t answer = (uint16_t) state->a + state->memory[offset]; state->cc.z = ((answer & 0xff) == 0); state->cc.s = ((answer & 0x80) != 0); state->cc.cy = (answer > 0xff); state->cc.p = Parity(answer&0xff); state->a = answer & 0xff; } 

Notes


The remaining arithmetic commands are implemented in a similar way. Additions:


Branch group


After you understand the state codes, the branch group will be clear enough for you. There are two types of branching - transitions (JMP) and calls (CALL). JMP simply sets the PC to the destination jump address. CALL is used for routines, it writes the return address to the stack, and then assigns the PC a target address. RET returns from CALL, getting the address from the stack and writing it to the PC.

Both JMP and CALL navigate only to absolute addresses that are encoded in bytes after the opcode.

Jmp


The JMP command performs an unconditional branch to the target address. There are also conditional jump instructions for all state codes (except for AC):


Here is the implementation of some of them:

  case 0xc2: //JNZ  if (0 == state->cc.z) state->pc = (opcode[2] << 8) | opcode[1]; else //    state->pc += 2; break; case 0xc3: //JMP  state->pc = (opcode[2] << 8) | opcode[1]; break; 

CALL and RET


CALL writes to the stack the address of the instruction after the call, and then goes to the target address. RET gets the address from the stack and saves it to the PC. There are conditional versions of CALL and RET for all states.


  case 0xcd: //CALL  { uint16_t ret = state->pc+2; state->memory[state->sp-1] = (ret >> 8) & 0xff; state->memory[state->sp-2] = (ret & 0xff); state->sp = state->sp - 2; state->pc = (opcode[2] << 8) | opcode[1]; } break; case 0xc9: //RET state->pc = state->memory[state->sp] | (state->memory[state->sp+1] << 8); state->sp += 2; break; 

Notes



Logical group


This group performs logical operations (see first tutorial post ). By their nature, they are similar to an arithmetic group in that most operations work with register A (accumulator), and most operations affect flags. All operations are performed on 8-bit values, in this group there are no commands that affect pairs of registers.

Boolean operations


AND, OR, NOT (CMP) and “exclusive or” (XOR) are called boolean operations. OR and AND I explained earlier. The NOT command (for the 8080 processor, it is called CMA, or complement accumulator) simply changes the values ​​of the bits — all units become zeros and zeros become ones.

I see XOR as a “difference recognizer.” Its truth table looks like this:

xyResult
000
0oneone
one0one
oneone0

AND, OR, and XOR have a form for registers, memory, and immediate values. (CMP has only a case-sensitive command). Here is an implementation of a pair of opcodes:

  case 0x2F: //CMA (not) state->a = ~state->a //  ,  CMA     break; case 0xe6: //ANI  { uint8_t x = state->a & opcode[1]; state->cc.z = (x == 0); state->cc.s = (0x80 == (x & 0x80)); state->cc.p = parity(x, 8); state->cc.cy = 0; //  ,  ANI  CY state->a = x; state->pc++; //   } break; 

Cyclic shift commands


These commands change the order of the bits in the registers. A shift to the right moves them to the right by one bit, and a shift to the left — to the left by one bits:

(0b00010000) = 0b00001000

(0b00000001) = 0b00000010

It seems that they are useless, but in reality it is not. They can be used to multiply and divide by powers of two. Take for example the shift to the left. 0b00000001 is decimal 1, and shifting it to the left makes it 0b00000010 , that is, decimal 2. If we perform another shift to the left, we will get 0b00000100 , that is, 4. Another shift to the left, and we multiplied by 8. It will work with any in numbers: 5 ( 0b00000101 ) when shifting to the left gives 10 ( 0b00001010 ). Another left shift gives 20 ( 0b00010100 ). Shift to the right does the same, but for division.

The 8080 does not have a multiplication command, but it can be implemented using these commands. If you understand how to do this, you will receive bonus points. Once such a question I was asked at the interview. (I managed, although it took me a few minutes.)

These commands perform a cyclic shift of the drive and affect only the carry flag. Here are a couple of commands:

  case 0x0f: //RRC { uint8_t x = state->a; state->a = ((x & 1) << 7) | (x >> 1); state->cc.cy = (1 == (x&1)); } break; case 0x1f: //RAR { uint8_t x = state->a; state->a = (state->cc.cy << 7) | (x >> 1); state->cc.cy = (1 == (x&1)); } break; 

Comparison


The task of the CMP and CPI is only to set the flags (for branching). They do this by subtracting, setting flags, but not saving the result.


There are versions of these commands for registers, memory, and immediate values. The implementation consists in a simple subtraction without saving the result:

  case 0xfe: //CPI  { uint8_t x = state->a - opcode[1]; state->cc.z = (x == 0); state->cc.s = (0x80 == (x & 0x80)); //  ,    p -   state->cc.p = parity(x, 8); state->cc.cy = (state->a < opcode[1]); state->pc++; } break; 

CMC and STC


They complete the logical group. They are used to set and reset the carry flag.

I / O group and special commands


These commands cannot be assigned to any other category. I will mention them for completeness, but as it seems to me, we will have to return to them again when we start to emulate the “hardware» of Space Invaders.


Another application of NOP is code modification. Suppose we need to change the ROM code of the game. We can’t just delete unnecessary opcodes, because we don’t want to change all the CALL and JMP commands (they will be wrong if at least one part of the code moves). With the help of NOP, we can get rid of the code. Adding code is much more difficult! It can be added by finding somewhere in the ROM space and changing the command to JMP.

Stack group


We have already completed the mechanics for most of the teams in the stack group. If you did the work with me, then these commands will be easy to implement.

PUSH and POP


PUSH and POP work only with pairs of registers. PUSH writes a pair of registers to the stack, and POP takes 2 bytes from the top of the stack and writes them into a pair of registers.

There are four opcodes for PUSH and POP, one for each pair: BC, DE, HL and PSW. PSW is a special pair of registers of drive flags and status codes. Here is my implementation of PUSH and POP for BC and PSW. There are no comments in it - I do not think that there is anything particularly clever.

  case 0xc1: //POP B { state->c = state->memory[state->sp]; state->b = state->memory[state->sp+1]; state->sp += 2; } break; case 0xc5: //PUSH B { state->memory[state->sp-1] = state->b; state->memory[state->sp-2] = state->c; state->sp = state->sp - 2; } break; case 0xf1: //POP PSW { state->a = state->memory[state->sp+1]; uint8_t psw = state->memory[state->sp]; state->cc.z = (0x01 == (psw & 0x01)); state->cc.s = (0x02 == (psw & 0x02)); state->cc.p = (0x04 == (psw & 0x04)); state->cc.cy = (0x05 == (psw & 0x08)); state->cc.ac = (0x10 == (psw & 0x10)); state->sp += 2; } break; case 0xf5: //PUSH PSW { state->memory[state->sp-1] = state->a; uint8_t psw = (state->cc.z | state->cc.s << 1 | state->cc.p << 2 | state->cc.cy << 3 | state->cc.ac << 4 ); state->memory[state->sp-2] = psw; state->sp = state->sp - 2; } break; 

SPHL and XTHL


In the stack group there are two more teams - SPHL and XTHL.


A little bit more about binary numbers


When writing a computer program, one of the solutions you need to make is the choice of the type of data used for the numbers - whether you want them to be negative and what their maximum size should be. For a CPU emulator, we need the data type to match the data type of the target CPU.

Signed and unsigned


When we started talking about hex numbers, we considered them unsigned — that is, each binary digit of a hexadecimal number had a positive value, and each was considered a power of two (one, two, four, etc.).

We dealt with the storage of computer negative numbers. If you know that the data in question has a sign, that is, they can be negative, then you can recognize a negative number by the most significant bit of the number (most significant bit, MSB). If the data size is one byte, then each number with a specified value of the MSB bit is negative, and each with a zero MSB is positive.

The value of a negative number is stored as an additional code. If we have a sign number, and MSB is one, and we want to know what the number is, then we can convert it as follows: execute a binary "NOT" for hex numbers, and then add one.

For example, for a hex number of 0x80, the MSB bit is set, that is, it is negative. The binary NOT of the number 0x80 is 0x7f, or decimal 127. 127 + 1 = 128. That is, 0x80 in decimal form is -128. The second example: 0xC5. Not (0xC5) = 0x3A = decimal 58 +1 = decimal 59. That is, 0xC5 is decimal -59.

Surprisingly in numbers with additional code, we can perform calculations with them as with unsigned numbers, and they will still work . The computer does not need to do anything special with signs. I will show a couple of examples proving this.

  Example 1

      decimal hex binary    
       -3 0xFD 1111 1101    
    + 10 0x0A +0000 1010    
    ----- -----------    
        7 0x07 1 0000 0111    
                        ^ This is entered in the carry bit.

    Example 2    

      decimal hex binary    
      -59 0xC5 1100 0101    
    + 33 0x21 +0010 0001    
    ----- -----------    
      -26 0xE6 1110 0110 


In Example 1, we see that when adding 10 and -3 we get 7. The transfer of the result of addition was performed, therefore the C flag can be set. In Example 2, the result of addition was negative, therefore we decode it: Not (0xE6) = 0x19 = 25 + 1 = 26. 0xE6 = -26 Brain explosion!

If you want, read more about the additional code in Wikipedia .

Data types


In the C language, there is a relationship between data types and the number of bytes used for this type. In fact, we are only interested in integers. The standard / old school C data types are char, int and long, as well as their friends unsigned char, unsigned int and unsigned long. The problem is that on different platforms and in different compilers, these types can have different sizes.

Therefore, it would be best to choose a data type for our platform that explicitly declares the size of the data. If your platform has stdint.h, you can use int8_t, uint8_t, etc.

The size of an integer determines the maximum number that can be stored in it. In the case of unsigned integers in 8 bits, you can store numbers from 0 to 255. If converted to hex, it is from 0x00 to 0xFF. Since 0xFF "all bits are set," and it corresponds to the decimal 255, it is absolutely logical that the interval of one-byte unsigned integer is 0-255. Intervals tell us that all the sizes of integers will work exactly the same - the numbers correspond to the number that is obtained by setting all the bits.

Type ofIntervalHex
8-bit unsigned0-2550x0-0xFF
8-bit signed-128-1270x80-0x7F
16-bit unsigned0-655350x0-0xFFFF
16-bit signed-32768-327670x8000-0x7FFF
32-bit unsigned0-42949672950x0-0xFFFFFFFF
32-bit signed-2147483648-21474836470x80000000-0x7FFFFFFF

More interestingly, the -1 in each character data type is a number that has all the bits set (0xFF for the sign byte, 0xFFFF for the sign 16-bit number, and 0xFFFFFFFF for the sign 32-bit number). If the data is considered unsigned, then for all given bits the maximum possible number for this data type is obtained.

To emulate the registers of the processor, we select the data type corresponding to the size of this register. Probably the default is to select unsigned types and convert them when you need to consider them signed. For example, we use the uint8_t data type to represent an 8-bit register.

Hint: use a debugger to convert data types


If gdb is installed on your platform, it is very convenient to use it for working with binary numbers. Below I will show an example — in the session shown below, the lines beginning with # are comments added by me later.

# /c, gdb
(gdb) print /c 0xFD
$1 = -3 '?'

# /x, gdb hex
# "p" "print"
(gdb) p /c 0xA
$2 = 10 '\n'

# 2 " "
(gdb) p /c 0xC5
$3 = -59 '?'
(gdb) p /c 0xC5+0x21
$4 = -26 '?'

# print , gdb
(gdb) p 0x21
$9 = 33

# , gdb,
# ,
(gdb) p 0xc5
$5 = 197 #
(gdb) p /c 0xc5
$3 = -59 '?' #
(gdb) p 0xfd
$6 = 253

# ( 32- )
(gdb) p /x -3
$7 = 0xfffffffd

# 1
(gdb) print (char) 0xff
$1 = -1 '?'
# 1
(gdb) print (unsigned char) 0xff
$2 = 255 '?'


When I work with hex numbers, I always do it in gdb - and this happens almost every day. So much easier than opening a programmer's calculator with a GUI. On Linux (and Mac OS X) machines, to start a gdb session, simply open the terminal and enter “gdb”. If you are using Xcode on OS X, then after running the program, you can use the console inside Xcode (the one to which the printf output is output). On Windows, the gdb debugger is available from Cygwin.

Cpu emulator completion


Having received all this information, you are ready for a long journey. You have to decide for yourself how to implement the emulator — either create a full 8080 emulation, or implement only the commands necessary to complete the game.

If you decide to do a full emulation, then you will need a few more tools. I will talk about them in the next section.

Another way is to emulate only the instructions used by the game. We will continue to fill that huge construction of switch, which was created in the "Emulator Shell" section. We will repeat the following process until we have not a single unrealized team:

  1. We start the emulator from ROM Space Invaders
  2. The call UnimplementedInstruction()is exiting if the command is not ready.
  3. We emulate this instruction
  4. Goto 1

, , — . , :

  int Emulate8080Op(State8080* state) { unsigned char *opcode = &state->memory[state->pc]; Disassemble8080Op(state->memory, state->pc); switch (*opcode) { case 0x00: //NOP /* ... */ } /*    */ printf("\tC=%d,P=%d,S=%d,Z=%d\n", state->cc.cy, state->cc.p, state->cc.s, state->cc.z); printf("\tA $%02x B $%02x C $%02x D $%02x E $%02x H $%02x L $%02x SP %04x\n", state->a, state->b, state->c, state->d, state->e, state->h, state->l, state->sp); } 

.

: , 50 , 8080. , :

Team
0x00NOP
0x01LXI B,D16
0x05DCR B
0x06MVI B,D8
0x09DAD B
0x0dDCR C
0x0eMVI C,D8
0x0fRRC
0x11LXI D,D16
0x13INX D
0x19DAD D
0x1aLDAX D
0x21LXI H,D16
0x23INX H
0x26MVI H,D8
0x29DAD H
0x31LXI SP,D16
0x32STA adr
0x36MVI M,D8
0x3aLDA adr
0x3eMVI A,D8
0x56MOV D,M
0x5eMOV E,M
0x66MOV H,M
0x6fMOV L,A
0x77MOV M,A
0x7aMOV A,D
0x7bMOV A,E
0x7cMOV A, H
0x7eMOV A, M
0xa7ANA A
0xafXRA A
0xc1Pop b
0xc2Jnz adr
0xc3Jmp adr
0xc5Push b
0xc6ADI D8
0xc9RET
0xcdCall adr
0xd1Pop d
0xd3OUT D8
0xd5Push d
0xe1Pop h
0xe5Push h
0xe6ANI D8
0xebXchg
0xf1POP PSW
0xf5Push psw
0xfbEI
0xfeCPI D8

These are only 50 instructions, and 10 of them are movements that are implemented trivially.

Debugging


. , . , ( , ), , .

— , . , , . , . . , , , :

  1. For the next team
  2. Call your emulator with your status.
  3. Call my with my condition
  4. Compare our two states
  5. We look for errors in any differences.
  6. goto 3

Another way is to use this site manually . This is an emulator processor 8080 for Javascript, which even has a built-in ROM Space Invaders. Here is the process:

  1. Restart Space Invaders emulation by clicking the "Space Invaders" button
  2. Press the "Run 1" button to execute the command.
  3. We execute the following command in our emulator
  4. Compare the status of the processor with your
  5. If the states match, goto 2
  6. If the states do not match, then your instruction emulation is wrong. Repair it, and then start over from step 1.

I used this method at the beginning to debug my 8080 emulator. I will not lie - the process can be a long one. Many of my problems resulted in typos and copy-pasting errors, which after detection were very easy to fix.

If you step through your code step by step, then most of the first 30 thousand commands are executed in a loop around $ 1a5f. If you look at the emulator in javascript , you can see that this code copies the data to the screen. I am sure that this code is called frequently.

After the first screen drawing, after 50 thousand commands, the program gets stuck in this infinite loop:

  0ada LDA $20c0 0add ANA A 0ade JNZ $0ada 

He waits until the value in memory at $ 20c0 changes to zero. Since the code in this loop definitely does not change $ 20c0, it must be a signal from somewhere else. It's time to talk about the emulation of the "iron" arcade machine.

Before we move on to the next section, make sure your CPU emulator falls into this infinite loop.

For help, see my sources .

8080 full emulation


, : , . . , . , .

, 8080 . 8080 cpudiag.asm, 8080.

:

  1. , . , cpudiag.asm .
  2. , . , , .

. .


, . cpudiag.asm , . , , , «Make Beautiful Code» , test.bin, 8080. , .

cpudiag.asm .

cpudiag.bin ( 8080) .


invaders.* .

. -, ORG 00100H , , , 0x100 hex. 8080, , . , , , 0x100.

-, , . hex- JMP $0100 , . ( PC 0x100.)

-, . , STACK EQU TEMPP+256 , . , $6ad, PUSH . , 0x100, , , «0x7» , .

, DAA , , ( JMP).

  ReadFileIntoMemoryAt(state, "/Users/kpmiller/Desktop/invaders/cpudiag.bin", 0x100); //  ,   JMP 0x100 state->memory[0]=0xc3; state->memory[1]=0; state->memory[2]=0x01; //Fix the stack pointer from 0x6ad to 0x7ad // this 0x06 byte 112 in the code, which is // byte 112 + 0x100 = 368 in memory state->memory[368] = 0x7; //  DAA state->memory[0x59c] = 0xc3; //JMP state->memory[0x59d] = 0xc2; state->memory[0x59e] = 0x05; 


, CP/M. , CP/M $0005, , CALL, . , , , . CALL :

  case 0xcd: //CALL  #ifdef FOR_CPUDIAG if (5 == ((opcode[2] << 8) | opcode[1])) { if (state->c == 9) { uint16_t offset = (state->d<<8) | (state->e); char *str = &state->memory[offset+3]; // - while (*str != '$') printf("%c", *str++); printf("\n"); } else if (state->c == 2) { //    ,   ,    printf ("print char routine called\n"); } } else if (0 == ((opcode[2] << 8) | opcode[1])) { exit(0); } else #endif { uint16_t ret = state->pc+2; state->memory[state->sp-1] = (ret >> 8) & 0xff; state->memory[state->sp-2] = (ret & 0xff); state->sp = state->sp - 2; state->pc = (opcode[2] << 8) | opcode[1]; } break; 

With this test, I found several problems in my emulator. I'm not sure which of them would be involved in the game, but if they were, then it would be very difficult to find them.

I went ahead and implemented all opcodes (except for DAA and his friends). Solving problems in my calls and implementing new ones took me 3-4 hours. It was definitely faster than the manual process I described above — before I found this test, I spent more than 4 hours on the manual process. If you can understand this explanation, I recommend using this method instead of comparing manually. However, knowledge of the manual process is also an excellent skill, and if you want to emulate a different processor, you should return to it.

, . , .

Source: https://habr.com/ru/post/418699/


All Articles