Programming the Megaprocessor

This summer Geektimes ran an article about the Megaprocessor, a processor built from discrete transistors and LEDs that weighs half a ton and fills the entire living room of an ordinary townhouse near Cambridge. I decided to take advantage of my geographical proximity to this megaproject and program something presentable for it: for example, port my earlier Digital Rain program to the Megaprocessor.

The Megaprocessor instruction set is described on the developer's site.

Most instructions consist of a single byte, which may be followed by an immediate operand (one or two bytes). There are only four general-purpose registers (R0-R3), and they are not interchangeable: for example, for memory access instructions the address must be in either R2 or R3, and the operand in one of the two remaining registers.

Programmers used to the x86 or ARM instruction sets will find the Megaprocessor's extremely sparse: there is no "base + offset" indirect addressing and no immediate operands for arithmetic instructions (except addq ±1 and addq ±2). But there are a couple of unexpected features: a dedicated sqrt instruction, and the .wt mode for shift instructions, which replaces the result with the sum of the bits shifted out. So the instruction pair ld.b r1, #15; lsr.wt r0, r1 can, for example, count the set bits in r0 (a question so beloved by job interviewers!). The ld mnemonic for the instruction that loads an immediate value into a register (instead of the mov familiar from x86 or ARM) hints at how it is executed: from the processor's point of view, what actually runs is ld.b r1, (pc++).

So let's get started.

A Megaprocessor program starts (at address 0) with the interrupt vector table. Each of the four vectors occupies four bytes, so the actual program code can start at address 0x10. Of the 64 KB address space, the entire first half (up to address 0x8000) can be used for code; addresses 0xA000-0xA0FF correspond to the “display”: discrete memory in which every bit has its own LED indicator. marks was mistaken in writing “The memory capacity is 256 bytes”: that is the size of the “video memory”, not of the main memory for code and data.
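The layout just described can be checked with a few lines of Python. This is only an illustrative sketch: the constant and function names are made up for the example, the vector names follow the labels used in the listings in this article, and the 32 x 64 display geometry is taken from the later description of columns and drop coordinates.

```python
# Illustrative model of the memory layout described above.
# All names here are hypothetical; only the numbers come from the article.
VECTOR_SIZE = 4
VECTORS = ["reset", "ext_int", "div_zero", "illegal"]
CODE_START = len(VECTORS) * VECTOR_SIZE  # first byte after the vector table
DISPLAY_BASE = 0xA000
DISPLAY_END = 0xA0FF


def vector_address(name):
    """Address of the named interrupt vector."""
    return VECTORS.index(name) * VECTOR_SIZE


def is_display(addr):
    """True if the address falls into the LED-backed video memory."""
    return DISPLAY_BASE <= addr <= DISPLAY_END


# 256 bytes of video memory hold one bit per LED.
display_bits = (DISPLAY_END - DISPLAY_BASE + 1) * 8

print(hex(CODE_START), display_bits)
```

The 2048 bits match a display of 32 columns by 64 rows, which is the geometry the rest of the program relies on.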

Of the four interrupt vectors, our program uses only the reset vector; each of the remaining vectors holds a stub consisting of a single reti instruction. (x86 programmers know the return-from-interrupt instruction under the mnemonic iret.) None of these interrupts can occur in our program anyway, so the stubs could even have been omitted.

reset:
    jmp start;
    nop;
ext_int:
    reti;
    nop;
    nop;
    nop;
div_zero:
    reti;
    nop;
    nop;
    nop;
illegal:
    reti;
    nop;
    nop;
    nop;

First of all, after startup we need to initialize the stack and the variables. Let the stack grow downwards from address 0x2000; that is more than enough for us. We need only two variables: seed, holding the current RNG value, and the position array of 32 values, one per “display column”, which tracks where the “drop” in that column has crawled to. We simply initialize the array with 32 random bytes. The jsr instruction (a subroutine call) corresponds to call on x86 or bl on ARM.

start:
    ld.w r0, #0x2000;
    move sp, r0;
    // set random positions
    ld.b r1, #32;
init_loop:
    jsr rand; // returns random value in r0
    ld.b r2, #position;
    add r2, r1;
    st.b (r2), r0;
    addq r1, #-1;
    bne init_loop;

Since a byte cannot be stored at the address (#position + r1) with a single instruction, the address has to be computed first with a separate add instruction.

The main part of the program is an infinite loop in which we walk from right to left over the “display columns” and move the “drop” in each one position down. The two low bits of a “drop” encode its color (3 means “lit”; 0, 1 or 2 mean “not lit”) and the remaining six bits its coordinate (0..63), so “moving down” means adding 4. As soon as a “drop” crawls off the bottom of the “display” (its value exceeds 255), we replace it with a new random byte.

busy_loop:
    ld.b r1, #32;
next_col:
    ld.b r2, #position;
    add r2, r1;
    ld.b r0, (r2);
    addq r0, #2;
    addq r0, #2;
    btst r0, #8;
    beq save;
    jsr rand;
save:
    st.b (r2), r0;
    addq r1, #-1;
    bmi busy_loop;

Adding 4 cannot be done with a single instruction, so we repeat addq r0, #2 twice and then test bit 8 of the result to determine whether it has exceeded 255. If it has, we store a new random value in the position array; otherwise we store the old value incremented by 4. The conditional branch bmi jumps to the start of the busy_loop loop if the result of the last operation is negative, i.e. after column zero has been processed.

How shall we generate random numbers? RANDU, which I used in the 32-bit “Digital Rain”, is no longer suitable: the Megaprocessor can only multiply 16-bit numbers, so from the list of simple RNGs we pick one whose multiplier fits in 16 bits. I liked the RNG labeled “Turbo Pascal”.
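The encoding of a drop and the "add 4, then test bit 8" step can be modelled in Python as a rough sketch. The function names are hypothetical, and the model assumes the stored byte is treated as an unsigned value (0..255) when it is loaded into a register.

```python
def decode_drop(value):
    """Split a drop byte into (row, lit): row in 0..63, lit only for color 3."""
    row = value >> 2
    color = value & 3
    return row, color == 3


def step_drop(value, new_random_byte):
    """Move a drop one row down; replace it once it falls off the display.

    new_random_byte is a callable standing in for the rand subroutine
    (an assumption of this sketch, not the article's code).
    """
    value += 4                   # two addq r0, #2 in the assembly
    if value & 0x100:            # same test as btst r0, #8
        value = new_random_byte()
    return value & 0xFF          # st.b stores only the low byte


print(decode_drop(step_drop(0b00000111, lambda: 0x55)))
```

Because the color lives in the two low bits, adding 4 never changes it: a drop keeps its color for its whole trip down the column.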

rand:
    ld.w r0, seed;
    ld.w r1, #33797;
    mulu;
    addq r2, #1;
    st.w seed, r2;
    move r0, r2;
    ret;
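For readers who want to check the arithmetic, the recurrence computed by this subroutine can be sketched in Python. The sketch assumes, as the listing suggests, that mulu leaves the low 16 bits of the product r0 * r1 in r2; the function names are illustrative.

```python
MULTIPLIER = 33797  # the 16-bit factor from the listing


def next_seed(seed):
    """One step of the generator: seed * 33797 + 1, truncated to 16 bits."""
    return (seed * MULTIPLIER + 1) & 0xFFFF


def rand_sequence(seed, count):
    """Return the first `count` values produced from the given seed."""
    values = []
    for _ in range(count):
        seed = next_seed(seed)
        values.append(seed)
    return values


print(rand_sequence(1, 2))  # [33798, 44063]
```

The program only ever stores the low byte of each value into the position array, so a drop receives both a random row and a random color.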

This simple and pleasant RNG returns the generated value in r0 but, unfortunately, clobbers all the other registers. Note that at both call sites of rand we have the index of the “display column” in r1, which has to be saved and restored, and that afterwards r2 must hold the address (#position + r1). So the calculation of that address can be moved into rand:

rand:
    push r1; // !
    ld.w r0, seed;
    ld.w r1, #33797;
    mulu;
    addq r2, #1;
    st.w seed, r2;
    pop r1; // !
    move r0, r2;
    ld.b r2, #position; // !
    add r2, r1; // !
    ret;

start:
    ld.w r0, #0x2000;
    move sp, r0;
    // set random positions
    ld.b r1, #32;
init_loop:
    jsr rand;
    st.b (r2), r0;
    addq r1, #-1;
    bne init_loop;

busy_loop:
    ld.b r1, #32;
next_col:
    ld.b r2, #position;
    add r2, r1;
    ld.b r0, (r2);
    addq r0, #2;
    addq r0, #2;
    btst r0, #8;
    beq save;
    jsr rand;
save:
    st.b (r2), r0;
    addq r1, #-1;
    bmi busy_loop;

The last trick is that the calculation ld.b r2, #position; add r2, r1; at the start of the next_col loop can be replaced by a jump into the middle of the rand subroutine:

rand:
    push r1;
    ld.w r0, seed;
    ld.w r1, #33797;
    mulu;
    addq r2, #1;
    st.w seed, r2;
    pop r1;
    move r0, r2;
add_position:
    ld.b r2, #position;
    add r2, r1;
    ret;

start:
    <...>

busy_loop:
    ld.b r1, #32;
next_col:
    jsr add_position; // !
    ld.b r0, (r2);
    addq r0, #2;
    addq r0, #2;
    btst r0, #8;
    beq save;
    jsr rand;
save:
    st.b (r2), r0;
    addq r1, #-1;
    bmi busy_loop;

Now for the most interesting part: the second half of the next_col loop, which draws the “drop” on the display.

    move r3, r1; // x (0..1f)
    lsr r3, #3; // byte addr in row (0..3)
    ld.b r2, #0xfc; // y mask
    and r2, r0; // y * 4 (0..fc)
    add r3, r2; // byte addr in screen
    ld.w r2, #0xa000;
    add r3, r2; // byte addr in memory
    ld.b r2, #2;
    lsr.wt r0, r2;
    ld.b r2, #7;
    and r2, r1; // bit index in byte (0..7)
    lsl r2, #1;
    lsr r0, #2;
    roxr r2, #1;
    ld.b r0, (r3);
    // and now apply
    test r2;
    bpl blank;
    bset r0, r2;
    jmp apply;
blank:
    bclr r0, r2;
apply:
    st.b (r3), r0;
    jmp next_col;

To “light” or “extinguish” the right bit, the first step is to compute the address of the corresponding byte of “video memory”. Since the column number is held in r1 and the position and “color” of the drop in r0, the byte address is (r1 >> 3) + (r0 & 0xfc) + 0xa000. Next, the instructions ld.b r2, #2; lsr.wt r0, r2; determine the color of the drop: if both low bits of r0 were set, these instructions leave the value 2 in r0; otherwise they leave 0 or 1. Then we put the index of the required “video memory” bit into the three low bits of r2 and “push” the color of the drop into the high bit of r2 with the sequence lsl r2, #1; lsr r0, #2; roxr r2, #1; where the second instruction shifts the color bit out of r0 into the CF flag, and the last one (rotate right through CF) pushes that bit into r2. When there are not enough registers for all the values you need, you have to get crafty! Finally, we fetch the byte from “video memory” at the computed address and, depending on the color bit, either set or clear the required bit in it. The bset and bclr instructions use only the low bits of their second operand, so the color bit in the high bit of r2 does not bother them. We check this high bit with the sequence test r2; bpl blank; where the conditional branch bpl jumps if the result of the last operation is positive, i.e. the color bit is clear.
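The same steps can be followed in a small Python model. It is a sketch under stated assumptions rather than an emulator: low_bits_weight stands in for lsr.wt as this article describes it (the number of set bits among those shifted out), the video memory is a plain 256-byte bytearray, and all function names are invented for the example.

```python
DISPLAY_BASE = 0xA000


def pixel_location(column, drop):
    """Byte address and bit index of the LED for this column and drop."""
    byte_addr = DISPLAY_BASE + (column >> 3) + (drop & 0xFC)
    bit_index = column & 7
    return byte_addr, bit_index


def low_bits_weight(value, count):
    """Assumed model of lsr.wt: count the set bits among the low `count` bits."""
    return bin(value & ((1 << count) - 1)).count("1")


def draw(video, column, drop):
    """Light or extinguish one LED in `video`; return True if it was lit."""
    addr, bit = pixel_location(column, drop)
    lit = low_bits_weight(drop, 2) == 2      # both color bits set
    offset = addr - DISPLAY_BASE
    if lit:
        video[offset] |= 1 << bit            # bset
    else:
        video[offset] &= ~(1 << bit) & 0xFF  # bclr
    return lit


video = bytearray(256)
draw(video, 10, 0b00001011)  # row 2, column 10, color 3
print(video[9])
```

Each row of the display takes four bytes, which is why the row number multiplied by 4 (the drop value with its color bits masked off) can be used directly as the row offset.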

And that's what comes out of it:

Complete code

reset:
    jmp start;
    nop;
ext_int:
    reti;
    nop;
    nop;
    nop;
div_zero:
    reti;
    nop;
    nop;
    nop;
illegal:
    reti;
    nop;
    nop;
    nop;

rand:
    push r1;
    ld.w r0, seed;
    ld.w r1, #33797;
    mulu;
    addq r2, #1;
    st.w seed, r2;
    pop r1;
    move r0, r2;
add_position:
    ld.b r2, #position;
    add r2, r1;
    ret;

start:
    ld.w r0, #0x2000;
    move sp, r0;
    // set random positions
    ld.b r1, #32;
init_loop:
    jsr rand;
    st.b (r2), r0;
    addq r1, #-1;
    bne init_loop;

busy_loop:
    ld.b r1, #32;
next_col:
    jsr add_position;
    ld.b r0, (r2);
    addq r0, #2;
    addq r0, #2;
    btst r0, #8;
    beq save;
    jsr rand;
save:
    st.b (r2), r0;
    addq r1, #-1;
    bmi busy_loop;
    move r3, r1; // x (0..1f)
    lsr r3, #3; // byte addr in row (0..3)
    ld.b r2, #0xfc; // y mask
    and r2, r0; // y * 4 (0..fc)
    add r3, r2; // byte addr in screen
    ld.w r2, #0xa000;
    add r3, r2; // byte addr in memory
    ld.b r2, #2;
    lsr.wt r0, r2;
    ld.b r2, #7;
    and r2, r1; // bit index in byte (0..7)
    lsl r2, #1;
    lsr r0, #2;
    roxr r2, #1;
    ld.b r0, (r3);
    // and now apply
    test r2;
    bpl blank;
    bset r0, r2;
    jmp apply;
blank:
    bclr r0, r2;
apply:
    st.b (r3), r0;
    jmp next_col;

seed:
    dw 1;
position:;

One final touch remained: making the "drops" blink, as in the animated GIF at the top of the article. In effect this means that the program will run half as fast: each full iteration now consists of two passes of busy_loop, one that lights every drop and one that extinguishes it. On a lighting half-iteration two bits of video memory have to be set: one for the current position of the “drop” and one for the previous position (extinguished by the preceding half-iteration).

So a “drop” must be lit if (a) both low bits of its value are set and (b) we are on a lighting half-iteration, and extinguished in all other cases. The easiest way to achieve this is to replace the fixed value #2 in the instruction sequence that determines the color of the drop (ld.b r2, #2; lsr.wt r0, r2;) with a flag variable that holds 2 on a lighting half-iteration and 1 on an extinguishing one:

busy_loop:
    ld.b r1, #3; // !
    ld.b r2, flag; // !
    sub r1, r2; // !
    st.b flag, r1; // !
    ld.b r1, #32;
next_col:
    jsr add_position;
    ld.b r0, (r2);
    ld.b r3, flag; // !
    lsr r3, #1; // !
    lsl r3, #2; // !
    add r0, r3; // !
    btst r0, #8;
    beq save;
    jsr rand;
save:
    st.b (r2), r0;
    addq r1, #-1;
    bmi busy_loop;
    move r3, r1; // x (0..1f)
    lsr r3, #3; // byte addr in row (0..3)
    ld.b r2, #0xfc; // y mask
    and r2, r0; // y * 4 (0..fc)
    add r3, r2; // byte addr in screen
    ld.w r2, #0xa000;
    add r3, r2; // byte addr in memory
    ld.b r2, flag; // !
    lsr.wt r0, r2;

At the start of the busy_loop loop we subtract the current flag value from 3, i.e. turn 2 into 1 and 1 into 2. Instead of moving the “drop” down on every iteration (addq r0, #2; addq r0, #2;), we add (flag >> 1) << 2 to r0, i.e. 4 on a lighting half-iteration and 0 on an extinguishing one.

The last thing left to add is setting one more bit on a lighting half-iteration, in the byte at offset -4 from the “drop” itself:
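The flag arithmetic is easy to verify in Python. The sketch below uses hypothetical function names and again assumes that lsr.wt yields the number of set bits among those shifted out, as described earlier in the article.

```python
def toggle(flag):
    """3 - flag swaps the two states: 2 -> 1 and 1 -> 2."""
    return 3 - flag


def row_step(flag):
    """(flag >> 1) << 2: 4 on a lighting half-iteration, 0 otherwise."""
    return (flag >> 1) << 2


def is_lit(drop, flag):
    """A drop is lit only if flag == 2 and both of its low bits are set.

    Shifting by `flag` bits inspects the low 2 bits when flag is 2 and
    only the low 1 bit when flag is 1, so the weight can reach 2 only
    on a lighting half-iteration.
    """
    weight = bin(drop & ((1 << flag) - 1)).count("1")
    return weight == 2


flag = 2
for _ in range(4):
    flag = toggle(flag)
    print(flag, row_step(flag), is_lit(0b0111, flag))
```

With the flag equal to 1 the weight can never exceed 1, so the existing lsr r0, #2; roxr r2, #1; sequence automatically yields a cleared color bit and the drop is extinguished without any extra branch.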

    // and now apply
    test r2;
    bpl blank;
    bset r0, r2;
    st.b (r3), r0; // !
    addq r3, #-2; // !
    addq r3, #-2; // !
    btst r3, #8; // !
    bne next_col; // !
    ld.b r0, (r3); // !
    bset r0, r2; // !
    jmp apply;
blank:
    bclr r0, r2;
apply:
    st.b (r3), r0;
    jmp next_col;

The check btst r3, #8; bne next_col; ensures that we do not run off the top edge of the "display" and try to write something to 0x9FFx.
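Why testing bit 8 is enough can be seen from a short Python sketch (the function name is illustrative): every address inside the display has the form 0xA0xx, where bit 8 is clear, while stepping four bytes up from the top row lands at 0x9FFx, where bit 8 is set.

```python
DISPLAY_BASE = 0xA000


def row_above(addr):
    """Address of the byte one row up, or None if it is off the display."""
    above = addr - 4          # two addq r3, #-2 in the assembly
    if above & 0x100:         # same test as btst r3, #8
        return None
    return above


print(row_above(0xA002), hex(row_above(0xA004)))
```

The test works only because the display occupies exactly one 256-byte page; a display that spanned several pages would need a real comparison against the base address.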

Now the drops are flashing as intended:

Complete code

reset:
    jmp start;
    nop;
ext_int:
    reti;
    nop;
    nop;
    nop;
div_zero:
    reti;
    nop;
    nop;
    nop;
illegal:
    reti;
    nop;
    nop;
    nop;

rand:
    push r1;
    ld.w r0, seed;
    ld.w r1, #33797;
    mulu;
    addq r2, #1;
    st.w seed, r2;
    pop r1;
    move r0, r2;
add_position:
    ld.b r2, #position;
    add r2, r1;
    ret;

start:
    ld.w r0, #0x2000;
    move sp, r0;
    // set random positions
    ld.b r1, #32;
init_loop:
    jsr rand;
    st.b (r2), r0;
    addq r1, #-1;
    bne init_loop;

busy_loop:
    ld.b r1, #3;
    ld.b r2, flag;
    sub r1, r2;
    st.b flag, r1;
    ld.b r1, #32;
next_col:
    jsr add_position;
    ld.b r0, (r2);
    ld.b r3, flag;
    lsr r3, #1;
    lsl r3, #2;
    add r0, r3;
    btst r0, #8;
    beq save;
    jsr rand;
save:
    st.b (r2), r0;
    addq r1, #-1;
    bmi busy_loop;
    move r3, r1; // x (0..1f)
    lsr r3, #3; // byte addr in row (0..3)
    ld.b r2, #0xfc; // y mask
    and r2, r0; // y * 4 (0..fc)
    add r3, r2; // byte addr in screen
    ld.w r2, #0xa000;
    add r3, r2; // byte addr in memory
    ld.b r2, flag;
    lsr.wt r0, r2;
    ld.b r2, #7;
    and r2, r1; // bit index in byte (0..7)
    lsl r2, #1;
    lsr r0, #2;
    roxr r2, #1;
    ld.b r0, (r3);
    // and now apply
    test r2;
    bpl blank;
    bset r0, r2;
    st.b (r3), r0;
    addq r3, #-2;
    addq r3, #-2;
    btst r3, #8;
    bne next_col;
    ld.b r0, (r3);
    bset r0, r2;
    jmp apply;
blank:
    bclr r0, r2;
apply:
    st.b (r3), r0;
    jmp next_col;

seed:
    dw 1;
flag:
    db 2;
position:;

For now, to try running your own program on the Megaprocessor you have to arrange with its creator to visit him at home; but in a month, he says, the Megaprocessor will move to the Centre for Computing History in Cambridge and will be open to the general public five days a week.

Good luck with your megaprogramming!

Source: https://habr.com/ru/post/309654/
