Architecture and Programming of the RCA Studio II

“Finally, we’ve come to the instruction!”
/ from CDP1802 microprocessor article /

In the early 1970s, simple electronic games like Pong were very popular in the United States (in the USSR, their counterparts went on sale 5-10 years later). As a rule, such games had no microprocessor or memory in the modern sense of these words and were built on hard-wired logic. Accordingly, replaceable cartridges did not make much sense, and where they existed, a cartridge was just a set of jumpers that selected the desired game.

In 1977, two consoles were released almost simultaneously: the Fairchild Channel F and the RCA Studio II. These were the first game consoles built as full-fledged computers, with a microprocessor and programs on replaceable cartridges. The RCA Studio II console, which is our subject here, is the work not only of RCA but of a specific person, Joseph A. Weisbecker.
The first such device, System 00, also known as COSMAC FRED (1971), was a prototype and was not mass-produced.

The processor in it was built from ordinary discrete logic (in FRED2, from two chips, the CDP1801 R and U, which appeared in 1973). RAM ranged from 256 bytes to 4 KB, and FRED2 additionally had a built-in tape recorder.

The first commercial implementation of the COSMAC architecture was the COSMAC ELF. In 1976, the ELF was positioned as a computer for electronics hobbyists (a series of articles was published in Popular Electronics). It was a small board with toggle switches, indicators, a CDP1802 microprocessor (the same 1801, but on a single chip) and 256 bytes of RAM. Expansion boards made it possible to display graphics on a monitor (using a CDP1861 chip) and to connect an external keyboard and a tape recorder. The ELF II and the VIP appeared as extended versions of the ELF. The ROM of these COSMAC machines held a virtual machine called CHIP-8, tailored for primitive games (commands for drawing and moving software sprites, generating random numbers, etc.). There were other primitive computers and terminals based on this architecture.

All these devices were direct predecessors of the RCA Studio II and are extremely close to it in both hardware and software architecture.

The RCA Studio II was released in 1977 and sold for $150 (about $600 in today's money). As is often the case, the first on the market is not necessarily the most successful. In 2008, PC World magazine named it the worst game console of all time (which, in principle, is not far from the truth). A blocky black-and-white picture, no joysticks (two 10-button keypads instead) and about a dozen games did not, to put it mildly, delight customers.

In addition, all the games (both built-in and sold on cartridges) were written in the pseudocode of the ST2 virtual machine (the same idea as CHIP-8 on the COSMAC machines), which made them very slow.

RCA managed to produce about 64 thousand units of the RCA Studio II, not counting the clones that appeared later (Toshiba Visicom, Conic M-1200, etc.). With the arrival of the Atari VCS, the outdated RCA Studio II and Fairchild Channel F instantly dropped out of the fight.

CPU

As a chip maker, RCA chose its own product as the console's processor: the RCA CDP1802 microprocessor, running at 1.78 MHz and manufactured in CMOS technology.

Its predecessor was the two-chip CDP1801 processor (fully compatible with the 1802).

The CDP1802 is known for its radiation-hardened version (silicon on sapphire), which was used, for example, in the Galileo interplanetary probe that flew to Jupiter in the 1990s (it carried 6 such processors), as well as in MAGSAT.

The processor has a rather tricky register usage pattern. It has one 8-bit accumulator D and sixteen 16-bit registers, R0..RF (R0-R15), each of which can serve as the instruction pointer. Which one does is determined by the 4-bit register P (it points to one of the R registers), and P is changed by the SEP Rn instruction. In other words, the processor has no single dedicated PC!

In addition, any of R0...R15 can serve as the index (address) register. The choice is determined by the value of the 4-bit register X (changed by the SEX Rn instruction), after which the selected R acts as the index register for some instructions.

R0 is always used as the address register for DMA. Inside an interrupt, the instruction pointer is R1.

There is an 8-bit T register, into which the X and P registers are automatically saved when an interrupt occurs. Interrupts are controlled by the Interrupt Enable (IE) flag, which is set by the RET instruction and cleared by the DIS instruction.
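To make the role of T concrete, here is a minimal Python sketch of what happens to X, P and IE when an interrupt is accepted, and how a saved byte unpacks back into X and P. The function names are illustrative only, and the selection of R2 as the index register on interrupt entry is standard CDP1802 behaviour rather than something stated in the text above.

```python
def enter_interrupt(x, p):
    """Model CDP1802 interrupt entry: returns (t, x, p, ie) afterwards."""
    t = ((x & 0xF) << 4) | (p & 0xF)  # old X in the high nibble, old P in the low
    return t, 2, 1, 0                 # X = 2, P = 1 (R1 becomes the PC), IE = 0


def unpack_xp(byte):
    """Split a saved byte back into (X, P), as RET and DIS do."""
    return (byte >> 4) & 0xF, byte & 0xF


t, x, p, ie = enter_interrupt(x=5, p=3)
print(hex(t), x, p, ie)   # 0x53 2 1 0
print(unpack_xp(0x53))    # (5, 3)
```

The value 0x53 is the same byte that the article's own interrupt enable/disable snippets place after RET and DIS to keep X=5 and P=3.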

The 4-bit registers I and N hold the instruction currently being executed by the processor.

There is a flag register, DF. More precisely, a single flag: it is one bit wide and holds only the carry flag.

In addition, the processor has a one-bit output Q, whose state is changed by the SEQ and REQ instructions.

As in many processors of that generation, there is no stack in the usual sense (no PUSH or POP instructions and no stack pointer); if needed, one is implemented with the existing instructions.

There are no traditional subroutine call instructions either. A subroutine is entered with the SEP Rn instruction which, as a reminder, makes the specified register Rn the instruction pointer. The return uses the same SEP instruction, but with the register that was the instruction pointer before the call. Alternatively (a more universal but slower option), MARK and RET are used.
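The SEP-based call and return can be traced with a small Python model. This is only a sketch under simplifying assumptions: every toy instruction occupies one address (on the real chip a branch takes two bytes), the addresses are made up, and R3 and R14 are chosen as the caller's and the subroutine's instruction pointers purely for illustration.

```python
def run(program, regs, p, steps):
    """Run a toy program where every instruction occupies one address.

    program maps address -> ("sep", n), ("br", target) or ("nop",).
    regs is the list of R registers, p the index of the current PC.
    """
    trace = []
    for _ in range(steps):
        addr = regs[p]
        op = program[addr]
        regs[p] = (addr + 1) & 0xFFFF   # fetch advances the current PC
        trace.append((p, addr))
        if op[0] == "sep":
            p = op[1]                   # another register becomes the PC
        elif op[0] == "br":
            regs[p] = op[1]
    return p, trace


program = {
    0x10: ("sep", 14),   # main: call the subroutine through R14
    0x11: ("nop",),      # main continues here after the return
    0x20: ("sep", 3),    # subroutine exit: hand control back to R3
    0x21: ("nop",),      # subroutine entry point
    0x22: ("br", 0x20),  # jump back to the exit instruction
}
regs = [0] * 16
regs[3], regs[14] = 0x10, 0x21

p, trace = run(program, regs, p=3, steps=5)
print(p, hex(regs[14]))  # 3 0x21
```

Because the exit instruction sits immediately before the entry point, R14 is left pointing at the entry again after the return, so the caller can issue another SEP to it without reloading the register.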

In addition to the traditional conditional and unconditional jumps (all of them absolute, by the way), there are several SKIP instructions that, when the condition is met, skip the two bytes following the SKIP. An unconditional SKIP is also provided.

The 1802 is often cited as one of the first RISC processors. However, the 6502 and some others are mentioned in the same context. What is certain is that the architecture is rather unusual and, from a programming point of view, evokes mixed feelings. On the one hand, there are as many as sixteen 16-bit registers. On the other hand, their contents can be changed directly only by incrementing or decrementing by one. For example, putting a constant into Rn looks like this:

    ldi $01     ; const -> D
    plo r6      ; D -> R6.0
    ldi $02     ; const -> D
    phi r6      ; D -> R6.1

Therefore, the lion's share of the code just moves bytes back and forth.
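The byte shuffling is mechanical enough to generate. The following Python helper is a hypothetical illustration (not part of any assembler mentioned here) that splits a 16-bit constant into the two immediates and emits the four instructions needed to load it.

```python
def load_constant(reg, value):
    """Return the LDI/PLO/LDI/PHI sequence that loads a 16-bit value."""
    low, high = value & 0xFF, (value >> 8) & 0xFF
    return [
        f"ldi ${low:02X}",   # low byte -> D
        f"plo r{reg}",       # D -> Rn.0
        f"ldi ${high:02X}",  # high byte -> D
        f"phi r{reg}",       # D -> Rn.1
    ]


print(load_constant(6, 0x0201))
# ['ldi $01', 'plo r6', 'ldi $02', 'phi r6']
```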

Of the conditional jumps, in practice only two are useful: jump on zero (only the case where accumulator D holds 0 is tested) and jump on the carry flag. Typical loops look like this:

    loop:
        ...
        dec r7      ; R7--
        glo r7      ; R7 -> D
        bnz loop

    loop:
        ...
        adi 2       ; D = D + const
        xri $07     ; compare using XOR. (D == const) -> D
        bnz loop

All arithmetic and logical instructions work only with accumulator D.

In addition to the one-bit output controlled by SEQ / REQ, there is also a four-bit port controlled by the OUT / INP instructions. Unfortunately, it is not used in the RCA Studio II.

MEMORY

Available:
2 kB of ROM (BIOS + five built-in games)
512 bytes of RAM (half allocated for video)

Memory map

    000-2FF  ROM  RCA System ROM: SP2
    300-3FF  ROM  RCA System ROM: BIOS
    400-7FF  ROM  (built-in games)
    400-7FF  ROM  (cartridge) 1024 bytes
    800-8FF  RAM  (256 bytes)
    900-9FF  RAM  video memory (256 bytes)
    A00-BFF  ROM
    C00-DFF  ---  800-9FF
    E00-FFF  ROM

It should be noted specifically that games and programs on cartridges can use only part of the BIOS: the part that contains SP2 (unnecessary, by and large), the images of the digits 0 to 9, and the standard video interrupt handler.

VIDEO

Graphics are handled by the RCA CDP1861 chip, the so-called "Pixie".

The stock RCA Studio II has only an ordinary antenna (RF) output; however, people convert it to composite for better picture quality (I almost wrote "for better color rendering" :))

Technically, the video controller provides a maximum resolution of 64x128 in two colors (black and white). However, this requires 1024 bytes of video memory, while the Studio II has only 512 bytes of RAM in total. Therefore, the resolution used is 64x32 (which requires 256 bytes). The horizontal resolution (64) is fixed. Each line of 64 pixels always displays 8 bytes, and this happens within 14 processor cycles.
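The memory figures follow directly from one bit per pixel, as this short Python check shows. The last line is an inference rather than a statement from the text: if 32 rows of memory have to fill the controller's native 128 lines, each row is shown on 4 consecutive scan lines.

```python
PIXELS_PER_LINE = 64
BYTES_PER_LINE = PIXELS_PER_LINE // 8   # one bit per pixel


def video_bytes(lines):
    """Video memory needed for a picture with the given number of lines."""
    return BYTES_PER_LINE * lines


print(video_bytes(128))  # 1024
print(video_bytes(32))   # 256
print(128 // 32)         # 4
```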

To display memory ($900-$9FF) on the screen, the BIOS interrupt handler is used. The interrupt is raised by the video controller and occurs 60 times per second (NTSC). The BIOS handler performs all the necessary operations; the running program only needs to modify video memory, in which each bit corresponds directly to a black or white dot (left to right, top to bottom).
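The mapping from screen coordinates to video memory can be sketched in Python as follows. The helper names are hypothetical, and the bit order is an assumption: bit 7 of each byte is taken to be the leftmost pixel, which is consistent with a left shift moving the picture to the left.

```python
VIDEO_BASE = 0x900
BYTES_PER_LINE = 8


def pixel_location(x, y):
    """Return (address, bit mask) of pixel (x, y) on the 64x32 screen."""
    if not (0 <= x < 64 and 0 <= y < 32):
        raise ValueError("pixel outside the 64x32 screen")
    address = VIDEO_BASE + y * BYTES_PER_LINE + x // 8
    mask = 0x80 >> (x % 8)   # assumption: bit 7 is the leftmost pixel
    return address, mask


def set_pixel(memory, x, y):
    """Set the bit of pixel (x, y) in a bytearray modelling the address space."""
    address, mask = pixel_location(x, y)
    memory[address] |= mask


memory = bytearray(0x1000)
set_pixel(memory, 9, 1)
print(hex(pixel_location(9, 1)[0]), hex(memory[0x909]))  # 0x909 0x40
```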

However, nothing prevents you from writing your own handler. The simplest case is a resolution of 64x128, since it is native to the video controller. For it, the handler only needs to write the address of video memory into R0 (from where the data will be fetched for display), and the bytes will be output via DMA on their own, filling the frame. Vertical resolutions other than 128 are more complicated: you have to insert delays and repeat data by adjusting R0 (see the CDP1861 description and the BIOS sources).

In principle, you can even make the vertical resolution variable, output nothing to part of the screen, and use ROM rather than RAM as video memory (or partly ROM and partly RAM). You can also implement vertical scrolling by changing the start address from which data is fed to the controller.

Note that the INT output of the video controller goes to 1 two lines before the beam reaches the visible area. Therefore, the interrupt handler usually starts with a delay so that memory display begins on time.

The video controller also has an EFX output, which goes to 0 for the 4 lines before the beam enters the visible area and again during the last 4 lines of that area. The EFX output is connected to EF1 of the processor, and its state can be tested with the B1 (BN1) instructions.

A typical wait for the vertical retrace is implemented as follows:

        ...
    delay:
        bn1 delay   ; wait for EFX in video chip
        ...

As already noted above, the ROM contains no images of letters or symbols. There are digits, though (after all, the built-in games have to show scores and player numbers somehow). Even here, however, they managed to economize:

As you can see, the digits are packed so that some of them are assembled from parts of several adjacent ones.

SOUND

Let's put it this way: there is sound. But nothing more. An NE555 with its supporting components is attached to the single one-bit output of the CDP1802, and the whole thing is connected to the speaker built into the console. When a 1 is applied to the RST input of the NE555 (the SEQ instruction), it starts squeaking at 625 Hz. When a 0 is applied (the REQ instruction), it stops squeaking. That is really all. There is, however, a capacitor thanks to which, at the start of a squeak, the frequency smoothly halves within 0.4 seconds (i.e., some extra screeching is produced).

In the standard BIOS interrupt handler, besides the part responsible for video, there is a piece that checks the contents of the memory cell $08CD and, if it is non-zero, turns on the squeak and keeps decrementing the contents of that cell (when zero is reached, the squeak turns off). So there is no need to write to the output yourself: just set the duration of the squeak and it will play in the background without stopping the program:

        ldi $8cd & $ff
        plo rf
        ldi 250         ; duration of the squeak
        str rf
        ...

The same can be done manually (after disabling interrupts):

        ; disable interrupts
        sex r3          ; set X to R3
        dis             ; return X to R5, P to R3, 0-IE, R3=R3+1
        db 53h          ; forces X=5 P=3 - which is no change
        ; sound on
        seq
        ; wait
        ldi 250         ; delay
        plo r6
    delay:
        dec r6
        glo r6
        bnz delay
        ; sound off
        req
        ; enable interrupts
        sex r3          ; set X to R3
        ret             ; return X to R5, P to R3, 1-IE, R3=R3+1
        db 53h          ; forces X=5 P=3 - which is no change
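For the background squeak driven by the $08CD counter, the duration can be estimated with a little arithmetic. The Python sketch below assumes the handler decrements the counter exactly once per interrupt, i.e. 60 times per second; that rate comes from the NTSC figure given earlier, while the one-step-per-interrupt behaviour is an assumption. The helper names are illustrative.

```python
INTERRUPTS_PER_SECOND = 60   # NTSC frame rate


def beep_seconds(counter):
    """Approximate squeak length for a value stored in the counter cell."""
    return counter / INTERRUPTS_PER_SECOND


def counter_for(seconds):
    """Counter value (1..255) that gives roughly the requested length."""
    return max(1, min(255, round(seconds * INTERRUPTS_PER_SECOND)))


print(round(beep_seconds(250), 2))  # 4.17
print(counter_for(0.5))             # 30
```

Under this assumption a single byte cannot request more than about 4.25 seconds of sound at a time.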

PROGRAMS

In the 1970s, slightly more than a dozen games and a few other programs were written (mostly by RCA itself). Almost all of them were written not in assembler but in pseudocode: a special interpreter, the ST2 virtual machine, sits in the device's ROM. It is hard to say exactly what motivated this decision. Most likely the idea was to save memory, since the games really do come out significantly smaller. In general, ST2 traces its roots to a similar VM called CHIP-8, used on the COSMAC machines. Although the two VMs are incompatible with each other, a CHIP-8 interpreter for the RCA Studio II was written in the 2000s. Given the extreme similarity of the architectures, it is not surprising that, as the interpreter's author writes, COSMAC games that did not require much memory ran on the RCA Studio II without problems.

Alas, a VM on such an architecture runs very slowly, which leaves an indelible mark on the games themselves. Later, in 2013, Paul Robson wrote about a dozen more games, this time in assembler, and distributed them with source code.

DEVELOPMENT

Initially, according to eyewitnesses, development for the RCA Studio II was done without even an assembler, on the COSMAC ELF and FRED2.

Nowadays there is no need to suffer like that. There is a decent emulator for Windows, Emma, with a good debugger (by the way, it emulates not only the RCA Studio II but all the COSMAC machines as well).

As an assembler, I first tried the a18 cross-assembler, but for several reasons ended up with asmx, which also comes with Python scripts for generating the finished cartridge image (with the .st2 extension).

A brief introduction to 1802 assembler can be found here. The simplest test.asm for the RCA Studio II, containing just an infinite loop, looks like this:

        .include "1802.inc"
        .org 400h
        .db 4,2         ; SYS $402
    start:
        br start        ; some code
        .end

Note the ".db 4,2" directive. This is the address of the first instruction to be executed, i.e. ".db >(start), <(start)".
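The two header bytes are simply the high and low halves of the start address, which a couple of lines of Python can confirm. The names ORG and entry_bytes are illustrative; the only facts used are the .org 400h origin and the two-byte header from the listing above.

```python
ORG = 0x400            # cartridge code is assembled from this address
start = ORG + 2        # the label follows the two header bytes


def entry_bytes(address):
    """Return the (high, low) bytes of a 16-bit address."""
    return (address >> 8) & 0xFF, address & 0xFF


print(hex(start), entry_bytes(start))  # 0x402 (4, 2)
```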

Implementing a simple loop:

        ldi 50          ; const -> D
        plo r6          ; D -> low byte of r6
    loop:
        dec r6          ; r6 = r6 -1
        glo r6          ; low byte of r6 -> D
        bnz loop        ; jump to loop if D is not zero
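One detail of this loop is easy to miss: DEC works on the whole 16-bit register, but GLO followed by BNZ tests only its low byte, so the high byte never needs to be initialised. The Python model below is a sketch that counts the passes of exactly this dec/glo/bnz pattern for a given starting register value.

```python
def loop_passes(r6):
    """Count passes of: loop: dec r6 / glo r6 / bnz loop."""
    passes = 0
    while True:
        passes += 1
        r6 = (r6 - 1) & 0xFFFF       # dec r6 works on all 16 bits
        if (r6 & 0xFF) == 0:         # glo r6 + bnz test only the low byte
            return passes


print(loop_passes(0x0032))  # 50
print(loop_passes(0xAB32))  # 50
print(loop_passes(0x0000))  # 256
```

A low byte of zero therefore gives the longest loop available with this pattern, 256 passes.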

Using SKIP instructions:

        ; Q = 0 for about $FF00 passes, Q = 1 for about $FF
    loop:
        ghi r1          ; hi(r1) -> D
        lsz             ; skip 2 bytes if D is zero (i.e. go to seq)
        req             ; 0 -> Q
        skp             ; skip 1 byte (i.e. go to inc r1)
        seq             ; 1 -> Q
        inc r1          ; r1 = r1 + 1
        br loop
        ...
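The control flow through the two skips can be checked with a small Python model of one pass of the loop. It is a sketch that assumes REQ, SKP and SEQ are one byte each, so that the long skip lands on SEQ and the short skip lands on INC.

```python
def q_for(r1):
    """Value of Q produced by one pass of the loop for a 16-bit R1."""
    d = (r1 >> 8) & 0xFF   # ghi r1
    if d == 0:             # lsz: skip the two bytes holding req and skp
        return 1           # seq
    return 0               # req, then skp jumps over seq


high = sum(q_for(r1) for r1 in range(0x10000))
print(hex(high), hex(0x10000 - high))  # 0x100 0xff00
```

So over one full wrap of R1, Q is high for the 0x100 passes in which the high byte of R1 is zero and low for the remaining 0xFF00.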

To practice pure CDP1802 assembler, it is convenient to use the online asm80 assembler-emulator. The source file you create must have the .a18 extension.

To run the finished program on real hardware, there is the RCA Studio II 40th Anniversary Multicart cartridge. I did not have one, but tnt23 reworked one of the existing game cartridges to take an AT28C16 EEPROM chip (2k x 8) in a socket.

So, to run on real hardware, each time I inserted the chip into the programmer, flashed it, moved it to the converted cartridge and switched on the console. And so on, every time.

INTRO "NO SHADERS"

To get to grips with the platform, I wrote a 256-byte intro (presented at Chaos Constructions'2018 in the Tiny intro competition).

Unlike, say, the Vectrex, where you can get an impressive picture just by drawing a curve, or the Videopac, where the ROM already contains a set of little-man images, here the situation is sad: ordinary, familiar raster graphics, but black and white and at a resolution that could hardly be lower (64x32). The ROM has no pictures and not even characters. Sound exists, but is limited to a frequency of 625 Hz.

So music was ruled out, along with every kind of plasma, lights and generally anything that implies non-square outlines. Text in any form was ruled out too: there would not be enough room for the letters.

As a result, it was decided to a) scroll b) something repetitive c) at different speeds. It turned out like this:

As mentioned above, the video controller has no hardware scrolling. However, low resolution and black-and-white graphics have not only downsides but upsides too: fewer bytes have to be rewritten.

I scrolled line by line using the shlc instruction (shift left with carry). When it is executed in a loop, the leftmost bit of each byte is shifted out to the left but does not disappear: it is placed in the carry flag (DF). The next shlc in the loop then picks it up and places it into the byte to the left. The result is a simple scroll of the whole line, which is performed eight times per cycle (since it is convenient to define the cloud and house patterns byte by byte).

        ...
    scrollret:
        sep r3                  ; return from subroutine

    scroll:                     ; entry point
        ; set lines counter
        ldi LINES               ; const -> D
        plo r10                 ; D -> Rn.0
    nextline:
        ; set bytes counter
        ldi BYTES_PER_LINE      ; const -> D
        plo r7                  ; D -> Rn.0
        ; set carry to scroll
        glo r12                 ; Rn -> D
        shr                     ; get one bit to set carry
        plo r12                 ; D -> Rn.0 (save shifted byte)
    nextbyte:
        ldx                     ; M(Rx) -> D
        shlc                    ; D = D << 1 (carry -> DF)
        stxd                    ; D -> M(Rx), Rx--
        dec r7                  ; Rn--
        glo r7                  ; Rn -> D
        bnz nextbyte
        dec r10                 ; Rn--
        glo r10                 ; Rn -> D
        bnz nextline            ; one line (8 bytes) scrolled, let's scroll next
        br scrollret
        ...

Note that the entry point of the subroutine is the scroll label, and the return is not simply sep r3: first br scrollret is executed, and sep r3 is executed from there.

This is done to leave r14 (which is the instruction pointer inside the subroutine) in the correct state, so that the subroutine can be called again and again (using sep r14).

Of course, no variables are saved across calls here: all the registers used as variables are global.
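The carry chain used by the scroll routine can be modelled in a few lines of Python. This is a sketch of the technique, not a translation of the routine: it walks one row from its rightmost byte to its leftmost (as STXD does by decrementing the index register) and passes the bit shifted out of each byte into the byte on its left. The carry_in argument plays the role of the bit that the routine pre-loads into DF with SHR.

```python
def scroll_line_left(line, carry_in=0):
    """Shift a row of bytes one pixel left; returns (new_row, carry_out)."""
    out = list(line)
    carry = carry_in & 1
    for i in range(len(out) - 1, -1, -1):   # rightmost byte first
        shifted = (out[i] << 1) | carry     # old carry enters at bit 0
        carry = shifted >> 8                # bit 7 leaves into the carry
        out[i] = shifted & 0xFF
    return out, carry


row, carry = scroll_line_left([0x00, 0x80, 0x01])
print([hex(b) for b in row], carry)  # ['0x1', '0x0', '0x2'] 0
```

Feeding a pattern bit in through carry_in is what makes new picture content appear at the right edge while the old content moves left.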

The scroll subroutine is called twice in the main loop: on every second pass for the houses and on every fourth pass for the clouds (they scroll more slowly). The main loop is synchronized with the vertical retrace (the road, the houses and the clouds all have time to be drawn; the stars are static). For the road, only one line is scrolled; the edges of the road are simply drawn as lines.
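The different speeds amount to a simple schedule driven by a pass counter, sketched here in Python. The function name and the exact phase (which passes count as "every second" and "every fourth") are assumptions made for illustration; only the 2:1 ratio between houses and clouds comes from the description above.

```python
def layers_to_scroll(pass_number):
    """Layers that get scrolled on a given pass of the main loop."""
    layers = []
    if pass_number % 2 == 0:     # houses move on every second pass
        layers.append("houses")
    if pass_number % 4 == 0:     # clouds move on every fourth pass
        layers.append("clouds")
    return layers


for n in range(1, 5):
    print(n, layers_to_scroll(n))
# 1 []
# 2 ['houses']
# 3 []
# 4 ['houses', 'clouds']
```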

Out of curiosity, I tried scrolling the entire screen; it falls slightly short of fitting in the available time.

The houses are defined by patterns:

        ...
    house1:
        .db %00000000
        .db %11111111
        .db %10101010
        .db %11111111
        .db %10101010
        .db %11111111
        .db %00000000
        .db 1
    house2:
        .db %00000000
        .db %00011111
        .db %01110101
        .db %01011111
        .db %01110101
        .db %00011111
        .db %00000000
        .db 1
        ...

and a table with a reference to each:

        ...
    commands:
        .db house5
        .db house2
        .db house1
        .db house3
        ...

In the loop, this table is traversed sequentially.

Unlike the houses, both clouds are, for simplicity, a single pattern that simply repeats in a loop.

A few bytes could have been saved by drawing the clouds on the same principle as the houses, and by generating the gaps between patterns in code (right now they are just repeated zeros in the data).

The problem, however, is that some of the registers are used by the interrupt handler: R0, R1, R2, R8, R9 and R11 must not be changed. And keeping variables in memory costs a lot of extra bytes for writing and reading them (not to mention the cycles).

Ideally, the scrolling should probably have been done in the interrupt handler. For that, however, I would have had to write my own handler to replace the standard one. That would be more correct (and, along the way, could free up a couple of R registers), but most likely everything would then not fit into 256 bytes.

As for the stars, they are static; however, drawing a few points that look randomly placed unexpectedly turned out to be not so simple:

        ...
    loop:
        ldn r4              ; M[Rn] -> D
        ani %00000010       ; D AND const -> D
        bdf skip            ; jump if carry
        ldi 0               ; const -> D
    skip:
        stxd                ; D -> M(Rx), Rx--
        glo r4              ; Rn -> D
        adi 47              ; D + const -> D
        plo r4              ; D -> Rn.0
        glo r15             ; Rn -> D
        bnz loop
        ...

Here, in a loop, data is taken from the BIOS, thinned out, and the extra bits are masked off. The mask (for ani) and the step (for adi) were chosen by hand.

As for the sound, since the frequency cannot be changed, it simply imitates the beeping of a car horn.

By the way, I believe this intro is the first demoscene production for the RCA Studio II :)

EPILOGUE

After the Studio II, RCA produced a few units of the RCA Studio III. It differs in two things: color appeared (the resolution did not change) and the sound got better (it can produce not one but 255 different frequencies).

Interestingly, both machines are compatible with each other in both directions, including through the use of the same intermediate code with the interpreter.

It is also known that there were plans for an RCA Studio IV. Its resolution was to increase to 64x128, and a new pseudocode interpreter had even been written already.

As for the CDP1802, this microprocessor is still in production: first it was made by Hughes, then by Intersil (Renesas).

Those who wish to learn more about this peculiar branch of computing history are advised to search for "COSMAC" and "CDP1802".

Additional links

RCA Studio II Technical Information
RCA Studio II circuit
Microprocessor CDP1802 Description
CDP1861 video controller description
My other work for retrocomputers
My story about RCA Studio 2 on Chaos Constructions

Source: https://habr.com/ru/post/422277/
