We write the Gameboy emulator, part 3

Hello!

In the previous part of this series of articles, we completed work on the critical components of our emulator. To complete the picture in this article, we will look at the DMG sound system.

We write the Gameboy emulator, part 1
We write the Gameboy emulator, part 2
We write the Gameboy emulator, part 3

Before you begin, here is a link to the Cookieboy repository, where you can find its source code and the latest build.

Sound system
Sound channels 1 and 2

Sound Channel 3
Sound Channel 4
Implementation
Testing
What's next
Conclusion

Sound system

The DMG outputs stereo sound by mixing 4 independent audio channels. Each channel has modulation components attached to it, which the game controls to produce the required sound. There are three kinds of components in total, and each one serves the same purpose in every channel:

Sweep unit. Changes the frequency of the sound with a specified period and step.
Length counter. Controls the duration of the audio output.
Envelope unit. Changes the volume of a sound with a specified period.

Each channel provides a number of registers that control these components and the channel itself. The registers are named NRXY, where X is the channel number (1, 2, 3, 4) and Y is the register number. Where appropriate, I will leave the channel number unspecified and simply write X.

The following audio channels are available on DMG:

Square wave. Contains all three modulation components.
Square wave. Contains the volume and duration control components.
Arbitrary waveform. Contains only the duration control component. The volume is set manually in one of the registers.
Noise generator. Contains the volume and duration control components.

The first two channels are identical and differ only in the set of modulation components.

The third channel plays an arbitrary waveform from a special memory area in the I/O port range, the Wave Pattern RAM. By updating this memory area in time, a game can play digital sound of arbitrary content. Some games even manage to reproduce something resembling speech.

The fourth channel generates noise of various kinds. It is well suited for all sorts of sound effects.

Here is a simplified sound generation scheme in DMG:

After passing through all the modulation components, the audio signal enters the mixer, which mixes the channels and routes them to the outputs: SO1 is the right ear, SO2 is the left ear. Mixing is a simple sum of the signals from all sources assigned to a particular output; the NR51 register specifies which channels go to which output. The volume of each output is then applied: the mixed signal is multiplied by that output's volume value from the NR50 register plus 1.
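The mixing step can be sketched in a few lines. This is only an illustration, not code from the emulator: the names mixChannels and StereoSample are made up, each channel is assumed to produce a value from 0 to 15, and the bit layout of NR50 and NR51 is taken from Pan Docs rather than from this article.

```cpp
#include <array>
#include <cstdint>

// Result of mixing: one value per output terminal (hypothetical type).
struct StereoSample {
    int so1;  // right ear
    int so2;  // left ear
};

// Mixes the four channel outputs (each assumed to be in the range 0..15).
// Assumed bit layout (Pan Docs): NR51 bits 0-3 route channels 1-4 to the
// first output, bits 4-7 route channels 1-4 to the second output;
// NR50 bits 2-0 and 6-4 hold the volumes of the two outputs.
inline StereoSample mixChannels(const std::array<int, 4>& channels,
                                std::uint8_t nr50, std::uint8_t nr51) {
    int so1 = 0;
    int so2 = 0;
    for (int ch = 0; ch < 4; ++ch) {
        if (nr51 & (1u << ch))       so1 += channels[ch];
        if (nr51 & (1u << (ch + 4))) so2 += channels[ch];
    }
    // The volume value from NR50 is used "plus 1", as described above.
    const int so1Volume = (nr50 & 0x07) + 1;
    const int so2Volume = ((nr50 >> 4) & 0x07) + 1;
    return {so1 * so1Volume, so2 * so2Volume};
}
```

With all four channels at full amplitude and both volumes at maximum, each output reaches 4 * 15 * 8 = 480, which is a convenient number to keep in mind when scaling the result to the sample format of the audio library.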

You should not try to fully understand this scheme - along the way, everything that is drawn on it will be considered in more detail.

NR50 and NR51 are shared registers. There is one more shared register, NR52, which holds the flag that mutes all sound, as well as bits showing the status of the audio channels. Only the mute flag can be changed; the status bits are read-only and are updated continuously.

If sound is muted in the NR52 register, the following happens:

All registers are reset, except for the Length counter values. In the NRX1 registers this means that only the bits holding the duty cycle need to be cleared (what this means will become clear later).
Writes to all registers are ignored, except for NRX1. Even there, only the bits that belong to the Length counter can be written.

A note on the accessibility of register bits. Pay attention to which bits are unused or cannot be read. When a register is read from the outside, every bit that is unused or not readable must be returned as one. Test ROMs check this. The register itself must not be modified in the process. The sound components, naturally, have full access to the registers.
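One simple way to get this behavior is to keep the real value in the register and OR a per-register mask into it on every read. A minimal sketch follows; the function name and the constant are made up for illustration, and the mask for NRX4 assumes that only bit 6 is readable there, as Pan Docs describes.

```cpp
#include <cstdint>

// Value returned to the CPU when it reads a sound register.
// `stored` is the internal register value, which is left untouched;
// `unreadableMask` has a 1 for every unused or write-only bit.
inline std::uint8_t readSoundRegister(std::uint8_t stored,
                                      std::uint8_t unreadableMask) {
    return static_cast<std::uint8_t>(stored | unreadableMask);
}

// Assumed mask for NRX4 of the square wave channels: only bit 6 is readable.
constexpr std::uint8_t kNrx4UnreadableMask = 0xBF;
```

The sound components keep working with the stored value directly, so the mask never leaks into the emulation itself.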

All components that form the sound are synchronized with the clock generator. Sound waves of a given frequency are generated from the main clock directly. The modulation components are driven by a separate clock, the Frame Sequencer, which runs at 512 Hz. It is derived from the main clock as well, but produces low-frequency ticks: 128 Hz for the Sweep unit, 256 Hz for the Length counter and 64 Hz for the Envelope unit. Here is how this clock operates, where each row is one Frame Sequencer step:

Length counter	Envelope unit	Sweep unit
Countdown	-	Countdown
-	Countdown	-
Countdown	-	-
-	-	-
Countdown	-	Countdown
-	-	-
Countdown	-	-
-	-	-

The table shows on which Frame Sequencer steps the modulation components are ticked. The Frame Sequencer cycles through this sequence, which gives 8 possible states (let's number them from 0 to 7). The phase of the ticks matters here. Note also that when the sound is switched on (the flag in the NR52 register), the Frame Sequencer starts from state 1. It is very important to let the modulation components know that the state of the Frame Sequencer has changed. I once spent a long time hunting for exactly this bug, which kept one of the test ROMs from passing.
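A Frame Sequencer along these lines might look as follows. This is a sketch under assumptions, not the emulator's code: the class and function names are invented, the rows of the table above are numbered 0 to 7 from the top, and on every step boundary the sequencer first moves to the next state and then ticks the components listed for that state.

```cpp
#include <cstdint>

// Which components are ticked in a given Frame Sequencer state.
struct FrameSequencerTicks {
    bool length;
    bool envelope;
    bool sweep;
};

// Rows of the table above, numbered 0..7 from the top.
inline FrameSequencerTicks ticksForState(int state) {
    static constexpr FrameSequencerTicks kTable[8] = {
        {true,  false, true },
        {false, true,  false},
        {true,  false, false},
        {false, false, false},
        {true,  false, true },
        {false, false, false},
        {true,  false, false},
        {false, false, false},
    };
    return kTable[state & 7];
}

class FrameSequencer {
public:
    // 4194304 Hz main clock / 512 Hz = 8192 cycles per step.
    static constexpr int kCyclesPerStep = 4194304 / 512;

    // Called when the sound is switched on through NR52.
    void reset() {
        cycles_ = 0;
        state_ = 1;  // the article states that counting starts from state 1
    }

    int state() const { return state_; }

    // Advances by `cpuCycles` and calls `onStep` once per step boundary,
    // so the modulation components always learn about a state change.
    template <typename OnStep>
    void advance(int cpuCycles, OnStep onStep) {
        cycles_ += cpuCycles;
        while (cycles_ >= kCyclesPerStep) {
            cycles_ -= kCyclesPerStep;
            state_ = (state_ + 1) & 7;
            onStep(ticksForState(state_));
        }
    }

private:
    int cycles_ = 0;
    int state_ = 0;
};
```

Over 8 steps the callback sees 4 Length counter ticks, 2 Sweep unit ticks and 1 Envelope unit tick, which matches the 256 Hz, 128 Hz and 64 Hz rates.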

Having dealt with the overall design, let's look at each channel in turn.

Sound channels 1 and 2

First, consider the channels that generate square waves. There is no point in describing channels 1 and 2 separately: once channel 1 is covered, channel 2 can be implemented simply by removing functionality, since the two are identical except for the Sweep unit.

So, what is a square wave? The lower part of the figure shows exactly such a wave.

Where exactly the time axis lies is not particularly important. My emulator uses a wave made of “signal” segments (1-2, 3-4, 5-6) and “no signal” segments (0-1, 2-3, 4-5). The time axis could instead be placed in the middle, but that only complicates the implementation, while the result is identical.

In this figure the ratio of the period to the pulse width is 2, because the segments with different amplitudes have the same duration. The DMG can generate square waves with different values of this ratio, although the documentation usually uses its inverse, the duty cycle (fill factor). It is also more intuitive, so we will use it. There are 4 fill factor values to choose from: 0.125, 0.25, 0.5 and 0.75. The fill factor does not affect the frequency, only the shape of the signal. The figure below shows how the signal differs for different fill factors at the same frequency.

Four values are provided, although in effect there are only three: fill factors of 0.25 and 0.75 give different-looking waves, but they sound identical. What matters for the sound is the change in amplitude, and it has the same character at fill factors of 0.25 and 0.75.

The fill factor value is contained in the NRX1 register in the upper two bits.

Naturally, we need to know at what frequency the signal should be generated. The NRX3 and NRX4 registers are used for this. The frequency is specified as an 11-bit number: the lower 8 bits are in the NRX3 register, the upper 3 bits in the NRX4 register. The value can thus range from 0 to 2047, but it is not the actual frequency of the sound. To convert it to a real frequency, use the following formula:

F = 4194304 / (32 * (2048 - X)) Hz,

where X is the frequency value from the NRX3 and NRX4 registers and F is the sound frequency.

Thus, the sound frequency lies in the range from 64 Hz to 131,072 Hz. There is no need to worry about such high frequencies. Not only would it be quite difficult to generate sound of such a frequency properly (by the Nyquist-Shannon sampling theorem, also known as the Kotelnikov theorem, the sampling rate would have to exceed 262,144 Hz), but our equipment cannot reproduce it and our ears cannot hear it. A more realistic range is limited to 22,000 Hz, which roughly corresponds to the upper limit of human hearing and, not coincidentally, of most speakers. For such frequencies the usual sampling rate of 44,100 Hz is sufficient.
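For a quick sanity check of the formula, here is a small sketch that assembles X from the two registers and converts it to hertz. The function names are made up for illustration.

```cpp
#include <cstdint>

// Combines NRX3 (lower 8 bits) and the low 3 bits of NRX4 (upper 3 bits
// of the frequency) into the 11-bit value X.
inline int frequencyValue(std::uint8_t nrx3, std::uint8_t nrx4) {
    return ((nrx4 & 0x07) << 8) | nrx3;
}

// F = 4194304 / (32 * (2048 - X)) Hz
inline double squareWaveHz(int x) {
    return 4194304.0 / (32.0 * (2048 - x));
}
```

X = 0 gives 64 Hz and X = 2047 gives 131,072 Hz, the two ends of the range mentioned above. A value of X = 1750 lands at roughly 440 Hz.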

The documentation usually gives the formula above as is, but it helps to understand where it comes from. Look again at the diagram of the sound system: it contains a Wave generator component. This component has a timer that is used to generate a wave of the desired frequency. The period of this timer is 4 * (2048 - X). For the wave to go through a full period, the timer must tick 8 times, which gives us the sought-after 32 * (2048 - X), the length of the full wave period.

With a proper implementation, this timer frees you from any frequency conversions. If the timer in the emulator is synchronized with the processor in the same way as all the other components, everything will work by itself. The formula 4 * (2048 - X) gives the timer period in clock cycles.

In 8 ticks of this timer the sound wave completes a full period. Now back to the fill factor. It determines how the wave changes over its period. The documentation gives the following values (1 and 0 in the right column mean “signal” and “no signal”, respectively):

Fill factor	Single waveform
0.125	00000001
0.25	10000001
0.5	10000111
0.75	01111110
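Put together, the timer and the waveform table are enough for a basic wave generator. The sketch below is an illustration rather than the emulator's actual code: the class name is invented, and it assumes that the waveforms in the table are read from left to right.

```cpp
#include <cstdint>

class SquareWaveGenerator {
public:
    // x is the 11-bit frequency value, duty selects a table row (0..3).
    void configure(int x, int duty) {
        period_ = 4 * (2048 - x);  // timer period in clock cycles
        duty_ = duty & 3;
    }

    // Restart (trigger): the waveform starts over from its first position.
    void restart() {
        counter_ = 0;
        position_ = 0;
    }

    // Advances the timer by a number of clock cycles.
    void advance(int cpuCycles) {
        counter_ += cpuCycles;
        while (counter_ >= period_) {
            counter_ -= period_;
            position_ = (position_ + 1) & 7;  // 8 ticks per wave period
        }
    }

    // Returns 1 for "signal" and 0 for "no signal".
    int output() const {
        // Waveforms from the table, read left to right (an assumption).
        static constexpr std::uint8_t kWaveforms[4] = {
            0x01,  // 0.125: 00000001
            0x81,  // 0.25:  10000001
            0x87,  // 0.5:   10000111
            0x7E,  // 0.75:  01111110
        };
        return (kWaveforms[duty_] >> (7 - position_)) & 1;
    }

private:
    int period_ = 4 * 2048;
    int duty_ = 0;
    int counter_ = 0;
    int position_ = 0;
};
```

Because the generator is driven by clock cycles, no conversion to hertz is needed anywhere: the emulator calls advance with the number of cycles the processor has just executed.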

Besides the frequency, the NRX4 register stores other data. Here is its structure:

Bits	Purpose
7	Channel restart
6	Length counter enable: 0 - play endlessly, 1 - stop when the duration expires
2-0	Upper 3 bits of frequency

If bit 6 is cleared, the sound plays endlessly. If it is set, the Length counter comes into play.

Bit 7, restart, does exactly what its name says: if 1 is written to it, the channel is restarted. This may seem strange for a channel that plays an infinite periodic signal, but it actually matters more for the modulation components, which are covered later. It also matters for the waveforms listed above: when the channel is restarted, the signal period starts over from the very beginning of the selected waveform.

That's all you need to know about channels 1 and 2 in general. Next, the modulation components come into play. Although each channel has its own modulation components, they all work on the same principle. To avoid repetition, we will cover all the modulation components now, since channel 1 contains every one of them.

Sweep unit

This component controls the frequency of the signal. It works in two modes - increase or decrease frequency. Different periods and step sizes are supported. The component is controlled by the register NRX0. Here is its structure:

Bits	Purpose
6-4	Period: 000 - component is off, 001 - 1/128 s, 010 - 2/128 s, 011 - 3/128 s, 100 - 4/128 s, 101 - 5/128 s, 110 - 6/128 s, 111 - 7/128 s
3	Mode: 0 - frequency increase, 1 - frequency decrease
2-0	Step

The periods are specified in fractions of a second and would normally have to be converted to clock cycles, but with a proper Frame Sequencer implementation this is not needed. The Sweep unit runs at 128 Hz, so it is no accident that the periods are multiples of 1/128 s: this lets us forget about counting cycles by hand. The same holds for the other components: the Frame Sequencer does all the counting, and the rest do not need to “worry” about anything.

Now the step. It is easier to give the formula for the next frequency value, computed at each tick, than to explain it in words:

F (t) = F (t - 1) ± F (t - 1) / 2 ⁿ ,

where F (t) is the next frequency value, F (t - 1) is the current frequency value, and n is the step value from the NRX0 register. Note that the calculation must use a right bit shift by n rather than a division.

The figure below shows the operation of the Sweep unit with NRX0 = 0x61:

The frequency keeps changing until one of the frequency limits is reached or the Sweep unit is switched off. If the frequency decrease mode is on and the next frequency would be negative, the previous value is kept and the calculations stop. If the frequency increase mode is on and the value goes past the maximum (2047), the channel stops and a zero is written to the corresponding status bit of the NR52 register, indicating that the channel is stopped.

This is where the simple things end and the less obvious details begin. The Sweep unit contains several hidden registers that are not accessible from the outside: an internal enabled flag and a frequency shadow register. It also contains a counter that keeps track of the period specified in the NRX0 register.

I have already mentioned that the restart bit in the NRX4 register, also called the trigger, is important for the modulation components. When it is set, the following happens in the Sweep unit:

The channel frequency (NRX3 and NRX4) is copied into the frequency shadow register.
The counter is reloaded. To do this, copy bits 6-4 of the NRX0 register into it, so the counter holds a number of ticks at 128 Hz. The Frame Sequencer generates ticks at this frequency, so the counter matches it. As you can see, no extra conversions are needed if everything is done correctly.
The internal enabled flag is set if the period or the step is non-zero. Otherwise it is cleared.
If the step is non-zero, a new frequency is calculated and checked for overflow (it must not exceed 2047), but the new frequency is not saved; the calculation is done only for the sake of the overflow check.

So, what happens when the counter "says" that it is time to update the frequency? First, the counter is reloaded. Then the internal enabled flag is checked: if it is set, a new frequency is calculated using the formula above, except that the frequency shadow register plays the role of F (t - 1). The overflow check follows immediately: if the new frequency exceeds 2047, the channel is switched off.

If there was no overflow and the step is non-zero, the new frequency value is written to the NRX3 and NRX4 registers as well as to the frequency shadow register. Immediately afterwards, another new frequency is calculated and checked for overflow, but it is not saved; this is done only for the sake of one more overflow check.

The frequency shadow register is there for a reason. Its presence means that full manual control of the channel frequency is impossible while the Sweep unit is running. We can change the frequency ourselves, but it will only hold until the next Sweep unit tick: since the unit uses the frequency shadow register in its calculations, our value is ignored and overwritten, and the calculations continue as if nothing had happened.

Now for two oddities, which the modulation components of the DMG have in abundance:

As mentioned earlier, a period of 0 means that the Sweep unit is disabled. That is logical and how it should be, but on a real DMG a period of 0 in the NRX0 register means that the period is 8. No frequency calculations take place; the Sweep unit simply runs idle. Test ROMs check this.
Imagine the following scenario. The game has set the frequency decrease mode. Some time has passed, and a new frequency has been calculated. If the game now tries to set the frequency increase mode, the channel is switched off immediately. In other words, if at least one frequency calculation has taken place in decrease mode, switching to increase mode disables the channel.
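The trigger and tick logic described above could be sketched as follows. This is an interpretation of the text, not the emulator's code: the class and member names are invented, the frequency is kept in a plain member instead of being written back to NRX3 and NRX4, and the second oddity (switching from decrease to increase mode) is left out to keep the example short.

```cpp
#include <cstdint>

class SweepUnit {
public:
    // Restart (trigger). nrx0 is the raw NRX0 value, frequency is the
    // channel's current 11-bit frequency value.
    void trigger(std::uint8_t nrx0, int frequency) {
        nrx0_ = nrx0;
        frequency_ = frequency;
        shadow_ = frequency;
        channelOn_ = true;
        counter_ = reloadValue();
        enabled_ = period() != 0 || shift() != 0;
        if (shift() != 0) {
            calculate();  // result discarded, overflow check only
        }
    }

    // Called by the Frame Sequencer at 128 Hz.
    void tick() {
        if (--counter_ > 0) return;
        counter_ = reloadValue();
        if (!enabled_ || period() == 0) return;  // period 0: running idle
        const int next = calculate();
        if (channelOn_ && shift() != 0) {
            frequency_ = next;  // would be written back to NRX3/NRX4
            shadow_ = next;
            calculate();        // second overflow check, result discarded
        }
    }

    int frequency() const { return frequency_; }
    bool channelOn() const { return channelOn_; }

private:
    int period() const { return (nrx0_ >> 4) & 0x07; }
    bool decrease() const { return (nrx0_ & 0x08) != 0; }
    int shift() const { return nrx0_ & 0x07; }
    int reloadValue() const { return period() == 0 ? 8 : period(); }

    // F(t) = F(t - 1) +/- (F(t - 1) >> n), with the overflow check.
    int calculate() {
        const int delta = shadow_ >> shift();
        const int next = decrease() ? shadow_ - delta : shadow_ + delta;
        if (next > 2047) channelOn_ = false;
        if (next < 0) return shadow_;  // keep the previous value
        return next;
    }

    std::uint8_t nrx0_ = 0;
    int frequency_ = 0;
    int shadow_ = 0;
    int counter_ = 0;
    bool enabled_ = false;
    bool channelOn_ = false;
};
```

Note that the unit never reads the channel frequency after the trigger; it works only with its shadow copy, which is exactly why manual frequency changes get overwritten.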

Length counter

This component is a very simple counter. It counts a given number of ticks and then disables the audio channel it is attached to. It is present in all channels and uses the NRX1 register, which stores the channel's playback duration. I will not give the register layout, since the number of duration bits differs between channels. In addition, bit 6 of the NRX4 register belongs to the Length counter: it switches the component on and off.

The Frame Sequencer ticks this component at 256 Hz. The values in the NRX1 register are expressed in ticks at this frequency, which once again frees us from conversions. On each tick the change is written back to the register; in other words, the NRX1 register itself is the counter.

Before the duration value from the NRX1 register is used, it must be converted with the following formula:

Counter = (~ NRX1 & Mask) + 1,

where Counter is the counter value and Mask is the mask that extracts only the duration bits from the register (which sometimes also holds the fill factor). This gives the number of ticks at 256 Hz.
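In code the conversion is a one-liner. The function name below is made up, and the two masks are assumptions based on Pan Docs: 6 duration bits in channels 1, 2 and 4, and 8 bits in channel 3.

```cpp
#include <cstdint>

// Counter = (~NRX1 & Mask) + 1
inline int lengthTicks(std::uint8_t nrx1, std::uint8_t mask) {
    return (~nrx1 & mask) + 1;
}

// Assumed masks: 6 duration bits in channels 1, 2 and 4, 8 in channel 3.
constexpr std::uint8_t kLengthMask6 = 0x3F;
constexpr std::uint8_t kLengthMask8 = 0xFF;
```

With the 6-bit mask, a register value of 0 gives the maximum of 64 ticks (a quarter of a second), and a value with all duration bits set gives a single tick. The fill factor bits in the upper part of NRX1 are masked out and do not affect the result.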

Everything seems perfectly obvious: a basic counter. On each tick we check the flag in the NRX4 register and, if it is set, count one tick in the component's counter. When the counter reaches its final value, the channel is disabled. The difficulties come with the following oddities, all of which are concentrated in the handler for writes to the NRX4 register:

If the component was off and is now switched on through bit 6, the counter has not reached zero, and the current Frame Sequencer state ticks our component (all even states), then a tick is counted immediately. This may bring the counter to zero and switch the channel off, but there is one more condition: the channel is switched off here only if it is not being restarted, i.e. the restart bit is zero.
If a restart is performed and the counter has reached zero, the maximum possible value is written to the NRX1 register, i.e. all duration bits are cleared (see the formula above). If at the same time the component is switched on (the previous value does not matter) and the current Frame Sequencer state ticks our component, a tick is counted immediately. No extra conditions are needed here: if the channel switches off, it switches off.
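The two oddities can be expressed as a handler for writes to NRX4. This is only a sketch of how the description might translate to code: the class is hypothetical, and instead of counting inside NRX1 it keeps the number of remaining ticks in a separate member.

```cpp
#include <cstdint>

class LengthCounter {
public:
    // maxTicks is the longest duration of the channel (64 or 256).
    explicit LengthCounter(int maxTicks) : maxTicks_(maxTicks) {}

    // Loads the number of ticks computed from NRX1.
    void load(int ticks) { remaining_ = ticks; }

    // Regular 256 Hz tick from the Frame Sequencer.
    // Returns true if the channel must be switched off.
    bool tick() {
        if (!enabled_ || remaining_ == 0) return false;
        return --remaining_ == 0;
    }

    // Handles a write to NRX4. `lengthStep` tells whether the current
    // Frame Sequencer state ticks the Length counter.
    // Returns true if the channel must be switched off.
    bool writeNrx4(std::uint8_t value, bool lengthStep) {
        const bool wasEnabled = enabled_;
        const bool restart = (value & 0x80) != 0;
        enabled_ = (value & 0x40) != 0;
        bool channelOff = false;

        // Oddity 1: switching the counter on during a length step
        // counts one tick at once.
        if (!wasEnabled && enabled_ && remaining_ != 0 && lengthStep) {
            if (--remaining_ == 0 && !restart) channelOff = true;
        }
        // Oddity 2: a restart with an expired counter reloads the
        // maximum, with an immediate tick during a length step.
        if (restart && remaining_ == 0) {
            remaining_ = maxTicks_;
            if (enabled_ && lengthStep) --remaining_;
        }
        return channelOff;
    }

    int remaining() const { return remaining_; }

private:
    int maxTicks_;
    int remaining_ = 0;
    bool enabled_ = false;
};
```

The order of the two checks matters: a write that both enables the counter and restarts the channel can first run the counter down to zero and then reload it to the maximum minus one.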

Envelope unit

This component controls the volume of the sound, decreasing or increasing it in constant steps at a given period. Volume here means the amplitude of the generated signal. The component is controlled by the NRX2 register:

Bits	Purpose
7-4	Initial amplitude value
3	Mode: 0 - decrease 1 - increase
2-0	Period

The Frame Sequencer ticks this component at 64 Hz. The period in the NRX2 register is expressed in ticks at this frequency. On each tick the internal period counter advances. When it has counted one period, the amplitude is increased or decreased by one. When a boundary value is reached, the calculations stop. At amplitude 0 the channel is silent, of course, but still active.

The figure above shows how the signal amplitude changes while the Envelope unit is running with NRX2 equal to 0x55. The initial amplitude is loaded when the channel is restarted (trigger); in this case it equals 5. During operation the initial value in the register is no longer used and is not modified. After that, the amplitude decreases by one every period until it reaches zero.

Now for the next oddities. First, those that occur on writes to NRX4:

As with the Sweep unit, a period of 0 means that the period is 8. Again, no calculations take place; the component runs idle.
If the channel is restarted and the next Frame Sequencer state ticks the Envelope unit, the period counter is set to one more than it normally would be.
If the channel is restarted while the initial amplitude is zero and the decrease mode is selected, the channel is disabled immediately.
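The basic operation of the Envelope unit together with the three restart oddities listed above might look like this. It is a sketch with invented names, not the emulator's code, and the oddities related to writes to NRX2 are not covered.

```cpp
#include <cstdint>

class EnvelopeUnit {
public:
    // Restart (trigger). `nextStepTicksEnvelope` tells whether the next
    // Frame Sequencer state ticks the Envelope unit.
    // Returns false if the channel must be disabled immediately.
    bool trigger(std::uint8_t nrx2, bool nextStepTicksEnvelope) {
        nrx2_ = nrx2;
        amplitude_ = nrx2 >> 4;  // initial amplitude, bits 7-4
        counter_ = reloadValue() + (nextStepTicksEnvelope ? 1 : 0);
        return !(amplitude_ == 0 && !increase());
    }

    // Called by the Frame Sequencer at 64 Hz.
    void tick() {
        if (--counter_ > 0) return;
        counter_ = reloadValue();
        if (period() == 0) return;  // period 0 counts as 8, running idle
        if (increase()) {
            if (amplitude_ < 15) ++amplitude_;
        } else {
            if (amplitude_ > 0) --amplitude_;
        }
    }

    int amplitude() const { return amplitude_; }

private:
    int period() const { return nrx2_ & 0x07; }
    bool increase() const { return (nrx2_ & 0x08) != 0; }
    int reloadValue() const { return period() == 0 ? 8 : period(); }

    std::uint8_t nrx2_ = 0;
    int amplitude_ = 0;
    int counter_ = 0;
};
```

With NRX2 equal to 0x55 the amplitude starts at 5 and drops by one on every fifth tick until it reaches zero, after which further ticks change nothing.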

Next, those that occur on writes to NRX2:

If the current NRX2 value (the new value has not been written yet) contains a period of zero and the period counter has not finished counting (remember that period 0 means 8), the current amplitude must be increased by one. Otherwise, the current mode is checked: if it is decrease, the amplitude must be increased by 2. Either way, this causes an immediate rise in the amplitude of the signal generated by the channel.
If the mode changes, the amplitude is set to 16 minus the current amplitude.
After all these operations, the amplitude is truncated to its lower 4 bits.

The Envelope unit has the strangest behavior of all, and unfortunately only the oddities mentioned above are documented. The behavior of a real DMG is far more complex, and no emulator can boast an exact implementation of it.

Sound Channel 3

This audio channel generates a wave from the contents of the Wave Pattern RAM, which is located in the I/O port memory area. The Wave Pattern RAM is 16 bytes long and holds 32 samples. Each byte holds 2 samples: the first in the upper 4 bits, the second in the lower 4 bits. The memory contents are played in a loop at the frequency specified in the NR33 and NR34 registers (their structure is identical to that of channels 1 and 2).

Of the modulation components, this channel contains only the Length counter; it works exactly as in the other channels and is controlled by the NR31 register. The signal amplitude is adjusted manually through the NR32 register, which uses only bits 6 and 5. They can take the following values:

00: channel is muted, but active.
01: Wave Pattern RAM is played as is.
10: Wave Pattern RAM is played with each sample pre-shifted 1 bit to the right.
11: Wave Pattern RAM is played with a pre-shift of each sample 2 bits to the right.
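Reading a sample and applying the volume setting takes only a few lines. The function names in this sketch are made up for illustration.

```cpp
#include <array>
#include <cstdint>

// Returns sample `index` (0..31) from the 16-byte Wave Pattern RAM.
// Even samples sit in the upper 4 bits of a byte, odd ones in the lower 4.
inline int waveSample(const std::array<std::uint8_t, 16>& waveRam, int index) {
    const std::uint8_t byte = waveRam[(index >> 1) & 0x0F];
    return (index & 1) ? (byte & 0x0F) : (byte >> 4);
}

// Applies the volume setting from bits 6-5 of NR32 to a 4-bit sample.
inline int applyWaveVolume(int sample, std::uint8_t nr32) {
    switch ((nr32 >> 5) & 0x03) {
        case 0:  return 0;            // muted, but active
        case 1:  return sample;       // played as is
        case 2:  return sample >> 1;  // shifted right by 1 bit
        default: return sample >> 2;  // shifted right by 2 bits
    }
}
```

For a first byte of 0xA5, sample 0 is 0xA and sample 1 is 0x5; with NR32 bits 6-5 set to 10, sample 0 is played as 0x5.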

In addition, bit 7 of the NR30 register holds a flag that permits sound playback (the other bits are unused). If it is 0, playback is prohibited; otherwise it is allowed. It is important to understand that this flag does not correspond one-to-one to this channel's status bit in the NR52 register. If bit 7 of NR30 is set, sound may be played, yet the channel's status bit in NR52 can remain zero, and no sound will be output: playback is allowed but not started. Clearing the flag switches the channel off, which clears the status bit in NR52.

The timer in the Wave generator has a period of 2 * (2048 - X), where X is the frequency value from the NR33 and NR34 registers. Here the frequency is not the frequency of the sound but the rate at which the next sample is read from the Wave Pattern RAM.

Among other things, channel 3 contains a pointer to the current sample and a sample buffer; both must be present in the emulator for accurate emulation. On each timer tick the pointer moves to a new position and the current byte is copied to the sample buffer (obviously, the byte is the same for every two ticks). Then come the next oddities:

When the channel is restarted, the pointer is reset to zero, but the first byte of the Wave Pattern RAM is not copied to the sample buffer; that happens only on the next timer tick. This means that the first sample played comes from the sample buffer, which still holds the old byte, and samples from the Wave Pattern RAM follow only after it.
While channel 3 is playing, access to the Wave Pattern RAM is restricted: reads return 0xFF. The exception is an access made at the moment channel 3 itself reads a sample; such an access succeeds and goes to the byte of the Wave Pattern RAM that channel 3 is currently reading.
Restarting channel 3 while it is reading a sample corrupts the first four bytes of the Wave Pattern RAM. If the current sample pointer is within the first four bytes, the first byte of the Wave Pattern RAM is overwritten with the contents of the sample buffer. If the pointer is anywhere else, all of the first four bytes are overwritten with the four-byte block (4-7, 8-11, 12-15) that contains the pointer. For example, if the pointer is at byte 10, the first four bytes are overwritten with bytes 8-11.

The first item is trivial. The rest are not so easy to implement, especially since the Internet says nothing about the finer points of their implementation. Their implementation in CookieBoy is the result of almost random experiments with the timer counter that moves the current sample pointer. Here is what I managed to dig up.

So. The key to implementing the last two items is understanding what happens when the channel is restarted (trigger) through the NR34 register. Obviously, the timer counter and the current sample pointer have to be reset. The sample pointer is reset as described in the first item above, which is simple. The counter is not so simple, and this is where the key to the problem lies.

Zeroing the counter when the channel is restarted is the obvious decision, and the wrong one. In fact, the counter is initialized so that there is a delay before the position of the current sample starts being updated. The delay equals the timer period (the formula was given above) plus some constant (most likely no more than 8 cycles), which you will have to find yourself. That is, instead of counting one period and then moving the pointer, the timer counts two periods plus a constant. After that the timer works normally, counting one period at a time.

This is how it works in my emulator. The ClockCounter variable counts clock cycles. It is of a signed type. As soon as it reaches a value equal to the timer period, I move the current sample pointer and reset the counter (by subtracting the period from it). When the channel is restarted through NR34, I set ClockCounter = -Period - 3, where Period is the timer period in cycles according to the formula given earlier and 3 is that magic constant. This produces the required delay and makes it possible to tell at which moments the Wave Pattern RAM can be read or written. If ClockCounter equals 3 at the moment of a read or write to the Wave Pattern RAM, the operation is allowed. Otherwise, we return 0xFF.
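To make the delay easier to picture, here is a rough sketch of a timer built around such a signed counter. It follows the description above but is not taken from the emulator's source: the class name and methods are invented, and only the counter is modelled, without the sample pointer or the sample buffer.

```cpp
class WaveTimer {
public:
    // x is the 11-bit frequency value from NR33 and NR34.
    explicit WaveTimer(int x) : period_(2 * (2048 - x)) {}

    // Restart through NR34: start below zero so that the first pointer
    // move happens after two periods plus the magic constant 3.
    void restart() { clockCounter_ = -period_ - 3; }

    // Advances by a number of clock cycles.
    // Returns how many times the sample pointer has to be moved.
    int advance(int cpuCycles) {
        int moves = 0;
        clockCounter_ += cpuCycles;
        while (clockCounter_ >= period_) {
            clockCounter_ -= period_;
            ++moves;
        }
        return moves;
    }

    // Access to the Wave Pattern RAM is allowed only at this moment;
    // at any other time a read returns 0xFF.
    bool waveRamAccessible() const { return clockCounter_ == 3; }

private:
    int period_;
    int clockCounter_ = 0;
};
```

For X = 2040 the period is 16 cycles, so after a restart the first pointer move happens after 2 * 16 + 3 = 35 cycles, and every following move after 16 cycles.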

Now the sample pointer. On restart I write 1 into it. It is this combination of the delay and the sample pointer value at restart that makes the test ROMs pass. Just do not forget that the second sample played after a restart is the second sample in the Wave Pattern RAM. Because of the delay, the old contents of the sample buffer (see the first oddity) would be played twice, followed by the third sample from the Wave Pattern RAM. This is a peculiarity of my implementation, so as soon as the timer has passed the whole delay after a restart (the counter becomes non-negative), I update the contents of the sample buffer.

Corrupting the first four bytes is straightforward, except that ClockCounter must equal 1 for the start of the Wave Pattern RAM to be corrupted and overwritten.

Do not forget that a restart of the channel is not just any write to NR34. All of the above, and the restart itself, happens only when 1 is written to the high bit of NR34 and the NR30 register allows playback (its high bit is set).

Sound Channel 4

This channel generates noise. A Length counter and an Envelope unit are attached to it, and they behave exactly as in the other channels. The usual registers are reserved for them: NR41 and NR42, respectively. This channel has no frequency in the usual sense: NR43 is used for entirely different purposes, and NR44 contains all the usual flags, but its frequency bits are unused.

The noise generator is based on an LFSR (Linear Feedback Shift Register). This is a pseudo-random bit sequence generator. Its principle of operation is quite simple.

A shift register stores a bit sequence of a certain length (in the DMG the shift register can be 7 or 15 bits long). Certain bits of the shift register are designated as taps; it is thanks to them that the sequence is generated. In the DMG the taps are bits 0 and 1 of the shift register. To run continuously, the LFSR is driven by a clock whose ticks trigger the calculation of the next bit of the pseudo-random sequence.

Initially, the shift register is loaded with any non-zero bit sequence; if all bits were zero, the LFSR would output nothing but zeros. On each tick the following happens:

The taps are summed modulo 2 (XOR operation), and the result is saved for further operations.
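The low bit (bit 0) of the shift register is sent to the output.
The shift register is shifted right by one bit.
The vacated high bit receives the modulo-2 sum saved in the first step.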

The output is a pseudo-random bit sequence. It is pseudo-random because it has a period: from a certain point on, the whole sequence repeats. The period length (T) is given by the following formula:

T = 2 ^N - 1,

where N is the length of the shift register in bits. The period equals the number of distinct states of the shift register, excluding the one in which all bits are zero. Thus, the period is 127 for a 7-bit register and 32767 for a 15-bit one. This raises the question of whether to compute everything honestly or to use pre-generated sequences. The result is identical, since the LFSR is guaranteed to loop. I chose the second approach. The sequences can be found in the files LFSR7.inc and LFSR15.inc.
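For those who prefer to compute the sequence honestly, a right-shifting LFSR with taps at bits 0 and 1 fits in a few lines. This is a sketch with invented function names, and it treats the 7-bit mode as a plain 7-bit register, as the text does. The second function simply counts steps until the register returns to its starting state, which is a handy way to check the period formula.

```cpp
#include <cstdint>

// One LFSR step for a register of `bits` bits (7 or 15).
// Returns the output bit and updates `state` in place.
inline int lfsrStep(std::uint16_t& state, int bits) {
    const int feedback = (state ^ (state >> 1)) & 1;  // taps: bits 0 and 1
    const int out = state & 1;                        // bit 0 goes out
    state = static_cast<std::uint16_t>((state >> 1) |
                                       (feedback << (bits - 1)));
    return out;
}

// Counts the steps until the register returns to its starting state.
inline int lfsrPeriod(int bits) {
    const std::uint16_t start =
        static_cast<std::uint16_t>((1u << bits) - 1);  // all ones, non-zero
    std::uint16_t state = start;
    int steps = 0;
    do {
        lfsrStep(state, bits);
        ++steps;
    } while (state != start);
    return steps;
}
```

The same lfsrStep function can also be used once at startup to fill a lookup table, which gives the pre-generated variant without shipping the sequences as separate files.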

To control the LFSR, the NR43 register is used. Here is its structure:

Bits	Purpose
7-4	Timer frequency shift: 0000: 1/2, 0001: 1/2², 0010: 1/2³, 0011: 1/2⁴ ... 1101: 1/2¹⁴, 1110: not used, 1111: not used
3	Shift register length: 0: 15 bits, 1: 7 bits
2-0	Frequency multiplier: 000: 2, 001: 1, 010: 1/2, 011: 1/3, 100: 1/4, 101: 1/5, 110: 1/6, 111: 1/7

The shift register length needs no explanation. The remaining bits are used to calculate the LFSR clock frequency (F) with the following formula:

F = f * Shift * Ratio,

where f = 524288 Hz (4194304 / 8), Shift is the timer frequency shift (values are in the table above) and Ratio is the frequency multiplier (values are in the table above). If the frequency shift bits are 1110 or 1111, the LFSR receives no ticks, which means that channel 4 is silent.
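The formula maps onto NR43 as shown in the sketch below. The function name is made up, and the base frequency f is passed in as a parameter rather than hard-coded, so the example depends only on the table above.

```cpp
#include <cstdint>

// F = f * Shift * Ratio, with Shift and Ratio decoded from NR43.
// Returns 0 when the shift bits are 1110 or 1111 (no ticks for the LFSR).
inline double noiseClockHz(double f, std::uint8_t nr43) {
    const int shiftCode = nr43 >> 4;    // bits 7-4
    const int ratioCode = nr43 & 0x07;  // bits 2-0
    if (shiftCode >= 14) return 0.0;
    // Shift: 0000 -> 1/2, 0001 -> 1/4, ... 1101 -> 1/2^14
    const double shift = 1.0 / static_cast<double>(1 << (shiftCode + 1));
    // Ratio: 000 -> 2, 001 -> 1, 010 -> 1/2, ... 111 -> 1/7
    const double ratio = ratioCode == 0 ? 2.0 : 1.0 / ratioCode;
    return f * shift * ratio;
}
```

In an emulator driven by clock cycles it is usually more convenient to turn the result into a timer period, in the same way as for the other channels.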

Implementation

To implement sound I chose SDL. This library has an extremely simple API for generating procedural sound: we specify the sound parameters, the sample buffer length and a callback function, and that is all. SDL calls this function automatically, and in it we “feed” SDL the next batch of samples. Once they have been played, the function is called again, and so on. Besides the simple API, another advantage of SDL is that it copes well with extremely small sample buffers, and latency is very important to us.

I will not go into the implementation details of the sound system components themselves. The theoretical part contains everything you need. I will only touch on the problem of synchronization.

The problem is that we now have to keep up not only the screen refresh rate but also the sample generation rate. SDL calls the callback function at regular intervals (although I have not seen any guarantee of this in the documentation) and “expects” us to write a new batch of samples. If the samples are not ready at that moment, the sound breaks up. At the same time, the emulation may run too fast, in which case the extra batches of samples have to be stored somewhere for later playback.

A ring buffer is best suited for storing samples. The emulator writes batches of samples to it, and the callback function takes them as needed. The ring buffer solves several problems at once.

One of them has an interesting side effect: we can completely abandon manual maintenance of the emulation rate (60 Hz). The calls to the callback function provide the necessary delays in the emulation. For this, SDL has condition variables (SDL_cond). With them, a thread goes to sleep and waits for a signal from another thread that it can continue working. In our case the waiting thread is the emulation thread: it waits for another thread (the callback function) to take samples from the ring buffer and thereby free up space for the next batch. When we need the maximum possible emulation speed, we wait for no one and simply write to the ring buffer. Naturally, do not forget about mutexes.
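
The waiting scheme described above can be sketched roughly as follows. To keep the example self-contained, it uses std::mutex and std::condition_variable from the C++ standard library instead of SDL_mutex and SDL_cond. The class name, the 16-bit sample type and the choice to drop the oldest sample in non-waiting mode are all assumptions made for illustration, not Cookieboy's actual code.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical ring buffer shared by the emulation thread and the audio callback.
// The capacity must be greater than zero.
class SampleRingBuffer {
public:
    explicit SampleRingBuffer(std::size_t capacity) : data_(capacity) {}

    // Emulation thread. With wait_for_space == true the call blocks while the
    // buffer is full; this is what paces the emulation.
    void push(std::int16_t sample, bool wait_for_space = true) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (wait_for_space) {
            space_available_.wait(lock, [this] { return size_ < data_.size(); });
        } else if (size_ == data_.size()) {
            // Maximum-speed mode (assumption): overwrite the oldest sample.
            read_ = (read_ + 1) % data_.size();
            --size_;
        }
        data_[write_] = sample;
        write_ = (write_ + 1) % data_.size();
        ++size_;
    }

    // Audio callback. Copies up to `count` samples into `out`, pads the rest
    // with silence and wakes the emulation thread. Returns the number copied.
    std::size_t pop(std::int16_t* out, std::size_t count) {
        std::size_t taken = 0;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            while (taken < count && size_ > 0) {
                out[taken++] = data_[read_];
                read_ = (read_ + 1) % data_.size();
                --size_;
            }
        }
        for (std::size_t i = taken; i < count; ++i) {
            out[i] = 0;
        }
        space_available_.notify_one();
        return taken;
    }

private:
    std::vector<std::int16_t> data_;
    std::size_t read_ = 0;
    std::size_t write_ = 0;
    std::size_t size_ = 0;
    std::mutex mutex_;
    std::condition_variable space_available_;
};
```

In an SDL program, pop would be called from inside the audio callback, and push from the emulation loop each time a sample is produced.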

Everything works so well for one simple reason - the generation of samples occurs at the same pace as the DMG processor.
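
One possible way to tie sample generation to the processor clock is a fractional accumulator, sketched below. The class name and the 44100 Hz output rate used in the usage note are assumptions; only the 4194304 Hz clock comes from the text above.

```cpp
#include <cstdint>

// Hypothetical helper: converts elapsed CPU cycles into the number of
// output samples that are due, without accumulating rounding error.
class SampleClock {
public:
    explicit SampleClock(std::uint32_t sample_rate) : sample_rate_(sample_rate) {}

    // Call after each emulated instruction with the cycles it took.
    std::uint32_t advance(std::uint32_t cpu_cycles) {
        accumulator_ += static_cast<std::uint64_t>(cpu_cycles) * sample_rate_;
        const std::uint32_t due =
            static_cast<std::uint32_t>(accumulator_ / kCpuClockHz);
        accumulator_ %= kCpuClockHz;   // keep the fractional remainder
        return due;
    }

private:
    static constexpr std::uint64_t kCpuClockHz = 4194304;  // DMG clock
    std::uint32_t sample_rate_;
    std::uint64_t accumulator_ = 0;
};
```

With an assumed rate of 44100 Hz, a sample becomes due roughly every 95 cycles, and one emulated second (4194304 cycles) yields exactly 44100 samples.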

Testing

As with the other components, there are test ROMs for sound too. There is only one catch: the test sets for DMG and Gameboy Color are different, and it is worth running them all. The DMG tests should pass without errors, but a real DMG passes the Gameboy Color tests with errors and displays the following:

If you run all the tests at once and they do not stop but loop instead, there is nothing to worry about. This happens when the ROM tries to select a ROM bank that does not exist. If you truncate the bank number, as I do, the tests loop. The same is observed in Gambatte, and it can be trusted.
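
Truncating the bank number could look something like the sketch below. It is only a guess at the idea, with a hypothetical function name; the text does not say exactly how Cookieboy does it.

```cpp
#include <cstdint>

// Hypothetical helper: wraps a requested ROM bank into the range of banks
// that actually exist in the cartridge image.
std::uint32_t effective_rom_bank(std::uint32_t requested_bank,
                                 std::uint32_t bank_count) {
    if (bank_count == 0) {
        return 0;                        // no banks at all, nothing to select
    }
    // For power-of-two bank counts this equals masking with (bank_count - 1).
    return requested_bank % bank_count;
}
```

With this wrapping, a request for a bank beyond the end of the ROM lands on an earlier bank, which would explain why the test suite starts over instead of stopping.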

I recommend that you immediately arm yourself with the source code of the tests and understand how they work. This will speed up the process a lot, and sometimes it is the only way to understand what you need to do. Although the description of the error is displayed on the screen, it is sometimes difficult to understand what exactly is required of you and which registers are involved.

So, the simplest tests are “01-registers” and “11-regs after power”. The first tests reading and writing of the sound registers. I have already mentioned that the unused and unreadable bits of the registers must be taken into account, and this is what the test checks. In addition, it checks the results of turning the sound on and off. The second tests the behavior of the registers when the sound is turned off. The source code describes in more detail what is being tested.

"02-len ctr" tests the behavior of Length counter in boundary conditions. Testing takes place for each channel separately, the process displays the number of the tested channel.

“03-trigger” is another test for the Length counter, but this time its behavior is checked when the NRX4 register is modified. Basically, this is where all the mentioned oddities of the Length counter are tested.

“04-sweep”, “05-sweep details” and “06-overflow on trigger” test the Sweep unit. In addition to normal operation, all the mentioned component oddities are tested here. To pass the “06-overflow on trigger” test, the following should be displayed:

“07-len sweep period sync” tests the correctness of the synchronization of the Sweep unit and Length counter. If the Frame Sequencer is implemented correctly, then there should be no problems with this test.

“08-len ctr during power” tests the behavior of the Length counter when the sound is turned off. To pass the test, the screen should display the following while it runs:

“09-wave read while on” tests read operations while channel 3 is running. To pass the test, the screen should display:

The screen shows a value from Wave Pattern RAM and FF, the latter because reading is blocked. They alternate (00 FF, 11 FF, etc.), but at the very beginning the read operation fails twice and FF FF is output. This is exactly the hardest part to achieve, and it required me to try different values of the constant and the counter that determine the moment when the read operation is allowed.

“10-wave trigger while on” tests damage to the first bytes of the Wave Pattern RAM when channel 3 is restarted while reading a sample. A lot of information is displayed on the screen, and it doesn’t fit completely. Here is the result of the passed test:

Even in this fragment you can see where the corruption of Wave Pattern RAM should occur.

“12-wave write while on” tests write operations to Wave Pattern RAM while channel 3 is running. Too much information is displayed here as well; here is the result of the passed test:

You can see where and what to look for: the test tries to write the value 0xF7.

It is worth saying that passing these tests gave me personally nothing in the games I tested. The sound was fine even without passing them, and judging by other emulators with enviable compatibility that fail all the sound tests, passing them is not necessary at all. Still, it is nice to know that the emulator works like real hardware. If a game depends on the finer points of the hardware, it will work correctly.

What's next

At the moment, our emulator supports all the required DMG features and some that are not strictly required. Naturally, there is still room to develop, namely:

Support for the serial port, or game link.
Support for uncommon MBC controllers — Pocket Camera, Bandai TAMA5, Hudson HUC-3, Hudson HUC-1, vibration cartridges.

Finally, there is one feature of DMG that none of the emulators known to me supports, not even the most accurate ones like Gambatte. It is a hardware bug that causes garbage to be written to the OAM area. This feature is practically undocumented and rather tricky to implement.

The bug occurs when certain processor instructions are executed with certain operand values. This is the easy part. It is harder to emulate the fact that the garbage is not random and has specific content. It is harder still to emulate the fact that the bug occurs only during certain periods of the LCD controller's operation. It is for these details that I found no documentation: there are only test ROMs, which do not really help. With them, you can implement this bug only partially.

There is little point in implementing this bug: it is unlikely that any developer used it deliberately or left it in a released game. The sound bugs related to Wave Pattern RAM, by contrast, do occur in games, so implementing them has practical value.

Conclusion

This brings the series of articles to an end. As I said, the result is an emulator with good compatibility and support for all the most important functions. It is no shame to put it next to other implementations. And thanks to the use of test ROMs for implementation and testing, we can talk about high accuracy of hardware emulation, and not just compatibility with games.

Naturally, the emulator still has room to develop. In addition to the above, there is one more logical direction for its development: Gameboy Color (CGB) emulation. DMG and CGB are not just consoles of the same family, they are almost identical internally. Only a few modules need to be added to the existing emulator.
Currently, Cookieboy does not emulate the CGB, but I plan to do this soon. There will be an article about that too.

If you decide to implement your own emulator, or something in the articles is not clear, feel free to contact me with questions; I will be happy to help. Whether in the comments to the articles or by private message on Habr, it does not matter.

Source: https://habr.com/ru/post/156647/
