Quake III source code

[Translator's note: a translation of the first part of this article is already available on Habr, but for some reason its author never finished the work.]

Quake III Renderer

The Quake III renderer was an evolutionary development of the hardware-accelerated Quake II renderer. Its classic part is built on the “binary space partitioning” / “potentially visible set” architecture, but two notable new aspects were added:

A shader system built on top of the fixed-function OpenGL 1.X pipeline. This was a great achievement for 1999. It left a lot of room for innovation in an era before today's ubiquitous vertex, geometry, and fragment shaders.
Multi-core architecture support: in the OpenGL client-server model some calls are blocking, and the threading system partially solves this problem.

Architecture

The renderer is entirely contained in renderer.lib, which is statically linked into quake3.exe:

The overall architecture follows Quake Classic: it uses the famous BSP / PVS / lightmap combination:

Preprocessing:
1. The level designer creates a .map file with QRadiant and saves it.
2. q3bsp.exe slices the map using binary space partitioning (BSP). I wrote about this in the Quake1 renderer review.
3. A portal system is generated from the BSP. I wrote about this in the article on the Doom3 Dmap tool.
4. q3vis.exe uses the portal system to generate a PVS (potentially visible set) for each leaf. Each PVS is compressed and stored in the bsp file, as described in the previous article.
5. The portal system is discarded.
6. q3light.exe calculates the lighting for each polygon in the map and stores the result as lightmap textures in the bsp file.
7. At this point, all precomputed data (PVS and lightmaps) is stored in the .bsp file.
Runtime:
1. The engine loads the map and the bsp.
2. When a frame needs to be rendered:
3. The engine decompresses the PVS for the current leaf and determines what is actually visible.
4. For each polygon, it uses multi-texturing to combine the lightmap with the color texture.
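The two runtime steps can be illustrated with a minimal sketch. It assumes the PVS has already been decompressed into a plain bit vector with one bit per leaf, and that the lightmap is combined with the texture by a per-channel multiply. Both function names are hypothetical and do not come from the engine source.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if `leaf` is marked visible in an
   already-decompressed PVS bit vector (one bit per leaf). */
int leaf_is_visible(const uint8_t *pvs, int leaf)
{
    return (pvs[leaf >> 3] >> (leaf & 7)) & 1;
}

/* Hypothetical helper: modulate one 8-bit texture channel by one 8-bit
   lightmap channel (255 means "fully lit"), rounding to nearest. */
uint8_t modulate_channel(uint8_t texel, uint8_t light)
{
    return (uint8_t)((texel * light + 127) / 255);
}
```

A renderer built this way would skip every leaf whose bit is clear and apply the multiply to each color channel of the visible surfaces. In the engine itself the multiply is done by the graphics hardware through multi-texturing rather than on the CPU.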

The multi-texturing stage and the lightmaps are easy to see when only one or the other is displayed:

Texture drawn by level designer / artists:

Lightmap generated by q3light.exe:

The final result, combined at runtime using multi-texturing:

The rendering architecture was presented by Brian Hook at the Game Developers Conference in 1999. Unfortunately, the video is no longer available in the GDC Vault. [But it is on YouTube.]

Shaders

The shader system is built on top of the fixed-function OpenGL 1.X pipeline and is therefore very expensive. Developers can program vertex modifications and also add texture passes. This is covered in detail in the Quake 3 Shader Bible:

Multicore renderer and SMP (symmetric multiprocessing)

Many people do not know that Quake III Arena shipped with SMP support, enabled through the cvar r_smp. The frontend and backend communicate through a standard producer-consumer scheme. When r_smp is set to 1, the surfaces to draw are stored alternately in a double buffer located in RAM. The frontend (called the Main thread in this example) writes to one of the buffers while the backend (called the Renderer thread in this example) reads from the other.

An example shows how it all works:

t0-t1:

The Main thread decides what to draw and writes the surfaces to SurfaceBuffer1.
There is no data for the Renderer thread yet, so it is blocked.
The GPU thread is also idle.

t1-t2: things start happening everywhere:

The Main thread decides what will be visible in the next frame. It writes the surfaces to SurfaceBuffer2: this is a typical example of double buffering.
Meanwhile, the Renderer thread makes an OpenGL call and waits patiently until the GPU thread has copied everything to a safe place.
The GPU thread reads the surfaces from the location the Renderer thread points to.

Notice that at t2:

The Renderer thread is still passing data to the GPU: SurfaceBuffer1 is in use.
The Main thread has finished writing to SurfaceBuffer2 but cannot start writing to SurfaceBuffer1: it is locked.

This case (the Renderer thread blocking the Main thread) occurs often when playing Quake III.
It demonstrates the limitation imposed by blocking OpenGL API calls.

After t2:

As soon as the Renderer thread is done with SurfaceBuffer1 (t3), it starts pumping surfaces from SurfaceBuffer2.
As soon as it is unblocked (at t3), the Main thread starts working on the next frame, writing to SurfaceBuffer1.
In this configuration, the GPU is almost never idle.
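The buffer alternation in the timeline above can be sketched without any threads. This is a minimal single-threaded model, assuming a two-element buffer array indexed by the low bit of a frame counter. All type and function names are hypothetical, and the blocking that the real engine does between the two threads is only described in a comment.

```c
#include <stddef.h>

#define MAX_SURFACES 4              /* hypothetical, tiny on purpose */

typedef struct {
    int    surfaces[MAX_SURFACES];
    size_t count;
} surface_buffer_t;

typedef struct {
    surface_buffer_t buffers[2];
    unsigned         frame;         /* frame the frontend is currently building */
} smp_state_t;

/* Buffer the frontend (Main thread) fills for the current frame. */
surface_buffer_t *frontend_buffer(smp_state_t *s)
{
    return &s->buffers[s->frame & 1];
}

/* Buffer the backend (Renderer thread) drains: the one filled last frame. */
surface_buffer_t *backend_buffer(smp_state_t *s)
{
    return &s->buffers[(s->frame & 1) ^ 1];
}

/* Append a surface to the frontend buffer; returns 0 when it is full. */
int add_surface(smp_state_t *s, int surface)
{
    surface_buffer_t *b = frontend_buffer(s);
    if (b->count >= MAX_SURFACES)
        return 0;
    b->surfaces[b->count++] = surface;
    return 1;
}

/* Swap roles. In a threaded version the frontend would have to wait here
   until the backend has finished with the buffer it is about to reuse. */
void swap_buffers(smp_state_t *s)
{
    s->frame++;
    frontend_buffer(s)->count = 0;  /* the reused buffer starts empty */
}
```

The wait inside swap_buffers is exactly the point where the Main thread stalls at t2 in the example.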

Note: synchronization is done with Windows Event Objects in win_glimp.c (see the SMP acceleration section at the bottom of the file).

Network model

The Quake3 network model is without a doubt the most elegant part of the engine. At the low level, Quake III still abstracts communication with the NetChannel module that first appeared in QuakeWorld. The most important thing to understand is:

In a fast-paced environment, any information not received on the first transmission is not worth resending, because it will be outdated anyway.

As a result, the engine uses UDP/IP: there is no trace of TCP/IP in the code, because “reliable transmission” introduces unacceptable latency. The network stack has been augmented with two mutually exclusive layers:

Encryption using a previously transmitted key.
Compression using a pre-computed Huffman key.

But the most impressive design is on the server side, where an elegant system minimizes the size of each UDP datagram and compensates for the unreliability of UDP: a history of snapshots is used to generate delta packets via memory introspection.

Architecture

The client side of the network model is quite simple: the client sends commands to the server every frame and receives game state updates. The server side is a bit more complicated, because it must transfer the general state of the game to each client, taking into account the lost UDP packets. This mechanism contains three main elements:

The Master Gamestate is the universal, true state of things. Clients send their commands through NetChannel. These are converted into event_t, which modify the game state when they arrive on the server.
For each client, the server keeps the last 32 game states sent over the network in a circular array: these are called snapshots. The array is cycled using the famous binary mask trick I described in the QuakeWorld network article (elegant solutions).
The server also has an “empty” state in which every field is set to 0. It is used to delta snapshots when there is no “previous state” available.
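The binary mask trick mentioned above can be sketched as follows. Because the history size is a power of two, masking an ever-increasing sequence number with size minus one gives the slot index without a modulo or a branch. The constant, type, and function names here are hypothetical.

```c
#define SNAPSHOT_BACKUP 32                      /* must be a power of two */
#define SNAPSHOT_MASK   (SNAPSHOT_BACKUP - 1)

typedef struct {
    unsigned sequence;     /* number of the message this snapshot went out in */
    int      fields[4];    /* placeholder payload */
} snapshot_t;

typedef struct {
    snapshot_t history[SNAPSHOT_BACKUP];
} client_history_t;

/* Slot used for a given sequence number; old entries are silently
   overwritten once the sequence wraps past SNAPSHOT_BACKUP. */
snapshot_t *snapshot_slot(client_history_t *c, unsigned sequence)
{
    return &c->history[sequence & SNAPSHOT_MASK];
}
```

With this layout, sequence numbers 1 and 33 share a slot, which is why a snapshot that is more than 32 messages old can no longer serve as a delta source.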

When the server decides to send an update to the client, it uses all three elements in order to generate a message, which is then transmitted through NetChannel.

An interesting fact: keeping this many game states for each player consumes a lot of memory: 8 MB for four players in my measurements.

Snapshot system

To understand the snapshot system, here is an example with the following conditions:

The server is sending an update to Client1.
The server is trying to propagate the state of Client2, which has four fields (three integers for position[X], position[Y], position[Z] and one integer for health).
Communication takes place over UDP/IP: such messages are often lost on the Internet.

Server frame 1:

The server has received several updates from each client. They have affected the Master gamestate (green). It is now time to propagate the state to Client1:

To generate a message, the network module ALWAYS does the following:

Copies the Master gamestate into the next slot of the client's history.
Compares it with another snapshot.

This is what we see in the next image.

The Master gamestate is copied to index 0 of Client1's history: it is now called “Snapshot1”.
Since this is the first update, there is no valid snapshot in Client1's history, so the engine uses the empty “Dummy snapshot”, in which all fields are set to zero. This results in a FULL update, because every field is sent to NetChannel.

The most important thing to understand here is that when there is no valid snapshot in the client's history, the engine takes the empty snapshot to generate the delta message. This results in a full update sent to the client in 132 bits (each field is preceded by a bit marker): [1 A_on32bits 1 B_on32bits 1 C_on32bits 1 D_on32bits] .

Server frame 2:
Now let's move a little into the future: this is the second server frame. As we can see, each client has sent commands, and all of them have affected the Master gamestate: Client2 moved along the Y axis, so pos[1] is now equal to E (blue). Client1 has also sent commands but, more importantly, it has acknowledged receipt of the previous update, so Snapshot1 is marked as acknowledged (“ACK”):

The process is the same:

The Master gamestate is copied into the next slot of the client's history (index 1): this is Snapshot2.
This time there is a valid snapshot in the client's history (Snapshot1), so the two snapshots are compared.

As a result, only a partial update is sent over the network (pos[1] = E). This is the beauty of the design: the process is always the same.

Note: since each field is preceded by a bit marker (1 = changed, 0 = not changed), the partial update from the example above takes 36 bits: [0 1 32bitsNewValue 0 0] .
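The bit counts quoted in the two frames can be checked with a small sketch. It assumes the four-field state from the example, one marker bit per field, and 32 value bits for each changed field. The type and function names are hypothetical.

```c
#include <stddef.h>

#define NUM_FIELDS 4
#define FIELD_BITS 32

typedef struct {
    int field[NUM_FIELDS];          /* pos[0], pos[1], pos[2], health */
} state_t;

/* Size in bits of the delta that turns `from` into `to`: one marker bit
   per field plus FIELD_BITS for every field whose value differs.
   If `changed_mask` is not NULL it receives one bit per changed field. */
int delta_bits(const state_t *from, const state_t *to, unsigned *changed_mask)
{
    int bits = 0;
    unsigned mask = 0;

    for (int i = 0; i < NUM_FIELDS; i++) {
        bits += 1;                              /* the marker bit */
        if (from->field[i] != to->field[i]) {
            bits += FIELD_BITS;                 /* the new value */
            mask |= 1u << i;
        }
    }
    if (changed_mask != NULL)
        *changed_mask = mask;
    return bits;
}
```

Comparing an all-zero snapshot with a state whose four fields are all nonzero gives 4 × 33 = 132 bits, and changing only pos[1] afterwards gives 3 + 33 = 36 bits. Note that in this sketch a field that happens to be zero costs only its marker bit even in a "full" update.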

Server frame 3:
Let's take another step forward to see how the system deals with lost packets. We are now in frame 3. Clients keep sending commands to the server.
Client2 has taken damage and its health is now equal to H. But Client1 has not acknowledged the last update. Perhaps the server's UDP packet was lost, or perhaps the client's ACK was lost; either way, Snapshot2 cannot be used.

Despite this, the process remains the same:

The Master gamestate is copied into the next slot of the client's history (index 2): this is Snapshot3.
It is compared with the last valid acknowledged snapshot (Snapshot1).

As a result, the message is a partial update containing a combination of old and new changes (pos[1] = E and health = H). Note that Snapshot1 may be too old to be used. In that case, the engine falls back to the “empty snapshot” again, which results in a full update.

The beauty and elegance of the system lie in its simplicity. A single algorithm automatically:

Generates partial or full updates.
Resends, in a single message, the OLD information that was not received together with the NEW information.
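The choice of what to compare against can be sketched as a single function. This is a simplified model, assuming a 32-entry history and treating any acknowledged snapshot that has been overwritten in the ring as too old. The names and the exact age threshold are assumptions, not the engine's actual rule.

```c
#define HISTORY_SIZE 32     /* assumed size of the per-client snapshot ring */
#define USE_DUMMY    (-1)   /* hypothetical marker: delta against the empty snapshot */

/* Pick the delta source for the message with sequence number `current`.
   `last_acked` is the sequence number of the newest snapshot the client
   has acknowledged, or a negative value if it has acknowledged nothing. */
int choose_baseline(int current, int last_acked)
{
    if (last_acked < 0)
        return USE_DUMMY;               /* first update: send everything */
    if (current - last_acked >= HISTORY_SIZE)
        return USE_DUMMY;               /* slot was overwritten: too old */
    return last_acked;                  /* normal case: partial update */
}
```

In the three frames above this gives the dummy snapshot for frame 1, Snapshot1 for frame 2, and Snapshot1 again for frame 3 because Snapshot2 was never acknowledged.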

Memory introspection in C

You may wonder how Quake3 compares snapshots through introspection, since C has no such feature.

The answer is that the location of each field is described in advance in an array of netField_t, built with a clever preprocessor macro:

  typedef struct {
      char *name;
      int offset;
      int bits;
  } netField_t;

  // ...
  #define NETF(x) #x,(int)&((entityState_t*)0)->x

  netField_t entityStateFields[] = {
      { NETF(pos.trTime), 32 },
      { NETF(pos.trBase[0]), 0 },
      { NETF(pos.trBase[1]), 0 },
      ...
  }

The complete code for this part is in MSG_WriteDeltaEntity from snapshot.c. Quake3 does not even know what it is comparing: it blindly uses the index, offset, and size from entityStateFields and sends the differences over the network.
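The same idea can be written with the standard offsetof macro instead of a null-pointer cast. The sketch below is a simplified stand-in, assuming a four-field struct in which every field is an int. The struct, the table, and the function are hypothetical and are not the engine's definitions.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int pos[3];
    int health;
} player_state_t;                   /* hypothetical stand-in for entityState_t */

typedef struct {
    const char *name;               /* field name, produced by stringizing */
    size_t      offset;             /* byte offset of the field in the struct */
} field_desc_t;

#define FIELD(x) { #x, offsetof(player_state_t, x) }

static const field_desc_t state_fields[] = {
    FIELD(pos[0]), FIELD(pos[1]), FIELD(pos[2]), FIELD(health),
};

/* Compare two states field by field using only the table: the loop knows
   nothing about the struct beyond "an int lives at this offset".
   Stores up to `max` names of changed fields and returns how many. */
int changed_fields(const void *from, const void *to,
                   const char **names_out, int max)
{
    int n = 0;

    for (size_t i = 0; i < sizeof state_fields / sizeof state_fields[0]; i++) {
        int a, b;
        memcpy(&a, (const char *)from + state_fields[i].offset, sizeof a);
        memcpy(&b, (const char *)to + state_fields[i].offset, sizeof b);
        if (a != b && n < max)
            names_out[n++] = state_fields[i].name;
    }
    return n;
}
```

Adding a field to the struct then only requires one more FIELD entry in the table. The comparison loop stays unchanged.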

Pre-fragmentation

Digging into the code, you can see that the NetChannel module slices messages into chunks of 1400 bytes (Netchan_Transmit), even though the maximum size of a UDP datagram is 65,507 bytes. This way the engine avoids having its packets fragmented by routers when travelling over the Internet, because most networks have a maximum transmission unit (MTU) of 1500 bytes. Avoiding router fragmentation is very important because:

On entering the network, the router must hold the packet while it fragments it.
On leaving the network, the problems are even more serious, because all parts of the datagram must be waited for and then reassembled, which takes a lot of time.
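The arithmetic of pre-fragmentation can be sketched as follows, using the 1400-byte figure quoted above. The constant and function names are hypothetical, and the sketch does not model how the receiver is told that the last fragment has arrived.

```c
#include <stddef.h>

#define FRAGMENT_SIZE 1400          /* chunk size quoted in the text */

/* Number of datagrams needed for a message of `message_len` bytes. */
size_t fragment_count(size_t message_len)
{
    if (message_len == 0)
        return 1;                   /* an empty message still needs one datagram */
    return (message_len + FRAGMENT_SIZE - 1) / FRAGMENT_SIZE;
}

/* Payload length of fragment number `index` (0-based), or 0 if the
   message has no such fragment. */
size_t fragment_length(size_t message_len, size_t index)
{
    size_t start = index * FRAGMENT_SIZE;
    size_t rest;

    if (start >= message_len)
        return 0;
    rest = message_len - start;
    return rest < FRAGMENT_SIZE ? rest : FRAGMENT_SIZE;
}
```

Every fragment produced this way stays below a 1500-byte MTU even after UDP and IP headers are added, so no router on the path should need to split it further.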

Messages with reliable and unreliable transmission

Although the snapshot system compensates for UDP datagrams lost in the network, some messages and commands MUST be delivered (for example, when a player quits the game or when the server needs the client to load a new level).

This guarantee is abstracted by the NetChannel module: I wrote about this in a previous post.

Virtual machine

Whereas previous engines entrusted only gameplay to the virtual machine, idTech3 relies on it for much more important tasks. Among other things:

Rendering is driven from the client virtual machine.
The lag compensation mechanism is implemented entirely in the client VM.

Moreover, its design is much more elaborate: it combines the security and portability of the Quake1 virtual machine with the high performance of Quake2's native DLLs. This is achieved by compiling the bytecode to x86 instructions on the fly.

An interesting fact: the virtual machine was originally supposed to be a plain bytecode interpreter, but performance was very poor. The development team therefore wrote a runtime x86 compiler. According to the .plan file of August 16, 1999, this task was completed in one day.

Architecture

The Quake III virtual machine is called QVM. Three of them are loaded at all times:

Client side: two virtual machines are loaded. Depending on the state of the game, messages are sent to one or the other:
- cgame : receives messages during the battle phase. It only performs culling and prediction, and drives renderer.lib .
- q3_ui : receives messages in menu mode. It uses system calls to draw the menus.
Server side:
- game : always receives messages; it runs the game logic and uses bot.lib for the AI.

QVM insides

Before describing how the QVM is used, let's look at how the bytecode is generated. As usual, I prefer to explain with an illustration and a short accompanying text:

quake3.exe and its bytecode interpreter are built with Visual Studio, but the VM bytecode takes a completely different path:

Each .c file (translation unit) is compiled separately with LCC.
LCC is run with a special parameter so that it does not output a PE (Windows Portable Executable) but an intermediate representation: a text assembly for a stack machine. Each generated file consists of text , data and bss sections, with symbol exports and imports.
A special tool from id Software, q3asm.exe , takes all the text assembly files and assembles them together into one .qvm file. It also converts everything from text to binary (for speed, in case the native conversion cannot be used). q3asm.exe also recognizes which methods are system calls.
After loading the binary bytecode, quake3.exe converts it to x86 instructions (this step is optional).

LCC internals

Here is a specific example starting with the function that we need to run in the virtual machine:

  extern int variableA;
  int variableB;
  int variableC=0;

  int fooFunction(char* string){
      return variableA + strlen(string);
  }

For the translation unit module.c , lcc.exe is called with a special flag so that it does not generate a Windows PE object and instead outputs an intermediate representation. This is the LCC .obj output file corresponding to the C function above:

  data
  export variableC
  align 4
  LABELV variableC
  byte 4 0
  export fooFunction
  code
  proc fooFunction 4 4
  ADDRFP4 0
  INDIRP4
  ARGP4
  ADDRLP4 0
  ADDRGP4 strlen
  CALLI4
  ASGNI4
  ARGP4 variableA
  INDIRI4
  ADDRLP4 0
  INDIRI4
  ADDI4
  RETI4
  LABELV $1
  endproc fooFunction 4 4
  import strlen
  bss
  export variableB
  align 4
  LABELV variableB
  skip 4
  import variableA

A few notes:

The bytecode is divided into sections ( text , data and bss ): we can clearly see bss (uninitialized variables), data (initialized variables), and code (usually called text ).
Functions are delimited by a proc / endproc sandwich.
The LCC intermediate representation targets a stack machine: all operations are performed on the stack and no assumptions are made about CPU registers.
At the end of the LCC phase, we have a group of files importing and exporting variables and functions.
Each statement starts with the operation type (for example ARGP4 , ADDRGP4 , CALLI4 ...). Every parameter and result is passed on the stack.
Imports and exports are there so the assembler can "link" the translation units together. Note the import strlen : since neither q3asm.exe nor the VM interpreter links against the standard C library, strlen is treated as a system call and must be provided by the virtual machine.
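To make the "stack machine" idea concrete, here is a toy interpreter in which every operand and result lives on a stack and no registers are assumed. The three opcodes are invented for this sketch and are much simpler than the LCC operations listed above.

```c
#include <stddef.h>

typedef enum { OP_PUSH, OP_ADD, OP_RET } opcode_t;     /* hypothetical opcodes */

typedef struct {
    opcode_t op;
    int      operand;       /* only used by OP_PUSH */
} instr_t;

#define STACK_SIZE 16

/* Run `count` instructions and return the value on top of the stack when
   OP_RET is reached. Returns 0 on stack overflow, stack underflow, or if
   the program ends without OP_RET. */
int run(const instr_t *code, size_t count)
{
    int    stack[STACK_SIZE];
    size_t sp = 0;

    for (size_t i = 0; i < count; i++) {
        switch (code[i].op) {
        case OP_PUSH:
            if (sp == STACK_SIZE)
                return 0;
            stack[sp++] = code[i].operand;
            break;
        case OP_ADD:                    /* pop two values, push their sum */
            if (sp < 2)
                return 0;
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_RET:
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
    return 0;
}
```

A program such as PUSH a, PUSH b, ADD, RET mirrors the shape of the listing above, where two values are loaded, ADDI4 combines them, and RETI4 returns the result.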

Such a text file is generated for each .c file in the VM module.

q3asm.exe internals

q3asm.exe takes the text files of LCC intermediate representation and assembles them together into a .qvm file:

A few things to notice here:

q3asm resolves each of the imported and exported symbols across the text files.
Some methods are predefined in a system call text file. You can see the syscalls for the client VM and for the server VM. System call symbols are assigned negative integer values so that the interpreter can recognize them.
q3asm converts the representation from text to binary in order to gain space and speed, but nothing more: no optimizations are performed here.
The first method assembled must be vmMain , because it is the message dispatcher. In addition, it must be located at 0x2D in the text segment of the bytecode.

QVM: how it works

Again, a drawing showing the unique entry point and the unique exit point, which act as dispatchers:

Some details:

Messages (Quake3 -> VM) are sent to the virtual machine as follows:

Any part of Quake3 can call VM_Call( vm_t *vm, int callnum, ... ) .
VM_Call accepts up to 11 parameters and writes each 4-byte value into the VM bytecode ( vm_t *vm ) from 0x00 to 0x26.
VM_Call writes the message id at 0x2A.
The interpreter starts interpreting opcodes at 0x2D (where q3asm.exe placed vmMain ).
vmMain acts as a dispatcher and routes the message to the appropriate bytecode method.
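The single entry point described in the steps above can be sketched as one function that switches on the message id. The message ids, the state variables, and the reduced two-argument signature are all hypothetical and chosen only to keep the example short.

```c
enum { CMD_INIT, CMD_DRAW_FRAME, CMD_SHUTDOWN };    /* hypothetical message ids */

int vm_initialized;
int vm_last_frame_time;

/* Single entry point of the module: every message from the engine comes
   through here and is routed to the matching handler. Returns -1 for an
   unknown message, 0 if the message was refused, 1 on success. */
int vm_main(int command, int arg0)
{
    switch (command) {
    case CMD_INIT:
        vm_initialized = 1;
        vm_last_frame_time = 0;
        return 1;
    case CMD_DRAW_FRAME:
        if (!vm_initialized)
            return 0;
        vm_last_frame_time = arg0;      /* arg0: time of the frame to draw */
        return 1;
    case CMD_SHUTDOWN:
        vm_initialized = 0;
        return 1;
    default:
        return -1;
    }
}
```

Having one fixed entry point is what lets the engine start execution at a known address without knowing anything else about the module's layout.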

The list of messages sent to the client VM and the server VM is at the end of each file.

System calls (VM -> Quake3) are performed as follows:

The interpreter executes the VM opcodes one after another ( VM_CallInterpreted ).
When it encounters a CALLI4 opcode, it checks the integer index of the method.
If the value is negative, the call is a system call.
The system call function pointer ( int (*systemCall)( int *parms ) ) is invoked with the parameters.
The function that systemCall points to acts as a dispatcher and routes the system call to the right part of quake3.exe .
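The negative-index convention can be sketched as follows. The function pointer type matches the signature quoted above, but everything else is an assumption made for illustration: the two fake system calls, the helper name, and the mapping of target -1 to system call number 0.

```c
typedef int (*system_call_t)(int *parms);   /* signature quoted in the text */

/* Hypothetical host-side dispatcher: parms[0] is the system call number,
   the remaining entries are its arguments. */
int host_system_call(int *parms)
{
    switch (parms[0]) {
    case 0:  return parms[1] + parms[2];    /* pretend call 0 adds two ints */
    case 1:  return parms[1] * 2;           /* pretend call 1 doubles an int */
    default: return -1;                     /* unknown system call */
    }
}

/* What an interpreter might do on a call instruction: a negative target is
   a system call routed to the host, anything else is a bytecode address.
   Returns 1 and stores the result if a system call was made, else 0. */
int try_system_call(int target, int arg0, int arg1,
                    system_call_t system_call, int *result)
{
    int parms[3];

    if (target >= 0)
        return 0;                   /* ordinary bytecode function */
    parms[0] = -1 - target;         /* assumed mapping: -1 -> 0, -2 -> 1, ... */
    parms[1] = arg0;
    parms[2] = arg1;
    *result = system_call(parms);
    return 1;
}
```

Because bytecode addresses are never negative, the sign of the call target is enough to tell the two kinds of call apart without any extra opcode.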

The list of system calls available to the client VM and the server VM is at the beginning of each file.

An interesting fact: parameters are always very simple types, either primitives (char, int, float) or pointers to primitive types (char *, int[]). I suspect this was done to minimize struct layout mismatches between Visual Studio and LCC.

An interesting fact: the Quake3 VM performs no dynamic linking, so QVM mod developers had no access to any library, not even the standard C library (strlen and memset are available, but they are actually system calls). Some managed to emulate them with a preallocated buffer: Malloc in QVM .

Unprecedented freedom

By moving functionality into the virtual machine, the modding community gained far more possibilities. In Neil Toronto's Unlagged, the prediction system was rewritten using "backward reconciliation".

Performance problem and its solution

Because of such a long toolchain, developing VM code was difficult:

The toolchain was slow.
The toolchain was not integrated into Visual Studio.
Building a QVM required command-line tools, which hampered the development process.
With so many elements in the toolchain, it was hard to identify which part was responsible for an error.

For this reason idTech3 could also load native DLLs for the VM parts, and this solved all of these problems:

Overall, the VM system was very flexible, because a virtual machine could execute:

Interpreted bytecode
Bytecode compiled to x86 instructions
Code compiled as a native Windows DLL

Artificial Intelligence

The modding community wrote bots for all previous idTech engines. Two of these systems were quite famous at the time:

For Quake1 there was Omicron .
For Quake2 there was Gladiator.

But for idTech3 the bot system was a fundamental part of the gameplay, so it had to be developed in-house and ship with the game. Serious problems arose during its development, however:

Source : page 275 of the book “Masters of Doom”:

[The quoted passage is missing from this copy of the article.]

Architecture

In the end, Jean-Paul van Waveren (Mr.Elusive) took over the bots, which is funny because he was the one who wrote Omicron and Gladiator. This explains why part of the server bot code is split out into a separate project, bot.lib:

I could write about it, but Jean-Paul van Waveren himself wrote a 103-page paper with a detailed explanation. Moreover, Alex J. Champandard produced an overview of the bot system code that describes the location of each module mentioned in van Waveren's paper. These two documents are sufficient for understanding the Quake3 AI.

Source: https://habr.com/ru/post/330818/
