Have you ever played a game like StarCraft or Supreme Commander and gotten an error message like “Out of sync”, after which the game closed? Want to know why this happens? It is a consequence of the game engine architecture often used by real-time strategy games 1.
My experience in this area comes from working with the Supreme Commander engine at Gas Powered Games. StarCraft and Warcraft 3 also had synchronization problems during their beta tests, so it is fair to say that in general they work the same way. For simplicity, I will talk about the Supreme Commander engine. Finding the similarities with other games is left as an exercise for the reader :)
Requirements
First, what are the requirements for our game? To give you an idea, here is a video of the first Supreme Commander (2006).
The game must support 8 players in a multiplayer match over the Internet, with hundreds of units in each army. That adds up to several thousand units in a single game. Yikes. The typical client-server approach used by shooters is clearly not suitable here. With that many units, it would require far more bandwidth than most players have available.
So how do you approach the problem?
Synchronous engine architecture
With a fully synchronous, lockstep architecture! In a synchronous engine, each client executes the same code at the same rate. Think about that for a moment. In an 8-player game of Supreme Commander, each player holds the same game state and executes the same code. Instead of transmitting the state of units (position, health, etc.), only the commands entered by players need to be sent over the network 2. If all players have the same game state and process the same input, the resulting state must also be the same.
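The idea can be shown with a toy model. This is a minimal sketch, not SupCom code: the `Sim` class, the one-dimensional integer positions, and the `(unit_id, target)` command format are all assumptions made up for illustration. Two "clients" receive only the command stream and still end up with identical state.

```python
from dataclasses import dataclass, field


@dataclass
class Sim:
    """Toy deterministic simulation (hypothetical): unit id -> integer position."""
    units: dict = field(default_factory=lambda: {1: 0, 2: 10})
    targets: dict = field(default_factory=dict)

    def apply(self, command):
        # A command is just (unit_id, target_position) in this sketch.
        unit_id, target = command
        self.targets[unit_id] = target

    def tick(self):
        # Visit units in sorted order so every client updates them identically.
        for unit_id in sorted(self.units):
            target = self.targets.get(unit_id)
            if target is None:
                continue
            pos = self.units[unit_id]
            if pos < target:
                self.units[unit_id] = pos + 1
            elif pos > target:
                self.units[unit_id] = pos - 1


def run(commands_by_tick, ticks):
    """Run a fresh simulation, feeding it the commands scheduled for each tick."""
    sim = Sim()
    for t in range(ticks):
        for command in commands_by_tick.get(t, []):
            sim.apply(command)
        sim.tick()
    return sim.units


# Only this small command stream crosses the "network", never the unit state.
commands = {0: [(1, 5)], 3: [(2, 7)]}
client_a = run(commands, 10)
client_b = run(commands, 10)
```

Both runs produce the same dictionary because nothing in `tick` depends on anything except the previous state and the commands.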
Game replays, including those in shooters, work on the same principle. Did you ever wonder why replay files are so small? It is because such a file only needs to store user input. We simply start the game, feed it the input from the replay file, and get the same result as in the original game. That is why replay files often stop working when the game is updated 3, and that is why they often cannot be rewound 4. By the way, for the same reason many strategy games do not let new players join a game in progress: to add a new player, the full game state would have to be transferred to them. For a game with three thousand units, that would take too long.
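A replay file under this model can be sketched in a few lines. The JSON layout, the `simulate` function, and the "signed step" command format below are assumptions for illustration only, not the format of any real game.

```python
import json


def simulate(inputs, ticks):
    """Toy deterministic sim (hypothetical): the whole state is one integer."""
    pos = 0
    for t in range(ticks):
        pos += inputs.get(t, 0)  # a command is just a signed step here
    return pos


# "Live" game: 600 ticks is 60 seconds at 10 ticks per second.
live_inputs = {2: 3, 5: -1, 40: 7}
final_live = simulate(live_inputs, 600)

# The replay stores the inputs and the tick count, not the per-tick state.
replay_file = json.dumps({"ticks": 600, "inputs": live_inputs})

# Playback: load the inputs and run the same simulation code again.
loaded = json.loads(replay_file)
inputs = {int(t): step for t, step in loaded["inputs"].items()}  # JSON keys are strings
final_replay = simulate(inputs, loaded["ticks"])
```

The sketch also hints at both limitations mentioned above: if `simulate` changes in a patch, the same inputs no longer lead to the same result, and the only way to reach tick N is to simulate every tick before it.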
Levels
Watch the video at the beginning if you have not done so yet. How many frames per second do you think the game runs at? The correct answer is 10 fps. “Wait, what? It looks much smoother than that!”, you say. Well, yes and no. The game actually runs at two frequencies at once.
The SupCom engine uses two levels: simulation and user. The simulation level runs at a fixed rate of 10 frames per second. It can be considered the “real game”. All units, all AI and all physics are updated in the SimTick function, which runs 10 times per second. Each SimTick must finish in less than 100 ms, otherwise the game runs in slow motion. In a network game, if one player cannot finish a SimTick within 100 ms, all other players are forced to stop and wait for the laggard.
The user level runs at the full frame rate. This level can be considered purely graphical. The user interface, rendering, animation, and even unit positions can be computed at 60 fps. Each UserTick receives a time delta, which is used to interpolate the game state to an intermediate value (for example, intermediate unit positions). That is why the game can look and play smoothly even though the core of the engine runs at a fairly low frequency.
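The interpolation done at the user level might look roughly like this. It is a sketch under assumptions: the function names are invented, positions are one-dimensional, and it blends between the previous and the current simulation state, which is one common approach and not necessarily what SupCom does.

```python
SIM_DT = 0.1  # seconds per SimTick (10 ticks per second, as in the article)


def lerp(a, b, alpha):
    """Linear interpolation between a and b for alpha in [0, 1]."""
    return a + (b - a) * alpha


def render_positions(prev_pos, curr_pos, frame_times, tick_start):
    """Position to draw at each render time between two simulation states."""
    positions = []
    for t in frame_times:
        # Fraction of the current tick that has elapsed, clamped to [0, 1].
        alpha = min(max((t - tick_start) / SIM_DT, 0.0), 1.0)
        positions.append(lerp(prev_pos, curr_pos, alpha))
    return positions


# One simulation tick moves the unit from 0.0 to 6.0; the renderer draws
# seven frames at 60 fps across that tick.
frames = render_positions(0.0, 6.0, [i / 60 for i in range(7)], 0.0)
```

The simulation only ever knows about the two endpoints; every intermediate position exists only on screen.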
Determinism
“Wait a minute!”, the astute reader exclaims. “If every player updates the game state independently, doesn't that mean the simulation has to be completely deterministic?” It does. Is that hard to achieve? Yes. Especially in the modern multi-threaded world.
Floating-point numbers cause engine developers a lot of trouble. Instead of covering this topic in detail, I will link to Glenn Fiedler's fantastic post, Floating Point Determinism.
In the comments on that post, Elijah discusses Supreme Commander. Forcing the processor to strictly follow the IEEE 754 standard solves most of the problems. But this costs performance, and the game cannot perform calculations whose result is unspecified (which it should not be doing anyway).
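One small reason floating point is fragile here: addition is not associative, so the same values summed in a different order can differ in the last bit. The snippet below is a generic illustration, not taken from SupCom; the `accumulate` helper and the `forces` list are made up.

```python
def accumulate(values):
    """Add values strictly left to right, like a plain loop in engine code."""
    total = 0.0
    for v in values:
        total += v
    return total


forces = [0.1, 0.2, 0.3]

# Two clients that visit the same values in a different order.
client_a = accumulate(forces)            # (0.1 + 0.2) + 0.3
client_b = accumulate(reversed(forces))  # (0.3 + 0.2) + 0.1
diverged = client_a != client_b

# With a fixed iteration order, the result is bit-for-bit identical.
same_order = accumulate(forces) == accumulate(list(forces))
```

A difference in the last bit is enough to change a state hash, which is why iteration order, compiler settings and FPU modes all matter in a lockstep engine.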
Internal delays
A synchronous multiplayer game has certain disadvantages. Besides the difficulty of building a huge, fully deterministic simulator, there is a delay in input processing. I already described how each user in a multiplayer game updates the same game state using the same input. This means that any new input can be processed only once all clients agree on which simulation step to process it in!
For example, three players (A, B, and C) are running SimTick [1]. At this point, player A orders a unit to attack. The UI shows a response immediately, because UserTick runs 60 times per second. In a single-player game, this command would be processed in SimTick [2] (a delay of 0-100 ms). However, all three players must process the command in the same SimTick in order to get the same result. Instead of trying to process the command in SimTick [2], player A sends network packets to players B and C with the command scheduled for execution in SimTick [4] (a delay of 200-300 ms). This gives all players time to receive the command. The game may fail if the input is not received or not acknowledged in time. I do not know what mechanism SupCom used for this, but I will update this post if I find out. The specific number of SimTicks after which input is processed is determined dynamically using the p2p topology 5.
The delay from the user's click to the unit's reaction will always be at least the wait for the next SimTick (0-100 ms). This can be disguised in several ways. The interface usually responds immediately: something flashes and a matching sound plays, such as “My life for Aiur!” or “Zug zug”.
In a single-player game this is fine, but in a multiplayer battle the delay becomes noticeable, reaching several hundred milliseconds. I always wanted to experiment with instant-response animation in UserTick. For example, when you issue a move command, the user level starts moving the unit slowly and then blends that movement toward the position dictated by the simulation once the command is actually executed. This could be very useful in twitchier games like DOTA or Demigod. Admittedly, there are edge cases that would have to be handled separately, which is why I never really got around to implementing it. If anyone has done this, post in the comments. :)
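The scheduling in this example can be sketched as follows. Everything here is an assumption for illustration: the `LockstepClient` class is invented, the delay is a fixed 3 ticks (the article says SupCom picks it dynamically), and "sending a packet" is simulated by a direct method call.

```python
from collections import defaultdict

INPUT_DELAY_TICKS = 3  # assumption: fixed here, chosen dynamically in SupCom


class LockstepClient:
    """Hypothetical client that executes commands on an agreed future tick."""

    def __init__(self, name):
        self.name = name
        self.tick = 0
        self.scheduled = defaultdict(list)  # tick -> [(player, command)]
        self.log = []                       # (tick, player, command) as executed

    def issue(self, command, peers):
        # Schedule the command a few ticks ahead and tell every peer about it.
        execute_at = self.tick + INPUT_DELAY_TICKS
        for client in [self, *peers]:
            client.receive(execute_at, self.name, command)
        return execute_at

    def receive(self, execute_at, player, command):
        self.scheduled[execute_at].append((player, command))

    def sim_tick(self):
        # Sort so that commands for the same tick run in the same order everywhere.
        for player, command in sorted(self.scheduled.pop(self.tick, [])):
            self.log.append((self.tick, player, command))
        self.tick += 1


a, b, c = LockstepClient("A"), LockstepClient("B"), LockstepClient("C")
for client in (a, b, c):
    client.sim_tick()                      # everyone is now at SimTick [1]

execute_at = a.issue("attack", peers=[b, c])

for _ in range(5):
    for client in (a, b, c):
        client.sim_tick()
```

All three logs end up identical, with the attack executed on the same tick for everyone. A real implementation also has to deal with packets that arrive late or not at all, which this sketch ignores.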
Out of sync - Bugs from Hell
Out-of-sync bugs are among the hardest bugs in the universe. They are pretty wicked bastards. The main assumption of the engine is that all players are fully synchronized. What if that assumption is wrong? What happens if the simulations of different players diverge? Chaos. Anger. Suffering.
In SupCom, the entire game state is hashed once a second. If any clients' hashes do not match, the show is over. Game over. The end. A window pops up with the message “Sync Error” and you have to leave the game. Something in the SimTick results did not match, and now the game states are different. The simulation paths have diverged and will only drift further apart. There is no recovery mechanism.
Desyncs are usually the result of a programmer error. A desync might reproduce in 5% of games that last more than 60 minutes. Finding and fixing such an error usually involves a binary search through the memory state hashes that are printed to the console during the game. For a desync with a low reproduction rate, this means a huge volume of console output while half a dozen machines run the simulation as fast as possible, waiting for it to break. If that is not enough for you, I will add that one of the most frequent causes of desynchronization is an uninitialized variable.
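A state hash check of this kind could be sketched as below. The state layout `(x, y, hp)`, the use of SHA-256, and the function names are all assumptions; the article does not say which fields or which hash function SupCom uses.

```python
import hashlib
import struct


def hash_state(units):
    """Hash a toy game state: unit id -> (x, y, hp). Layout is hypothetical."""
    digest = hashlib.sha256()
    for unit_id in sorted(units):  # fixed order, so equal states hash equally
        x, y, hp = units[unit_id]
        # Pack the exact bits of the floats, not a rounded text form.
        digest.update(struct.pack("<Iddi", unit_id, x, y, hp))
    return digest.hexdigest()


def check_sync(hashes_by_client):
    """True if every client reported the same hash for this checkpoint."""
    return len(set(hashes_by_client.values())) == 1


state_a = {1: (10.0, 5.0, 100), 2: (3.5, 8.0, 40)}
state_b = dict(state_a)
in_sync = check_sync({"A": hash_state(state_a), "B": hash_state(state_b)})

# One unit ends up a tiny distance away on client B.
state_b[2] = (3.51, 8.0, 40)
still_in_sync = check_sync({"A": hash_state(state_a), "B": hash_state(state_b)})
```

Only the short hash has to be exchanged between clients, so the check costs almost no bandwidth, but it also says nothing about what diverged.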
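One way to read "binary search through the hashes" is a search for the first checkpoint at which two machines disagree. The sketch below assumes exactly that, plus the property stated earlier that diverged states never re-converge, so the logs consist of a matching prefix followed by mismatches. The function name and the log format are invented for illustration.

```python
def first_divergent_tick(hashes_a, hashes_b):
    """Index of the first checkpoint where two hash logs differ, or None.

    Assumes that once the logs differ they keep differing, which is what
    makes a binary search valid here.
    """
    n = min(len(hashes_a), len(hashes_b))
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if hashes_a[mid] == hashes_b[mid]:
            lo = mid + 1   # still in sync at mid, the break comes later
        else:
            hi = mid       # already broken at mid, look earlier
    return lo if lo < n else None


# Made-up logs from two machines that agree for 37 checkpoints.
log_a = [f"h{t}" for t in range(100)]
log_b = log_a[:37] + [f"x{t}" for t in range(37, 100)]
broken_at = first_divergent_tick(log_a, log_b)
clean = first_divergent_tick(log_a, list(log_a))
```

Knowing the first bad checkpoint narrows the hunt in time; the same bisection idea can then be applied to parts of the state by hashing subsystems separately.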
A Demigod Story
Most of my work on the SupCom engine was done while working on Demigod, which used a modified version of the engine.
At the very end of development there was a long-standing, rarely reproducing out-of-sync bug that was assigned to me. In Demigod, crowds of small cannon-fodder units run around the map. In very rare cases, the positions of individual lemmings differed by a few centimeters between machines. It sounds harmless, but in fact it was the flap of a butterfly's wings that starts a hurricane of problems.
I clearly remember not being sure whether I could fix this bug, and our lead programmer saying, “I know you can fix it. I believe in you.” No pressure, right? Every morning we had a ten-minute meeting, and every day my report was short and simple: “hunting for out of sync”. After almost a week of descending into the abyss of madness, I found the cause. If you watched the trailer, you saw that some heroes can throw opponents into the air. When a huge walking castle brings down its hammer, units fly in all directions. The bug was a pointer in the pathfinding system that pointed to nowhere, because of which the unit simply disappeared after landing.
But that alone was not enough to reproduce the desync. First, the unit had to be killed by one of several special skills. This removed it from the pathfinding system and left a dangling pointer. The memory manager moved the deleted component's memory onto a linked list of free blocks without changing its contents. Then, before the unit landed, a new memory allocation had to occur, and that allocation had to reuse the block that had just been freed. Only then did a desync happen. Setting the pointer to null solved the problem.
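The mechanism can be imitated with a toy allocator. This is only an analogy written for illustration: Python has no dangling pointers, so a stale slot index into a hypothetical `Pool` class stands in for the pointer, and the stored tuples stand in for pathfinding data.

```python
class Pool:
    """Toy allocator with a free list, standing in for a native memory manager."""

    def __init__(self, size):
        self.slots = [None] * size
        self.free = list(range(size - 1, -1, -1))  # pop() hands out slot 0 first

    def alloc(self, value):
        index = self.free.pop()
        self.slots[index] = value
        return index

    def release(self, index):
        # As in the story: the block goes on the free list, contents untouched.
        self.free.append(index)


def scenario(allocate_before_landing):
    pool = Pool(4)
    path_node = pool.alloc((12.0, 7.0))  # pathfinding data of the thrown unit
    pool.release(path_node)              # unit dies in mid-air, index is not cleared
    if allocate_before_landing:
        pool.alloc((99.0, 99.0))         # unrelated allocation reuses the freed slot
    return pool.slots[path_node]         # "landing" reads through the stale index


usually = scenario(allocate_before_landing=False)
rarely = scenario(allocate_before_landing=True)
```

As long as nothing reuses the freed slot, the stale read returns the old, still plausible data and the bug stays invisible. It only surfaces when another allocation lands in the same slot in between, which is why the desync was so rare.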
Final thoughts
This was a very brief overview of the synchronous engine used by Supreme Commander. I know that many older games worked in much the same way. The latest generation certainly uses some new tricks, especially to combat input delay. I know that StarCraft 2 also has out-of-sync errors, so it probably works similarly. Other games worth looking at are Heroes of Newerth and League of Legends. They are not as complex as SupCom and play fairly smoothly, but I have not dug into them and do not know what methods they use.
1. Halo uses a synchronous lockstep model in its Campaign Co-op and Firefight modes.
2. In SupCom, input is processed as commands to groups of units: move, attack, defend, use an ability, etc.
3. Old replay files can be supported if it is possible to run the old code with the old data.
4. Rewind in Halo was done using “save points” that store the game state at a given moment. You could not rewind smoothly, but you could jump to the previous save point and play forward from there. IIRC.
5. SupCom makes full use of a peer-to-peer network architecture.