
F1 2015: a new level of realism on the PC

F1 2015 is the latest game in the Formula One series. It was developed by Codemasters on a special version of the studio's EGO game engine. The game was built almost entirely from scratch, and the new engine significantly improves both image quality and the capabilities of the artificial intelligence.

The new EGO is the first Codemasters engine aimed at the eighth generation of game consoles (PS4 and Xbox One) and the PC. Its architecture was designed for the multi-core processors used in those consoles. The company also wanted a scalable platform that would make effective use of the resources of modern computers.


In November 2015 a patch was released that added an updated audio subsystem to EGO and improved its particle system, which is processed on the CPU. As a result, the game can make full use of the best gaming PCs.

In this article we look at how Codemasters improved its game engine and what the studio achieved.


F1 2015

Adapting to PC hardware


Modern PC processors span, to a first approximation, a wide range of clock frequencies and core counts. One ingredient of a game's success is its ability to adapt to the varying hardware of the systems it runs on.

Games developed when computers typically had processors with one or two cores were designed around a limited number of threads. The same was true of games for earlier console generations.

The main threads were dedicated to rendering and game logic, while smaller tasks used whatever system resources remained. On the PC, this design favored processors with very high single-threaded performance.

Things have changed. The eighth-generation consoles, PS4 and Xbox One, both have eight-core processors. Each core is a power-efficient unit with relatively low performance. A game uses six or seven of these cores; the rest are reserved for the operating system.

The same is true of modern PCs, which ship with processors of two to eight cores. If a core supports simultaneous multithreading, such as Intel Hyper-Threading (Intel HT), the OS sees it as two logical processors.

RedBull 2016 car

This means that a PC game can run on up to 16 logical processors, some of whose resources go to the operating system. The old approach therefore does not suit new hardware: console code can be optimized for one specific configuration, but PC games must adapt to whatever resources are available.
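
As a rough illustration of adapting to the available hardware, the sketch below counts logical processors and sizes a worker pool from that number. It is not Codemasters' code: the function names and the `reserved_for_os` parameter are hypothetical, and the only real API used is Python's `os.cpu_count()`.

```python
import os


def logical_processors(physical_cores, smt_enabled):
    """A core with SMT (e.g. Intel HT) appears to the OS as two logical processors."""
    return physical_cores * (2 if smt_enabled else 1)


def plan_worker_threads(logical_cpus=None, reserved_for_os=1):
    """Return how many worker threads to start.

    reserved_for_os is a hypothetical tuning knob, not a value from the game.
    """
    if logical_cpus is None:
        # os.cpu_count() reports logical processors, so SMT cores count twice.
        logical_cpus = os.cpu_count() or 1
    return max(1, logical_cpus - reserved_for_os)


# An eight-core processor with Intel HT exposes 16 logical processors.
print(logical_processors(8, smt_enabled=True))
print(plan_worker_threads(16))
```

A console build could hard-code these numbers; a PC build has to query them at startup.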

The first tests on the PC


As the July 10, 2015 release date approached, it became clear that the new engine used multi-core processors effectively and that its performance profile differed markedly from earlier F1 titles, which had been designed for the previous console generation. Below is a comparison of CPU and GPU workloads for F1 2014 (left) and F1 2015 (right), both running at maximum quality settings with an unlocked frame rate.


Analysis of F1 2014 (left) and F1 2015 (right) in GPUView

The images were captured with Microsoft GPUView, a tool included in the Windows Performance Toolkit, which is part of the Windows Platform SDK. GPUView shows how well work is distributed between the CPU and the GPU, and it can also be used to find synchronization problems and GPU underutilization. The underutilization is visible at the top of the image; an enlarged view of that area is shown below.


GPU load in F1 2014 and F1 2015

The gaps in the purple line at the upper left show interruptions in GPU work in F1 2014: during them the graphics card is idle and does nothing useful. In other words, on the high-end systems studied, the performance of F1 2014 is not limited by the GPU. At 1080p, the game was limited only by CPU performance.


Ferrari 2016 car

The corresponding line on the right (blue) has no gaps, which means that in F1 2015 the GPU is always busy. In the figure showing the complete picture of system load, the lines below the yellow separator show the load on the CPU threads. In F1 2014 (left), one thread is busy almost constantly, and it limits performance. Take a closer look at this part of the image.


Load distribution on threads

The colors of the bars represent the system's logical processors. A thread that changes color is being executed on different logical processors at different times within the same frame.

Another heavily loaded thread can be seen in the lower part of the left image, for F1 2014. This thread processes graphics driver commands. In the image on the right, for F1 2015, every thread has some idle time, that is, there are no noticeable dependencies between threads.

Combining the CPU and GPU analysis shows that F1 2014 was limited by a critical path through the engine: the code that prepares data for the graphics API, which saturates a single thread. F1 2014 is thus a classic example of an engine designed for previous-generation consoles, with a workload tuned for fast dual-core processors and some gains on quad-core. Adding more cores brings no tangible benefit, because the limiting factor is the single main rendering thread.

The new engine, by contrast, significantly reduces the CPU overhead of preparing data for the graphics card, takes full advantage of Intel HT, and distributes the load evenly across logical processors.


Analysis of F1 2015 in GPUView at 60 FPS

The studio's programmers quickly found that PCs were much faster than the consoles for which most of the optimization had been done. Reaching the usual target of 60 FPS, which leaves about 16 ms to compute each frame, took a lot of effort on consoles. When the game was tested on 4th-generation Intel Core processors (for example, the Intel Core i5-4670), the CPU finished all the necessary work in less time than the frame budget, in many cases in under 10 ms.
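
The frame budget arithmetic can be made explicit with a small sketch. The helper names are hypothetical; the 60 FPS target and the 10 ms CPU time are the figures quoted above.

```python
def frame_budget_ms(target_fps):
    """Time available to produce one frame at the given frame rate."""
    return 1000.0 / target_fps


def cpu_idle_fraction(cpu_frame_ms, target_fps):
    """Share of the frame budget during which the CPU has nothing left to do."""
    budget = frame_budget_ms(target_fps)
    return max(0.0, 1.0 - cpu_frame_ms / budget)


budget = frame_budget_ms(60)        # about 16.7 ms per frame
idle = cpu_idle_fraction(10.0, 60)  # about 0.4 of the budget is unused
print(f"budget: {budget:.1f} ms, idle share: {idle:.0%}")
```

With the frame rate capped at 60 FPS, a CPU that needs 10 ms per frame leaves roughly 40% of each frame's budget free for other work.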

The figure above shows a GPUView analysis of the game with the frame rate capped at a 60 Hz refresh rate. Faster processors, such as the 6th generation Intel Core i7-6900K, make the imbalance even more noticeable.

Here is the analysis of the game with an unlocked frame rate on a system with a 5th-generation Intel Core i7-5960X and an NVIDIA GTX 980 graphics card.


A game with an unlocked frame rate running on an Intel Core i7-5960x with 8 cores (16 threads) in GPUView

Compared with the quad-core processor (the very first figure, which compares F1 2014 and F1 2015), on a system with an Intel Core i7-5960X and its 16 logical processors the CPU is idle for a considerable part of the time needed to prepare a frame.
All this means that the new engine benefits not only from higher single-threaded performance but, more importantly, from additional processor cores.

The result shows that an Intel Core i7-5960X with six cores enabled can outperform the quad-core i7-6700K, even though the latter has a higher clock frequency.

These initial tests, which showed that the GPU was already fully loaded, changed the optimization priorities on the PC. GPU-related improvements continued, but instead of spending developer time on optimizing the CPU side of the rendering subsystem (for example, by transferring part of the GPU's work), the studio began to explore other ways of using the CPU to improve the player's experience and the game's performance.

Improved realism


The developers faced a difficult and interesting task: to make the game more realistic and more attractive without affecting gameplay (because of the requirements of network multiplayer) and without adding much work for the GPU. For this reason, changes to the artificial intelligence and more accurate game physics were ruled out immediately. Such improvements would have to run on any PC that meets the game's minimum requirements, so that players on different hardware are not put on an unequal footing in multiplayer.


Mercedes-Benz F1 W07 Hybrid car, presented in 2016

Even in single-player mode, any change in the behavior of the cars is difficult, because it requires very careful rebalancing of the car handling. Instead, Codemasters focused on two ways of increasing realism: better audio, using the updated audio subsystem, and more dynamic visual elements, using the updated particle system. These subsystems were chosen because they already ran on the CPU and had previously been limited by the capabilities of the consoles. The PC architecture let the developers improve immersion with techniques that had initially been abandoned because of hardware limitations.

Sound playback


Improving the game's audio was the first direction of optimization. On the one hand, it scales with the available hardware; on the other, it improves the experience without affecting gameplay.
Audio in F1 2015 is implemented through a middleware layer that creates its own thread for mixing sound on the CPU.

Codemasters had found earlier that if this thread is overloaded or its execution is delayed, the result is audible dropouts. To prevent dropouts on consoles, a separate core was dedicated to the audio thread, which guaranteed uninterrupted audio processing.

On the PC, a separate logical processor is set aside for the audio subsystem, and other game tasks use the remaining resources. On consoles such threads are pinned to a processor core; on PCs the game calls SetThreadIdealProcessor(), which gives the operating system a hint about where to schedule the thread.
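
For illustration only, here is how the Win32 call SetThreadIdealProcessor() could be invoked from Python through ctypes. The game itself is not written in Python, and the wrapper name `hint_ideal_processor` is hypothetical. The call is a scheduling hint rather than a hard affinity mask, so the OS may still run the thread elsewhere. On non-Windows systems the sketch does nothing.

```python
import ctypes
import sys


def hint_ideal_processor(logical_index):
    """Ask Windows to prefer one logical processor for the calling thread.

    Returns the previous ideal processor, or None if the hint was not
    applied (including on non-Windows systems, where this is a no-op).
    """
    if sys.platform != "win32":
        return None
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentThread.restype = wintypes.HANDLE
    kernel32.SetThreadIdealProcessor.argtypes = [wintypes.HANDLE, wintypes.DWORD]
    kernel32.SetThreadIdealProcessor.restype = wintypes.DWORD
    previous = kernel32.SetThreadIdealProcessor(
        kernel32.GetCurrentThread(), logical_index
    )
    # The API returns (DWORD)-1 on failure.
    return None if previous == 0xFFFFFFFF else previous


# Example: suggest logical processor 0 for the current thread.
previous = hint_ideal_processor(0)
```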

Even with a dedicated core or logical processor, the mixing has to keep up in real time. The maximum number of sound sources was therefore limited to what can be processed in the worst case, for example during a crash. Initially the limit was five cars plus the player's car.
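
A cap like this can be sketched as a simple selection step before mixing. The source does not say how the five cars were chosen; picking the nearest ones is an assumption made here for illustration, and `SoundSource` and `select_audible_cars` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SoundSource:
    name: str
    distance_m: float  # distance from the listener (hypothetical field)


MAX_OPPONENT_CARS = 5  # the initial limit quoted in the text


def select_audible_cars(player, opponents, limit=MAX_OPPONENT_CARS):
    """Keep the player's car plus the nearest opponents, up to the limit."""
    nearest = sorted(opponents, key=lambda s: s.distance_m)[:limit]
    return [player] + nearest


player = SoundSource("player", 0.0)
grid = [SoundSource(f"car{i}", 10.0 * i) for i in range(1, 9)]
mixed = select_audible_cars(player, grid)
print([s.name for s in mixed])
```

Raising `limit` on faster hardware is the kind of scaling such a cap allows.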

A PC has a much more powerful processor to handle mixing, and potentially many more cores, so the operating system is unlikely to schedule additional work on the logical processor assigned to the audio thread. This made it possible to add higher-quality sound to the PC version of the game.


Particle system and weather simulation


Another improvement in the game was the updated particle system. The previous version was limited by the CPU and GPU resources available on the consoles, and the visuals were built around those limits. The particle system already ran on the CPU. Andrew Wright, a graphics programmer at Codemasters, explains:

"We assumed (correctly, I must say) that we would be limited more by GPU resources than by the CPU, so the particle system was always designed with efficient parallel processing on the CPU in mind."

This meant that the programmers could work on improvements that scale with the amount of available CPU resources, and could do so without necessarily increasing the load on the GPU. Running the particle system on the CPU has other advantages, as Andrew describes:
"A CPU-based system is very flexible. It handles collisions for individual particle types designed for this, mainly things like gravel and grass. It also works for particles that are not visible, so a rapid camera movement will reveal previously unseen particles partway through their motion. Collisions can trigger sound effects. This part of the work is very difficult to implement on the GPU."

The first part of the task was to increase the number of particles while making only small, almost imperceptible changes to their appearance. Specifically, particle size was reduced and particle density was increased. From a distance the visible effect is the same, but up close the picture becomes much more detailed. This was done for the various effects created by the tires' interaction with the surface, both on the asphalt and off it. Here is what it looks like.


Improved effect on gravel outside the track

The tire smoke effect was improved in a similar way. Relatively large particles were replaced by much smaller ones, which convey the shape of a smoke cloud better. This also improved the volumetric lighting used for smoke, because smaller particles allow a better mathematical representation of how light passes through a smoke-filled volume. Here is what it looks like.


Improved tire look

Although the particle system improvements described above are significant, they are visible only briefly, for example when the player or an AI driver loses control and the car leaves the track.

The developers quickly realized that the updated particle system could also greatly improve the simulation of weather, and rain in particular. Weather does not depend on whether the player can stay on the track, so the improved effects are visible throughout the race and not only at special moments.

One of the major improvements in the game, and one that Codemasters emphasizes in its promotional materials, is car handling. In particular, the behavior of cars on a wet track is significantly better. Driving in the rain is an important part of real Formula 1 racing, and many circuits are known for extreme weather that influences the race. The table shows the probability of precipitation during a race in the Champion and Pro game modes. On average, it rains in 34% of in-game races, for at least part of a race that can last up to four hours.

Chance of precipitation

Track                           Chance
Australia (Melbourne)           26%
Malaysia (Kuala Lumpur)         55%
Bahrain (Sakhir)                0%
China (Shanghai)                48%
Spain (Catalonia)               36%
Monaco (Monte Carlo)            38%
Canada (Montreal)               53%
Japan (Suzuka)                  33%
Russia (Sochi)                  34%
United States (Austin)          24%
Austria (Red Bull Ring)         43%
United Kingdom (Silverstone)    31%
Germany (Hockenheim)            37%
Hungary (Budapest)              39%
Belgium (Spa-Francorchamps)     47%
Italy (Monza)                   26%
Singapore                       39%
Brazil (Sao Paulo)              40%
Abu Dhabi (Yas Marina)          3%
Mexico                          28%

Moving to smaller, more densely packed particles means that the behavior of water droplets can be modeled much more accurately. Over each particle's lifetime, the particle system applies nonlinear changes to properties such as color, transparency, surface roughness, angular drag, linear drag, and the effect of gravity. All of this is also coordinated with the wind simulation that is used to simulate rain.
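
One common way to express property changes over a particle's lifetime is a keyframed curve sampled at the particle's normalized age. The sketch below uses piecewise-linear keyframes; this representation, the function name `sample_curve`, and the example opacity values are assumptions for illustration, not details of the EGO engine.

```python
from bisect import bisect_right


def sample_curve(keys, t):
    """Sample [(time, value), ...] keyframes at normalized age t.

    keys must be sorted by time; t is clamped to the covered range.
    """
    t = min(max(t, keys[0][0]), keys[-1][0])
    times = [k[0] for k in keys]
    i = bisect_right(times, t)
    if i >= len(keys):
        return keys[-1][1]
    (t0, v0), (t1, v1) = keys[i - 1], keys[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical opacity curve: quick fade-in, slow fade-out.
OPACITY = [(0.0, 0.0), (0.1, 1.0), (1.0, 0.0)]

for age in (0.0, 0.05, 0.1, 0.55, 1.0):
    print(age, round(sample_curve(OPACITY, age), 3))
```

The same sampler could drive any of the listed properties, with one curve per property.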

McLaren MP4-31 car, 2016

The debug images below show the behavior of fine water spray from the car's wheels and its interaction with the air flowing over the bodywork. The result is a spiral vortex trailing from the rear of the car.


Vortex debug image showing the motion of water spray


Vortex debug image, rear view

Smaller particles also mean that the existing lighting model works much better with water droplets. Light from sources such as certain parts of the cars can be modeled far more accurately on small particles than on large ones that each stand in for a large volume of droplets. The picture below shows the lighting of water spray trails.


Lighting improvement

Another area of improvement is the interaction of the car with water on the road surface. This includes the vortices of spray behind the car, which lift water from the ground, and the splashes from the wheels on a rain-soaked road. All of this strengthens the visual connection between the car and the road, much as shadows do on a sunny day.

Here is what driving in the wet looks like before and after the improvements.


Water interaction on the road

And here are the water droplets that form the trail behind a passing car, in the game (left) and when debugging in the Intel INDE Graphics Performance Analyzer for DirectX.


Trail of water droplets

The last improvement was a better rain simulation. Originally the game used a simple GPU-based algorithm that drew several thousand rain particles in each frame and applied gravity to each drop to make it fall. In the new rain simulation most of the subsystem moved to the CPU, which lets it use the airflow data that other parts of the game already rely on. The number of raindrops was increased tenfold, and individual drops were made smaller so that the pixel coverage stayed about the same. The figure below shows debug images of the rain taken from the same position. The bottom image has 217 thousand drops; the top one, from the previous version of the game, has only 21 thousand. Despite the much larger number of primitives, the number of pixels touched rose only from 72 thousand to 119 thousand.
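
A CPU-side rain update of the kind described here could look roughly like the sketch below: each drop is accelerated by gravity and dragged toward the local wind velocity. The integration scheme, the `drag` coefficient, and the `wind_at` callback are assumptions for illustration; the source does not describe the actual formulas.

```python
GRAVITY = (0.0, -9.81, 0.0)  # m/s^2, y axis pointing up


def step_raindrops(positions, velocities, wind_at, dt, drag=2.0):
    """Advance every drop by one time step, in place.

    wind_at(position) returns the local air velocity; it stands in for the
    game's airflow data and is a hypothetical interface.
    """
    for i, (p, v) in enumerate(zip(positions, velocities)):
        wind = wind_at(p)
        # Gravity plus a drag term that pulls the drop toward the air velocity.
        acc = tuple(g + drag * (w - vi) for g, w, vi in zip(GRAVITY, wind, v))
        v = tuple(vi + a * dt for vi, a in zip(v, acc))
        positions[i] = tuple(pi + vi * dt for pi, vi in zip(p, v))
        velocities[i] = v


drops = [(0.0, 10.0, 0.0)]
speeds = [(0.0, 0.0, 0.0)]
step_raindrops(drops, speeds, wind_at=lambda p: (5.0, 0.0, 0.0), dt=0.1)
print(drops[0], speeds[0])
```

Because the update for one drop does not depend on any other drop, the loop can be split across threads.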


Debug rain simulation

Having increased the number of raindrops and mist particles, the developers tuned particle transparency so that the resulting mist, seen from a distance, looks the same as before. This is necessary so that graphics settings do not affect gameplay.

CPU load balancing in new conditions


In F1 2015 the graphics card was usually the limiting factor for PC performance, and the load was especially high on budget hardware. The most important part of developing and implementing the new effects was therefore making sure that the extra particles did not increase the load on the GPU. The second important factor was distributing the CPU-side computation well. The work had to be balanced properly across the available CPU resources, and the load on the sequence of engine operations responsible for rendering could not be allowed to grow.

The load on the GPU is kept under control in two ways.

The first is that vertex processing was moved from the vertex shader to the CPU, which reduced the per-vertex work done on the graphics card. This did not remove the GPU load entirely, but it cut it in half. As a result, processing ten times as many particles meant only a fivefold increase in the GPU cost of vertex operations.

The second significant change was reducing the fill cost, for example when rendering rain. The number of drops increased tenfold, but the number of pixels they touch increased only 1.65 times. For the vortices of water spray that trail behind the car, the change was even more significant: the number of vertices rose from 3 thousand to 70 thousand, yet the number of processed pixels fell from 5,800 thousand to 2,500 thousand, so the pixel fill rate needed for the effect was more than halved. As a result, an effect that now uses about 20 times more particles did not seriously increase the load on the graphics card.
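
The figures quoted for the two approaches can be checked with a few lines of arithmetic. The variable names are only labels for the numbers given in the text.

```python
def ratio(after, before):
    """How many times larger (or smaller) the new value is."""
    return after / before


# Vertex work: ten times the particles at half the per-vertex GPU cost.
vertex_cost_growth = 10 * 0.5

# Rain: drops and touched pixels, old version versus new version.
rain_drop_growth = ratio(217_000, 21_000)
rain_pixel_growth = ratio(119_000, 72_000)

# Spray behind the car: vertices and processed pixels.
spray_vertex_growth = ratio(70_000, 3_000)
spray_pixel_growth = ratio(2_500_000, 5_800_000)

print(f"vertex cost x{vertex_cost_growth:.1f}")
print(f"rain drops x{rain_drop_growth:.1f}, pixels x{rain_pixel_growth:.2f}")
print(f"spray vertices x{spray_vertex_growth:.1f}, pixels x{spray_pixel_growth:.2f}")
```

The rain pixel ratio comes out at about 1.65, matching the text, and the spray effect touches less than half as many pixels as before despite using more than 20 times the vertices.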

CPU load was balanced by spreading the particle calculations across as many logical processors as possible. The figure below shows the load distribution on systems with four and six cores, both with Intel HT enabled. The purple and red blocks show the particle system and weather simulation during very heavy rain on a six-core processor (12 logical processors). On that processor, 6 of the 12 logical processors work on the particle system. On the quad-core processor, five of the eight do, but they have to share their resources with other engine tasks.


CPU load balancing for a 6-core system (left) and a 4-core system (right)

The engine used in F1 2015 distributes load with a task-based system, with presets for 2, 4, 8, 12 and 16 logical processors. The scheduler tries to arrange tasks so as to minimize dependencies and avoid overloading any one processor.
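
The following sketch shows the general shape of such a scheme: choose a preset from the number of logical processors, split the particle work into independent batches, and hand the batches to a pool. Only the preset values come from the text. How the engine actually maps hardware to a preset is not described, so `pick_preset` is an assumption, as are all other names here. Python threads also do not run CPU-bound work in parallel, so this illustrates the structure and not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor

PRESETS = (2, 4, 8, 12, 16)  # logical processor counts named in the text


def pick_preset(logical_cpus):
    """Largest preset that does not exceed the available logical processors."""
    fitting = [p for p in PRESETS if p <= logical_cpus]
    return fitting[-1] if fitting else PRESETS[0]


def split_into_batches(items, batch_count):
    """Split items into batch_count contiguous batches of near-equal size."""
    size, extra = divmod(len(items), batch_count)
    batches, start = [], 0
    for i in range(batch_count):
        end = start + size + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


def update_particles(particles, workers, update_one):
    """Run update_one over all particles as independent batch tasks."""
    batches = split_into_batches(particles, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda batch: [update_one(p) for p in batch], batches)
        return [p for batch in results for p in batch]


workers = pick_preset(6)
print(workers, update_particles(list(range(10)), workers, lambda x: x * 2))
```

Keeping the batches independent of one another is what lets a scheduler place them freely without creating dependencies between tasks.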

How the look of the game has changed


The figure below compares the original particle system (left) with the significantly improved one (right). Note that in the second case the water spray is tied more clearly to the car that lifts it into the air.


Water vortices raised by the cars

The figure below shows the improved spray that cars throw up from the road when they lose traction. At high quality settings, individual drops are visible against the F1 2015 logo. The improved interaction between the tires and the track can also be seen.


Skid

Performance scaling


F1 2015 has a built-in benchmark that uses the current graphics settings and can be run in various weather conditions. The figure shows the results of running it on a system with an Intel Core i7-5960X locked at 3.0 GHz (its base frequency) and an NVIDIA Titan X graphics card, together with the settings used. The values are the average frames per second over a complete benchmark run.


Performance scaling on multi-core systems

The number of physical processor cores was set in the BIOS, and the other system parameters were left unchanged. Tests were run with and without Intel HT.


Test settings

On a system with only two cores enabled and Intel HT turned off, the game did not start, because that configuration does not meet the minimum system requirements. In general, all else being equal, enabling Intel HT improved performance about as much as adding two more cores. Going from two cores with Intel HT to four cores with Intel HT raised the frame rate by 79%. Moving to the 6-core configuration added another 17%. The 8-core configuration with Intel HT enabled was 27% faster than the 4-core configuration with Intel HT. These results should not be read as absolute values: they reflect testing on one specific system.
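
To compare the configurations on one scale, the percentages can be chained from a common baseline. The sketch below normalizes the two-core result to 1.0 and assumes that the 17% gain for six cores is measured against the four-core result; the text does not state this explicitly.

```python
def apply_gain(base, percent):
    """Value after a relative increase of the given percentage."""
    return base * (1.0 + percent / 100.0)


# All configurations with Intel HT enabled; two cores is the baseline.
fps_2c = 1.0
fps_4c = apply_gain(fps_2c, 79)  # 1.79
fps_6c = apply_gain(fps_4c, 17)  # about 2.09 (assumed relative to 4 cores)
fps_8c = apply_gain(fps_4c, 27)  # about 2.27

gain_6c_to_8c = (fps_8c / fps_6c - 1.0) * 100.0
print(f"4C x{fps_4c:.2f}, 6C x{fps_6c:.2f}, 8C x{fps_8c:.2f}")
print(f"6C to 8C: +{gain_6c_to_8c:.1f}%")
```

Under that assumption, the step from six to eight cores is worth roughly 8.5%, which shows diminishing but still positive returns from additional cores.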


20 , . 4- . . , Intel Core i7-5960x , 4- , , .

Findings


F1 2015 – , , . CPU GPU . , , Codemasters F1 2015. , – . .

Intel HT , Codemasters . .

Source: https://habr.com/ru/post/283244/

