Creating World of Tanks Blitz on our own DAVA engine

Prologue

This story began more than three years ago. Our small company, DAVA, became part of Wargaming, and we started thinking about which projects to take on next. To give you a sense of what the mobile market looked like three years ago: there was no Clash of Clans, no Puzzle & Dragons, and many of today's best-known projects did not exist yet. Mid-core was only just emerging, and the market was several times smaller than it is today.

Initially, everyone thought it would be a very good idea to make several small games that would attract new users to the big “Tanks”. After a series of experiments, it turned out that this does not work. Despite excellent conversion rates inside mobile applications, the move from a mobile phone to a PC turned out to be a chasm that users would not cross.
At that time we had a few games in development. One of them had the working title "Sniper". The main gameplay idea was shooting in sniper mode from a tank holding a defensive position at other tanks, which were controlled by AI and could return fire.

At some point we decided that a stationary tank was very boring, and within a week we made a multiplayer prototype in which tanks could already drive around and attack each other.

That is how it all started!

When we started developing Sniper, we looked at the technologies then available for mobile platforms. Unity was still at a fairly early stage of its development and, in effect, did not offer the technology we needed.

The main thing we lacked was terrain rendering with dynamic level of detail, which is vital for a game with open spaces. There were several third-party libraries for Unity, but their quality left much to be desired.

We also understood that with C# we would not be able to squeeze the most out of the devices we were developing for, and that we would always be limited.
Unreal Engine 3 did not fit either, for a number of similar reasons.

As a result, we decided to extend our own engine!

At that time it was already used in our earlier casual projects. The engine had a fairly well-written low-level platform layer and supported iOS, PC, and Mac, and work on Android had begun. A lot of functionality had been written for creating 2D games: there was a good UI and plenty of tools for working with 2D. The 3D part was only taking its first steps, since one of our games was fully three-dimensional.

What we had in the 3D part of the engine:

The simplest scene graph.
The ability to draw static meshes.
The ability to draw animated meshes with skeletal animation.
Exporting objects and animations from the Collada format.

In general, compared with the functionality of a serious modern engine, there was very little in it.

Beginning of work

It all started with proving that terrain could be rendered on mobile devices at all: at that time, these were the iPhone 4 and iPad 1.

After several days of work we had a fully functional dynamic terrain that ran quite acceptably, required about 8 MB of memory, and delivered 60 fps on these devices. After that, we started full-scale development of the game.

About half a year passed, and the small mini-project turned into what Blitz is now. Completely new requirements appeared: MMO, AAA quality, and other demands that the engine in its original form could not meet. But work was in full swing. The game ran, and ran reasonably well. However, performance was mediocre, there were few objects on the maps, and there were many other limitations.

At this stage, we began to understand that the foundation we had laid in the engine would not withstand the pressure of a real project.

How it all worked at that time

All scene rendering was based on a simple scene graph concept.

There were two main classes:

Scene - the scene container, inside which all operations on the scene took place.
SceneNode - the base class of a scene node, from which all classes that lived in the scene inherited:
MeshInstanceNode - a class for drawing meshes.
LodNode - a class for switching LODs.
SwitchNode - a class for switching between object states.
About 15 more classes derived from SceneNode.

The SceneNode class let you override a set of virtual methods to implement custom functionality.
The main functions that could be overridden were:

Update - a function called for each node in order to update the scene.
Draw - a function called for each node in order to draw that node.

The main problems we encountered.

First, performance:

When the number of nodes in a level reached 5000, simply walking through all the empty Update functions took about 3 ms.
A similar amount of time was spent on empty nodes that did not need Draw.
A huge number of cache misses, since the work always touched different types of data.
No way to parallelize the work across multiple cores.

Second, unpredictability:

Changing code in the base classes affected the entire system; that is, every change to SceneNode::Update could break anything, anywhere. Dependencies became more and more tangled, and every change inside the engine was almost guaranteed to require testing all the related functionality.
It was impossible to make a local change, for example in transformations, without affecting the rest of the scene. Very often, the slightest change in LodNode (the node for switching LODs) broke something in the game.

First steps to improve the situation

To begin with, we decided to treat the performance problems, and to do it quickly.

We did this by adding a NEED_UPDATE flag to each node. It determined whether Update should be called for that node. This really did improve performance, but it created a whole bunch of problems. The Update function looked like this:

void SceneNode::Update(float timeElapsed)
{
    if (!(flags & NEED_UPDATE))
        return;

    // the rest of the update function
    // process children
}

This won back some of the performance, but many logic problems appeared where we did not expect them.
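One consequence is visible in the snippet itself: the early return also skips the "process children" step, so a node whose own flag is set is never reached when one of its ancestors has the flag cleared. The sketch below demonstrates just that effect. The `Node` struct, its `updateCount` field and the `AddChild` helper are hypothetical stand-ins and not the engine's real classes, and it is an assumption that this particular pattern was behind the failures described here.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for SceneNode; only the flag logic matters here.
constexpr std::uint32_t NEED_UPDATE = 1u << 0;

struct Node
{
    std::uint32_t flags = 0;
    int updateCount = 0; // how many times the update body actually ran
    std::vector<std::unique_ptr<Node>> children;

    Node* AddChild(std::uint32_t childFlags)
    {
        children.push_back(std::make_unique<Node>());
        children.back()->flags = childFlags;
        return children.back().get();
    }

    void Update(float timeElapsed)
    {
        if (!(flags & NEED_UPDATE))
            return; // early out: the children below are skipped as well

        ++updateCount;
        for (auto& child : children)
            child->Update(timeElapsed);
    }
};

// Builds root -> group (flag cleared) -> lod (flag set), runs one update,
// and returns how many times the flagged grandchild was updated.
int UpdatesReachingFlaggedGrandchild()
{
    Node root;
    root.flags = NEED_UPDATE;
    Node* group = root.AddChild(0);           // e.g. a static grouping node
    Node* lod = group->AddChild(NEED_UPDATE); // wants updates, but sits under 'group'
    root.Update(0.016f);
    return lod->updateCount;
}
```

Calling `UpdatesReachingFlaggedGrandchild()` returns 0: the node that asked for updates never receives one, and nothing in its own code hints at why.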

LodNode and SwitchNode, the nodes responsible for switching LODs (by distance) and switching objects (for example, destroyed and intact), began to break regularly.

Every so often, whoever was fixing a breakage did the following: switched off the NEED_UPDATE check in the base class (it was, after all, the simple solution), and the FPS quietly dropped again.

When the code checking the NEED_UPDATE flag had been commented out for the third time, we decided to make radical changes. We understood that we could not do everything at once, so we decided to act in stages.

The very first step was to lay down an architecture that would let us solve all of our problems in the future.

Goals

Minimal dependencies between independent subsystems.
Changes in transformations should not break the LOD system, and vice versa.
The ability to spread the code across multiple cores.
No Update functions or the like, in which heterogeneous, independent code is executed.
Easy extension of the system with new functionality without full retesting of the old.
Changes in one subsystem do not affect the others.
Maximum independence of subsystems.
The ability to lay data out linearly in memory for maximum performance.

The main goal of the first stage was to redesign the architecture so that all of these goals could be met.

Combining the component and data-driven approach

The solution to this problem was a component approach combined with a data-driven approach. From here on I will use the English term data-driven as is, because I have not found a good translation for it.

In general, different people understand the component approach very differently. The same goes for data-driven.

In my understanding, the component approach means that the required functionality is built from independent components. The simplest example is electronics. There are chips, and each chip has inputs and outputs. If the chips are compatible, they can be connected. The entire electronics industry is built on this approach. There are thousands of different components, and by connecting them you can get completely different things.

The main advantages of this approach are that each component is isolated and largely independent. I am leaving aside the fact that a component can be fed incorrect data and the board will burn out. The benefit is obvious: today you can take a huge number of ready-made chips and assemble a new device.

What is data-driven? In my understanding, it is an approach to software design in which data, rather than logic, is taken as the basis of the program flow.

In our example, imagine the following class hierarchy:

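class SceneNode
{
    // Transformation hierarchy data
    Matrix4 localTransform;
    Matrix4 worldTransform;

    virtual void Update();
    virtual void Draw();

    Vector<SceneNode*> children;
};

class LodNode : public SceneNode
{
    // Data specific to LOD calculation
    LodDistance lods[4];

    virtual void Update(); // overridden Update: when the LOD switches, enables or disables some of the children
    virtual void Draw();   // draws only the currently active LOD
};

class MeshNode : public SceneNode
{
    RenderMesh * mesh;

    virtual void Draw(); // draws the mesh
};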

The code that traverses this hierarchy looks like this:

Main Loop:
    rootNode->Update();
    rootNode->Draw();

In this C++ inheritance hierarchy, we have three different, independent data streams:

Transformations
LODs
Meshes

Nodes merely tie them together into a hierarchy, but it is important to understand that each data stream is better processed sequentially. Processing along the hierarchy is actually needed only for transformations.

Let's imagine what this should look like in a data-driven approach. I will write in pseudocode so that the idea is clear:

// Transform Data Loop:
for (each localTransform in localTransformArray)
{
    worldTransform = parent->worldTransform * localTransform;
}

// Lod Data Loop:
for (each lod in lodArray)
{
    // calculate lod distance and find nearest lod
    nearestRenderObject = GetNearestRenderObject(lod);
    renderObjectIndex = GetLodObjectRenderObjectIndex(lod);
    renderObjectArray[renderObjectIndex] = nearestRenderObject;
}

// Mesh Render Data Loop:
for (each renderObject in renderObjectArray)
{
    RenderMesh(renderObject);
}

In effect, we unrolled the program's work loops, and did it so that everything starts from the data.

Data is the key element of a program in the data-driven approach. Logic is merely the mechanism that processes the data.
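The three loops from the pseudocode can also be written as a small compilable sketch. It is heavily simplified and every name in it is hypothetical: a transform is reduced to a single float offset instead of a Matrix4, parents are assumed to be stored before their children so that one forward pass is enough, and "rendering" only counts objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat layout: one set of arrays per data stream.
struct TransformData
{
    std::vector<int> parent;        // index of the parent, -1 for roots; parents come first
    std::vector<float> localOffset; // stand-in for a local Matrix4
    std::vector<float> worldOffset; // stand-in for a world Matrix4
};

struct LodData
{
    std::vector<std::size_t> transformIndex; // which transform the LOD object uses
    std::vector<float> switchDistance;       // beyond this distance use the low-detail mesh
    std::vector<int> activeMesh;             // output: 0 = high detail, 1 = low detail
};

// Transform loop: world = parent's world combined with local.
void ProcessTransforms(TransformData& t)
{
    for (std::size_t i = 0; i < t.localOffset.size(); ++i)
    {
        const float parentWorld =
            (t.parent[i] >= 0) ? t.worldOffset[static_cast<std::size_t>(t.parent[i])] : 0.0f;
        t.worldOffset[i] = parentWorld + t.localOffset[i];
    }
}

// LOD loop: pick a mesh per object from its distance to the camera.
void ProcessLods(const TransformData& t, LodData& l, float cameraPosition)
{
    for (std::size_t i = 0; i < l.transformIndex.size(); ++i)
    {
        float distance = t.worldOffset[l.transformIndex[i]] - cameraPosition;
        if (distance < 0.0f)
            distance = -distance;
        l.activeMesh[i] = (distance > l.switchDistance[i]) ? 1 : 0;
    }
}

// Render loop stand-in: counts the objects that would be drawn in low detail.
int CountLowDetail(const LodData& l)
{
    int count = 0;
    for (int mesh : l.activeMesh)
        count += (mesh == 1) ? 1 : 0;
    return count;
}
```

Each function walks one homogeneous array from start to end, which is what makes this layout friendly to the cache and easy to split between cores.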

New architecture

At some point it became clear that we had to move towards an Entity-based approach to organizing the scene, where an Entity is an object made up of many independent components. I wanted the components to be completely arbitrary and easy to combine with each other.

While reading up on this topic, I came across the T-Machine blog.

It answered many of my questions, and the main answer was the following:

• An Entity contains no logic; it is just an ID (or a pointer).
• An Entity knows only the IDs of (or pointers to) the components that belong to it.
• A Component is only data; that is, a component contains no logic.
• A System is code that can process a specific data set and produce another data set as output.
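As a minimal sketch, the four rules could look like this in C++. All the types here are hypothetical and come neither from Artemis nor from our engine; for brevity the component storage is keyed by entity ID instead of the entity holding a list of component IDs.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// An entity is just an ID and carries no logic.
using EntityId = std::uint32_t;

// Components are plain data with no behaviour.
struct Position { float x = 0.0f; };
struct Velocity { float dx = 0.0f; };

// Simplification: components are looked up by entity ID,
// so the ID is the only link between an entity and its data.
struct World
{
    std::unordered_map<EntityId, Position> positions;
    std::unordered_map<EntityId, Velocity> velocities;
};

// A system is code that reads one data set and writes another.
// This one moves every entity that has both a Position and a Velocity.
void MovementSystem(World& world, float dt)
{
    for (auto& entry : world.velocities)
    {
        auto it = world.positions.find(entry.first);
        if (it == world.positions.end())
            continue; // entity lacks a required component, skip it
        it->second.x += entry.second.dx * dt;
    }
}
```

Adding new behaviour means adding a new component type and a new system function; neither `Position` nor `MovementSystem` has to change.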

Once I understood this, while studying the topic further I came across the Artemis Framework and saw a good implementation of this approach.
The sources are here, in case the previous link does not work: Artemis Original Java Source Code

If you develop in Java, I highly recommend taking a look at it. It is a very simple and conceptually correct framework. By now it has been ported to a bunch of languages.

What Artemis implements is today called ECS (Entity Component System). There are quite a few ways to organize a scene on the basis of entities, components, and data-driven design, but in the end we arrived at the ECS architecture. It is hard to say how widely accepted the term is, but ECS means that there are the following concepts: Entity, Component, System.

The most important difference from other approaches is the mandatory absence of behavior logic in components and the separation of code into systems.

This point is very important in the “orthodox” component approach. If you break the first principle, many temptations appear. One of the first is to introduce component inheritance.

Despite the flexibility, this usually ends in spaghetti code.

At first it seems that this approach lets you create many components that behave similarly but slightly differently, with common component interfaces. In other words, you can fall into the inheritance trap all over again. Yes, it will be slightly better than classical inheritance, but try not to fall into this trap.

ECS is a cleaner approach, and it solves more problems.

To see an example of how this works in Artemis, you can look here.

I will show by example how this works for us.

The main container class is Entity. This is the class that contains the array of components.

The second class is Component. In our case, this is just data.

Here is a list of components used in our engine today:

enum eType
{
    TRANSFORM_COMPONENT = 0,
    RENDER_COMPONENT,
    LOD_COMPONENT,
    DEBUG_RENDER_COMPONENT,
    SWITCH_COMPONENT,
    CAMERA_COMPONENT,
    LIGHT_COMPONENT,
    PARTICLE_EFFECT_COMPONENT,
    BULLET_COMPONENT,
    UPDATABLE_COMPONENT,
    ANIMATION_COMPONENT,
    COLLISION_COMPONENT, // multiple instances
    PHYSICS_COMPONENT,
    ACTION_COMPONENT, // actions, something simpler than scripts that can influence logic, can be multiple
    SCRIPT_COMPONENT, // multiple instances, not now, it will happen much later.
    USER_COMPONENT,
    SOUND_COMPONENT,
    CUSTOM_PROPERTIES_COMPONENT,
    STATIC_OCCLUSION_COMPONENT,
    STATIC_OCCLUSION_DATA_COMPONENT,
    QUALITY_SETTINGS_COMPONENT, // type as fastname for detecting type of model
    SPEEDTREE_COMPONENT,
    WIND_COMPONENT,
    WAVE_COMPONENT,
    SKELETON_COMPONENT,

    // debug components - note that everything below won't be serialized
    DEBUG_COMPONENTS,
    STATIC_OCCLUSION_DEBUG_DRAW_COMPONENT,

    COMPONENT_COUNT
};

The third class is the SceneSystem:

/**
    \brief This function is called when any entity is registered to the scene.
    It checks whether the entity has all necessary components and whether we need to call AddEntity.
    \param[in] entity entity we've just added
 */
virtual void RegisterEntity(Entity * entity);

/**
    \brief This function is called when any entity is unregistered from the scene.
    It checks whether the entity has all necessary components and whether we need to call RemoveEntity.
    \param[in] entity entity we've just removed
 */
virtual void UnregisterEntity(Entity * entity);

The RegisterEntity and UnregisterEntity functions are called for every system in the scene when we add an Entity to the scene or remove one from it.

/**
    \brief This function is called when any component is registered to the scene.
    It checks whether the entity has all necessary components and whether we need to call AddEntity.
    \param[in] entity entity we added component to.
    \param[in] component component we've just added to entity.
 */
virtual void RegisterComponent(Entity * entity, Component * component);

/**
    \brief This function is called when any component is unregistered from the scene.
    It checks whether the entity has all necessary components and whether we need to call RemoveEntity.
    \param[in] entity entity we removed component from.
    \param[in] component component we've just removed from entity.
 */
virtual void UnregisterComponent(Entity * entity, Component * component);

The RegisterComponent and UnregisterComponent functions are called for every system in the scene when we add a Component to an Entity in the scene or remove one from it.
For convenience, there are two more functions:

/**
    \brief This function is called only when the entity has all required components.
    \param[in] entity entity we want to add.
 */
virtual void AddEntity(Entity * entity);

/**
    \brief This function is called only when the entity had all required components and doesn't have them anymore.
    \param[in] entity entity we want to remove.
 */
virtual void RemoveEntity(Entity * entity);

These functions are called once the entity has the full set of components requested through the SetRequiredComponents function.

For example, we can ask to receive only those entities that have ACTION_COMPONENT and SOUND_COMPONENT. We pass these to SetRequiredComponents and, voilà.
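One way such filtering could work is with a bitmask, sketched below. This is an illustration under assumptions, not the engine's actual code: the `Entity` struct, the mask-based signature of `SetRequiredComponents`, the shortened `eType` enum, and the `ActionSoundSystem` class are all hypothetical. Only the function names RegisterEntity, AddEntity and SetRequiredComponents come from the text above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical subset of the component types, used as bit positions.
enum eType : std::uint32_t { TRANSFORM_COMPONENT = 0, ACTION_COMPONENT, SOUND_COMPONENT };

// Hypothetical entity: only a bitmask of the component types it owns.
struct Entity
{
    std::uint64_t componentMask = 0;
    void AddComponent(eType type) { componentMask |= (std::uint64_t{1} << type); }
};

class SceneSystem
{
public:
    virtual ~SceneSystem() = default;

    // Assumed signature: a bitmask with one bit per required component type.
    void SetRequiredComponents(std::uint64_t mask) { requiredMask = mask; }

    // Called for every entity added to the scene; forwards only matching ones.
    void RegisterEntity(Entity* entity)
    {
        if ((entity->componentMask & requiredMask) == requiredMask)
            AddEntity(entity);
    }

protected:
    virtual void AddEntity(Entity* entity) = 0;

private:
    std::uint64_t requiredMask = 0;
};

// Example system that only wants entities with both an action and a sound.
class ActionSoundSystem : public SceneSystem
{
public:
    ActionSoundSystem()
    {
        SetRequiredComponents((std::uint64_t{1} << ACTION_COMPONENT) |
                              (std::uint64_t{1} << SOUND_COMPONENT));
    }

    std::vector<Entity*> entities;

protected:
    void AddEntity(Entity* entity) override { entities.push_back(entity); }
};
```

With this shape, a concrete system never inspects entities it does not care about: the base class does the check once, at registration time.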

To make it clear how this works, I will list examples of the systems we have:

TransformSystem - a system responsible for the hierarchy of transformations.
SwitchSystem - a system responsible for switching objects that can be in several states, for example destroyed and intact.
LodSystem - a system responsible for switching LODs by distance.
ParticleEffectSystem - a system that updates particle effects.
RenderUpdateSystem - a system that updates render objects from the scene graph.
LightUpdateSystem - a system that updates light sources from the scene graph.
ActionUpdateSystem - a system that updates actions.
SoundUpdateSystem - a system that updates sounds, their position and orientation.
UpdateSystem - a system that calls custom user updates.
StaticOcclusionSystem - a system that applies static occlusion.
StaticOcclusionBuildSystem - a system that builds static occlusion data.
SpeedTreeUpdateSystem - a system that updates SpeedTree trees.
WindSystem - a system that calculates wind.
WaveSystem - a system that calculates oscillations from explosions.
FolliageSystem - a system that calculates vegetation on top of the terrain.

The most important result we achieved is a high degree of decomposition of code responsible for unrelated things. Now all the code that concerns transformations is clearly localized in the TransformSystem::Process function. It is very simple. It is easy to spread across several cores. And most importantly, it is hard to break something in another system by making a logic change in the transformation system.

In almost any system, the code looks like this:

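for (each object in the system's set)
{
    // get the required components
    // perform the work on these components
    // write the results into other components
}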

Systems can be classified according to how they process objects:

All objects in the system must be processed:
- Physics
- Collisions
Only marked objects need to be processed:
- Transformation system
- Action system
- Sound processing system
- Particle processing system
Systems that work with their own specially optimized data structure:
- Static occlusion system
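The second category, processing only marked objects, can be sketched as a system that keeps a list of changed indices. The class below is a hypothetical illustration: the name `MarkedTransformSystem`, the float stand-in for a matrix and the marking scheme are assumptions, not the engine's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical system that processes only objects marked as changed.
class MarkedTransformSystem
{
public:
    explicit MarkedTransformSystem(std::size_t objectCount)
        : local(objectCount, 0.0f), world(objectCount, 0.0f), marked(objectCount, false) {}

    // Changing a local value marks the object; repeated marks are stored once.
    void SetLocal(std::size_t index, float value)
    {
        local[index] = value;
        if (!marked[index])
        {
            marked[index] = true;
            dirtyList.push_back(index);
        }
    }

    // Touches only marked objects and returns how many were processed.
    std::size_t Process()
    {
        const std::size_t processed = dirtyList.size();
        for (std::size_t index : dirtyList)
        {
            world[index] = local[index]; // stand-in for the real matrix math
            marked[index] = false;
        }
        dirtyList.clear();
        return processed;
    }

    float World(std::size_t index) const { return world[index]; }

private:
    std::vector<float> local;
    std::vector<float> world;
    std::vector<bool> marked;
    std::vector<std::size_t> dirtyList;
};
```

With 5000 objects of which two changed, `Process` does two units of work instead of 5000, and unlike a per-node flag in a shared base class, the marking lives entirely inside one system.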

With this approach, besides being very easy to spread object processing across several cores, it is very easy to do things that are quite hard to do in the usual polymorphism paradigm. For example, you can easily choose not to process all LOD switches every frame. If a large open world contains a lot of objects, you can process one third of the objects each frame. And this does not affect other systems.
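Spreading the LOD work over three frames could look roughly like the sketch below. The class name `SlicedLodSystem` and the index-based slicing scheme are assumptions made for illustration; the real distance check is replaced by a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical LOD system that updates one third of its objects each frame.
class SlicedLodSystem
{
public:
    explicit SlicedLodSystem(std::size_t objectCount) : updateCount(objectCount, 0) {}

    void Process()
    {
        constexpr std::size_t kSlices = 3;
        // Objects whose index falls into the current slice are handled this frame.
        for (std::size_t i = frame % kSlices; i < updateCount.size(); i += kSlices)
            ++updateCount[i]; // stand-in for the real distance check and LOD switch
        ++frame;
    }

    int UpdatesOf(std::size_t index) const { return updateCount[index]; }

private:
    std::vector<int> updateCount;
    std::size_t frame = 0;
};
```

After any three consecutive calls to `Process`, every object has been visited exactly once, and no other system needs to know that the LOD work is being sliced.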

Results

We greatly increased the FPS: with the component approach, things became more independent, and we were able to decouple and optimize them separately.
The architecture became simpler and easier to understand.
It became easy to extend the engine almost without breaking neighboring systems.
There are fewer bugs of the “did something to the LODs, broke the switches” kind, and vice versa.
It is now possible to parallelize everything across several cores.
At the moment, we are already working on running all systems on all available cores.

Our engine's code is open source. The engine, in the form in which it is used in World of Tanks Blitz, is fully available online on GitHub.

Accordingly, if you are interested, you can go and look at our implementation in detail.

Keep in mind that everything was written for a real project, so, of course, this is not an academic implementation.

Future plans:

More efficient management of component data, that is, laying the components out linearly in memory to minimize cache misses.
Transition to multitasking in all systems.

All the useful links from the text:

Source: https://habr.com/ru/post/245321/
