OOP is dead, long live OOP

Sources of inspiration

This post was prompted by a recent talk by Aras Pranckevičius, given to junior programmers, that introduces them to new ECS architectures. Aras follows the usual pattern (explained below): he shows examples of terrible OOP code, then demonstrates that the relational model is an excellent alternative (but calls it "ECS" rather than relational). I am not criticizing Aras in any way. I am a big fan of his work and I commend him for an excellent presentation! I chose his presentation over the hundreds of other ECS posts on the Internet because he went to the extra effort of publishing a git repository to study alongside the slides. It contains a small, simple “game” used as an example for comparing different architectural choices. This small project let me demonstrate my points on concrete material, so thank you, Aras!

Aras's slides are available here: http://aras-p.info/texts/files/2018Academy - ECS-DoD.pdf, and the code is on GitHub: https://github.com/aras-p/dod-playground.

I will not analyze the final ECS architecture from the talk (for now?), but will focus on the “bad OOP” code (the straw man) from its beginning. I will show what it would really look like if all the violations of OOD (object-oriented design) principles were properly fixed.
Spoiler: fixing all the OOD violations yields performance improvements similar to Aras's conversion to ECS, and the result also uses less RAM and fewer lines of code than the ECS version!

TL;DR: Before you conclude that OOP sucks and ECS rules, pause and study OOD (to learn how to use OOP properly), and also learn the relational model (to learn how to use ECS properly).

I have long taken part in many ECS discussions on the forum, partly because I don’t think this model deserves to exist as a separate term (spoiler: it is just an ad-hoc version of the relational model), but also because almost every post, presentation, or article promoting the ECS pattern has the following structure:

Show an example of terrible OOP code whose implementation has awful flaws caused by overuse of inheritance (which means that the implementation violates many OOD principles).
Show that composition is a better solution than inheritance (without mentioning that OOD actually teaches the very same lesson).
Show that the relational model is great for games (but call it “ECS”).

This structure infuriates me because: (A) it is a straw-man argument ... it compares apples with oranges (bad code with good code) ... which is unfair, even if unintentional, and is not even needed to demonstrate that the new architecture is good; and, more importantly, (B) it has a side effect: this approach suppresses knowledge and unintentionally discourages readers from exploring half a century of existing research. The relational model was first written about in the 1960s, and it was refined significantly through the 70s and 80s. Beginners often ask questions like “which class should this data go in?”, and in response they are often told something vague like “you just need to gain experience and then you'll know by gut feeling” ... but in the 70s this question was studied extensively and, for the general case, given a formal answer: it is called database normalization. By discarding existing research and presenting ECS as a completely new and modern solution, you hide this knowledge from newcomers.

The foundations of object-oriented programming were laid just as long ago, if not earlier (the style began to be explored in work from the 1950s)! However, it was in the 1990s that object orientation became fashionable, went viral, and very quickly turned into the dominant programming paradigm. There was an explosion in the popularity of many new OO languages, including Java and (the standardized version of) C++. However, because this was driven by hype, everyone needed to know this buzzword to put it on their résumé, but only a few really dug into it. These new languages turned many OO features into keywords (class, virtual, extends, implements), and I believe this is why OO split at that moment into two separate entities living their own lives.

I will call the use of these OO-inspired language features "OOP", and the use of OO-inspired design / architecture techniques "OOD". Everyone picked up OOP very quickly. Schools have OO courses churning out new OOP programmers ... yet knowledge of OOD lags behind.

I believe that code which uses OOP language features but does not follow OOD design principles is not OO code. Most criticism of OOP tears apart example code that is not really OO code at all.

OOP code has a very bad reputation, in part because most OOP code does not follow OOD principles and therefore is not “true” OO code.

Prerequisites

As mentioned above, the 1990s marked the peak of the “OO fashion”, and that was probably when “bad OOP” was at its worst. If you studied OOP back then, you most likely learned about the "four pillars of OOP":

Abstraction
Encapsulation
Polymorphism
Inheritance

I prefer to call them not four pillars but the "four tools of OOP". These are tools that can be used to solve problems. However, it is not enough to learn how a tool works; you also need to know when to use it ... It is irresponsible for teachers to show people a new tool without telling them when it should be used. In the early 2000s there was a pushback against the rampant misuse of these tools, a kind of “second wave” of OOD thinking. It resulted in the SOLID mnemonic, which provides a quick way to assess the strength of a design. It should be noted that this wisdom was in fact already widespread in the 90s, but had not yet received a catchy acronym that cemented it as five core principles ...

Single responsibility principle (SRP). Every class should have only one reason to change. If class “A” has two responsibilities, create classes “B” and “C” to handle each of them separately, and then compose “A” out of “B” and “C”.
Open/closed principle (OCP). Software changes over time (i.e. maintenance is important). Try to put the parts that are likely to change into implementations (i.e. concrete classes), and build interfaces around the parts that are unlikely to change (e.g. abstract base classes).
Liskov substitution principle (LSP). Every implementation of an interface must satisfy the requirements of that interface 100%, i.e. any algorithm that works with the interface must work with every implementation.
Interface segregation principle (ISP). Keep interfaces as small as possible, so that each part of the code “knows” about the smallest possible portion of the codebase and thereby avoids unnecessary dependencies. This advice also pays off in C++, where compile times become huge if you don’t follow it.
Dependency inversion principle (DIP). Instead of two concrete implementations communicating directly (and depending on each other), they can usually be decoupled by formalizing their communication interface as a third class that sits between them. This can be an abstract base class that defines the method calls used between them, or even just a POD struct that defines the data passed between them.
One more principle is not part of the SOLID acronym, but I believe it is very important: the composite reuse principle (CRP). Composition is the right choice by default. Inheritance should be reserved for cases where it is absolutely necessary.
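
To make SRP and CRP a little more concrete, here is a minimal sketch in C++. All of the names (Health, Inventory, Player) are hypothetical and do not come from Aras's code; they only illustrate giving each responsibility its own class and then composing the result rather than inheriting.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Responsibility "B": tracking hit points (hypothetical example class).
class Health {
public:
    explicit Health(int max) : m_current(max) { assert(max > 0); }
    void Damage(int amount) {
        assert(amount >= 0);
        // Clamp at zero so the "never negative" invariant always holds.
        m_current = (amount >= m_current) ? 0 : m_current - amount;
    }
    bool IsAlive() const { return m_current > 0; }
    int Current() const { return m_current; }
private:
    int m_current;
};

// Responsibility "C": holding items (hypothetical example class).
class Inventory {
public:
    void Add(const std::string& item) { m_items.push_back(item); }
    std::size_t Count() const { return m_items.size(); }
private:
    std::vector<std::string> m_items;
};

// "A" is composed out of "B" and "C" instead of inheriting from either.
struct Player {
    Health health{100};
    Inventory inventory;
};
```

Each class here has exactly one reason to change, and Player gets both behaviours through plain member variables, with no base class involved.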

So we get SOLID-C(++).

Below I will refer to these principles by their acronyms: SRP, OCP, LSP, ISP, DIP, CRP ...

A few more comments:

In OOD, the concepts of interfaces and implementations are not tied to any specific OOP keywords. In C++ we often create interfaces with abstract base classes and virtual functions, and implementations then inherit from those base classes ... but that is only one specific way of realizing the idea of an interface. In C++ we can also use PIMPL, opaque pointers, duck typing, typedefs, etc ... You can create an OOD design and then implement it in C, which has no OOP language keywords at all! So when I talk about interfaces, I do not necessarily mean virtual functions; I mean the principle of implementation hiding. Interfaces can be polymorphic, but most often they are not! Polymorphism is very rarely the right tool, whereas interfaces are a fundamental concept for all software.
- As I indicated above, if you create a POD struct that simply stores some data to be passed from one class to another, then that struct is being used as an interface: it is a formal description of the data.
- Even if you just write a single standalone class with public and private parts, everything in the public part is the interface, and everything in the private part is the implementation.
Inheritance actually comes in (at least) two kinds: interface inheritance and implementation inheritance.
- In C++, interface inheritance includes abstract base classes with pure virtual functions, PIMPL, and conditional typedefs. In Java, interface inheritance is expressed with the implements keyword.
- In C++, implementation inheritance occurs whenever a base class contains anything other than pure virtual functions. In Java, implementation inheritance is expressed with the extends keyword.
- OOD has plenty of rules about interface inheritance, but implementation inheritance should usually be treated as a code smell!
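
As a rough illustration of an interface that involves no virtual functions at all, here is a sketch of the opaque-pointer technique mentioned above. The Counter type and its four functions are invented for this example. In a real project the first half would typically live in a header and the second half in a source file; they are shown together only so the sketch compiles on its own.

```cpp
#include <cassert>

// --- Interface: the only part that calling code gets to see. ---
struct Counter;                          // opaque type, layout is hidden
Counter* CounterCreate();
void     CounterIncrement(Counter* c);
int      CounterValue(const Counter* c);
void     CounterDestroy(Counter* c);

// --- Implementation: free to change without touching any caller. ---
struct Counter { int value; };

Counter* CounterCreate()                { return new Counter{0}; }
void     CounterIncrement(Counter* c)   { ++c->value; }
int      CounterValue(const Counter* c) { return c->value; }
void     CounterDestroy(Counter* c)     { delete c; }
```

Callers can only go through the declared functions, so the implementation is hidden without a single class keyword, virtual call, or base class. The same shape works in plain C (with malloc and free in place of new and delete).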

And finally, I should show a few examples of terrible OOP teaching and how it leads to bad code in real life (and to OOP's bad reputation).

When you were taught hierarchies / inheritance, you may have been given a task like this: Suppose you have a university application that contains a catalog of students and staff. You can create a base class Person, and then a class Student and a class Staff that inherit from Person.

No, no, no. Let me stop you right there. The unspoken subtext of the LSP is that class hierarchies and the algorithms that operate on them are symbiotic. They are two halves of a whole program. OOP is an extension of procedural programming, and it is still mainly about those procedures. If we do not know which kinds of algorithms will operate on Students and Staff (and which algorithms will be simplified by polymorphism), then it is completely irresponsible to start designing class hierarchies. You have to know the algorithms and the data first.

When you were taught hierarchies / inheritance, you were probably also given a task like this: Suppose you have a class of shapes, with squares and rectangles as subclasses. Should a square be a rectangle, or a rectangle be a square?

In fact, this is a good example for demonstrating the difference between implementation inheritance and interface inheritance.
- If you use the implementation-inheritance approach, you completely ignore the LSP and think only in practical terms about code reuse, using inheritance as a tool.
  
  From this point of view, the following is completely logical:
```
struct Square { int width; };
struct Rectangle : Square { int height; };
```
  A square has only a width, while a rectangle has a width and a height, so by extending the square with a height member we get a rectangle!
  - As you might have guessed, OOD says that doing this is (probably) wrong. I say "probably" because one can argue about the implied specification of the interface here ... but never mind.
    
    A square always has equal width and height, so from the square's interface it is entirely valid to assume that its area is "width * width".
    
    By inheriting from the square, the rectangle class must (according to the LSP) obey the rules of the square's interface. Any algorithm that works correctly with a square must also work correctly with a rectangle.
  - Take the following algorithm:
```
std::vector<Square*> shapes;
int area = 0;
for (auto s : shapes)
    area += s->width * s->width;
```
    It works correctly for squares (summing their areas), but not for rectangles.
    
    Consequently, the rectangle violates the LSP.
- If you use the interface-inheritance approach, then neither Square nor Rectangle inherits from the other. The interfaces of a square and a rectangle are actually different, and neither is a superset of the other.
- So OOD discourages the use of implementation inheritance. As stated above, if you want to reuse code, OOD says that composition is the right choice!
  - For what it's worth, the correct version of the above (bad) implementation-inheritance hierarchy looks like this in C++:
```
struct Shape
{
    virtual int area() const = 0;
};

struct Square : public virtual Shape
{
    virtual int area() const { return width * width; }
    int width;
};

struct Rectangle : private Square, public virtual Shape
{
    virtual int area() const { return width * height; }
    int height;
};
```
    - "public virtual" is the C++ counterpart of Java's "implements". It is used when implementing an interface.
    - “private” lets you extend a base class without inheriting its interface: in this case a rectangle is-not-a square, even though it inherits from it.
  - I do not recommend writing code like this, but if you do want to use implementation inheritance, this is how it should be done!
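
For comparison, here is roughly what the interface-inheritance approach could look like when the code reuse is done through composition instead. This is only an illustrative sketch: the virtual destructor, the constructors, the member names and the TotalArea helper are assumptions added to make it self-contained.

```cpp
#include <cassert>
#include <vector>

// The interface: the only thing the algorithm below depends on.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(int width, int height) : m_width(width), m_height(height) {}
    int area() const override { return m_width * m_height; }
private:
    int m_width;
    int m_height;
};

// Square reuses Rectangle's code by holding one as a private member.
// It is not a Rectangle, so no algorithm can mistake one for the other.
class Square : public Shape {
public:
    explicit Square(int side) : m_rect(side, side) {}
    int area() const override { return m_rect.area(); }
private:
    Rectangle m_rect;
};

// Works with any correct implementation of Shape, as the LSP demands.
int TotalArea(const std::vector<const Shape*>& shapes) {
    int total = 0;
    for (const Shape* s : shapes)
        total += s->area();
    return total;
}
```

Neither concrete class inherits from the other, the "width equals height" invariant of the square is enforced in exactly one place (its constructor), and the summing algorithm is now correct for both shapes.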

TL;DR: your OOP class taught you what inheritance is. Your missing OOD class should have taught you not to use it 99% of the time!

Entity / Component frameworks

Having dealt with the prerequisites, let's move on to what Aras began with - the so-called starting point of the “typical OOP”.

But first, one more addition - Aras calls this code “traditional OOP”, and I want to object to this. This code may be typical of OOP in the real world, but, like the examples above, it violates all sorts of basic principles of OO, so it should not be considered as traditional at all.

I will start with the first commit before he began to redo the structure in the direction of ECS: “Make it work on Windows again” 3529f232510c95f53112bbfff87df6bbc6aa1fae

```
// -------------------------------------------------------------------------------------------------
// super simple "component system"

class GameObject;
class Component;

typedef std::vector<Component*> ComponentVector;
typedef std::vector<GameObject*> GameObjectVector;


// Component base class. Knows about the parent game object, and has some virtual methods.
class Component
{
public:
    Component() : m_GameObject(nullptr) {}
    virtual ~Component() {}

    virtual void Start() {}
    virtual void Update(double time, float deltaTime) {}

    const GameObject& GetGameObject() const { return *m_GameObject; }
    GameObject& GetGameObject() { return *m_GameObject; }
    void SetGameObject(GameObject& go) { m_GameObject = &go; }
    bool HasGameObject() const { return m_GameObject != nullptr; }

private:
    GameObject* m_GameObject;
};


// Game object class. Has an array of components.
class GameObject
{
public:
    GameObject(const std::string&& name) : m_Name(name) { }
    ~GameObject()
    {
        // game object owns the components; destroy them when deleting the game object
        for (auto c : m_Components) delete c;
    }

    // get a component of type T, or null if it does not exist on this game object
    template<typename T>
    T* GetComponent()
    {
        for (auto i : m_Components)
        {
            T* c = dynamic_cast<T*>(i);
            if (c != nullptr)
                return c;
        }
        return nullptr;
    }

    // add a new component to this game object
    void AddComponent(Component* c)
    {
        assert(!c->HasGameObject());
        c->SetGameObject(*this);
        m_Components.emplace_back(c);
    }

    void Start() { for (auto c : m_Components) c->Start(); }
    void Update(double time, float deltaTime) { for (auto c : m_Components) c->Update(time, deltaTime); }

private:
    std::string m_Name;
    ComponentVector m_Components;
};

// The "scene": array of game objects.
static GameObjectVector s_Objects;


// Finds all components of given type in the whole scene
template<typename T>
static ComponentVector FindAllComponentsOfType()
{
    ComponentVector res;
    for (auto go : s_Objects)
    {
        T* c = go->GetComponent<T>();
        if (c != nullptr)
            res.emplace_back(c);
    }
    return res;
}

// Find one component of given type in the scene (returns first found one)
template<typename T>
static T* FindOfType()
{
    for (auto go : s_Objects)
    {
        T* c = go->GetComponent<T>();
        if (c != nullptr)
            return c;
    }
    return nullptr;
}
```

Yes, a hundred lines of code is a lot to take in at once, so let's go through it gradually ... One more piece of background is needed: in games of the 90s it was popular to use inheritance to solve every code-reuse problem. You would have an Entity, extended by Character, extended by Player and Monster, and so on ... This is implementation inheritance, as described earlier (a code smell). It seems like a reasonable place to start, but it eventually leads to a very inflexible codebase. That is why OOD has the “composition over inheritance” principle described above. So, in the 2000s the “composition over inheritance” principle became popular, and game developers started writing code like this instead.

What does this code do? Well, nothing good.

In short, this code re-implements an existing language feature, composition, as a runtime library rather than as a language feature. You can think of it as code that creates a new metalanguage on top of C++, plus a virtual machine (VM) to run that metalanguage. In Aras's demo game this code is not needed (we will soon remove it entirely!) and serves only to reduce the game's performance by about 10 times.

But what does it actually do? This is an "Entity/Component" framework (sometimes, confusingly, called an "Entity/Component system"), but it is completely different from an "Entity Component System" framework (which, for obvious reasons, is never called an "Entity Component System system"). It formalizes several "EC" rules:

The game is built from featureless “entities” (called GameObjects in this example), which are composed of “components”.
GameObjects implement the service locator pattern: their child components can be queried by type.
Components know which GameObject they belong to: they can locate sibling components by querying their parent GameObject.
Composition may only be one level deep (components cannot own child components, and GameObjects cannot own child GameObjects).
A GameObject may have only one component of each type (some frameworks enforce this, others do not).
Every component (probably) changes over time in some unspecified way, so the interface includes "virtual void Update".
GameObjects belong to a scene, which can query all GameObjects (and therefore all components).

This kind of framework was very popular in the 2000s, and despite its limitations it proved flexible enough to power countless games, then and now.

However, it is not required. Your programming language already supports composition as a language feature; you do not need a bloated framework to get at it ... Why, then, do these frameworks exist? Well, to be fair, they let you perform dynamic composition at run time. Instead of hard-coding GameObject types, you can load them from data files. This is very convenient because it lets game / level designers create their own kinds of objects ... However, most game projects have very few designers and a literal army of programmers, so I would argue that this is not a key feature. Worse, this is not even the only way to implement run-time composition! For example, Unity uses C# as its “scripting language”, and many other games use alternatives such as Lua, so a designer-friendly tool can generate C# / Lua code that defines new game objects without any need for a bloated framework like this! We will re-add this “feature” in a later post, in a way that does not cost us a tenfold performance hit ...

Let's evaluate this code against OOD:

GameObject::GetComponent uses dynamic_cast. Most people will tell you that dynamic_cast is a code smell, a strong hint that something is wrong somewhere. I would put it this way: it is evidence that you have violated the LSP. You have some algorithm that works with a base interface, yet it needs to know about different implementation details. That is the specific reason why it smells.
GameObject is not bad in principle if you view it as an implementation of the “service locator” pattern ... but, going beyond criticism from the OOD point of view, this pattern creates implicit links between parts of the project, and I believe (without a Wikipedia link to back me up with computer-science knowledge) that implicit communication channels are an anti-pattern and that explicit communication channels should be preferred. The same argument applies to the bloated “event frameworks” that are sometimes used in games ...
I would argue that Component violates the SRP, because its interface (virtual void Update(time)) is too broad. The use of "virtual void Update" is pervasive in game development, but I would also call it an anti-pattern. Good software should let you reason easily about the flow of control and the flow of data. Putting every piece of gameplay code behind a “virtual void Update” call completely obfuscates both. IMHO, invisible side effects, also known as action at a distance, are among the most common sources of bugs, and “virtual void Update” ensures that almost everything is an invisible side effect.
Although the goal of the Component class is to enable composition, it does so through inheritance, which is a violation of the CRP.
The one good thing about this example is that the game code bends over backwards to comply with the SRP and ISP: it is split into many simple components with very small responsibilities, which is great for code reuse.

However, it is not so good at complying with the DIP: many components have direct knowledge of each other.
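
To illustrate the point about explicit versus implicit communication channels, here is a tiny sketch of what "explicit" can mean in practice. The Position and Velocity structs and the Integrate function are hypothetical names chosen for this example; they are not taken from Aras's code.

```cpp
#include <cassert>

// Plain data structs: in the sense used earlier, they are the interface.
struct Position { float x, y; };
struct Velocity { float vx, vy; };

// Explicit data flow: everything this step reads (vel, dt) and everything
// it writes (pos) is visible in the signature. Nothing is looked up through
// a service locator, and nothing is hidden behind a generic Update() call.
void Integrate(Position& pos, const Velocity& vel, float dt) {
    pos.x += vel.vx * dt;
    pos.y += vel.vy * dt;
}
```

Compare this with a component that calls something like GetGameObject().GetComponent<T>() inside its Update method: there, the reader has to open the function body to discover which other objects are touched.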

So, all of the framework code shown above can actually be deleted. The whole structure. Delete GameObject (called Entity in other frameworks), delete Component, delete FindOfType. It is all part of a useless VM that violates OOD principles and terribly slows down our game.

Composition without frameworks (i.e. use of features of the programming language itself)

If we delete the composition framework and no longer have a Component base class, how can our GameObjects use composition and be made of components? As the heading says, instead of writing that bloated VM and then building our GameObjects on top of it in a strange metalanguage, let's just write them in C++, because we are game programmers and this is literally our job.

Here is the commit in which the Entity / Component framework is deleted: https://github.com/hodgman/dod-playground/commit/f42290d0217d700dea2ed002f2f3b1dc45e8c27c

Here is the original version of the source code: https://github.com/hodgman/dod-playground/blob/3529f232510c95f53112bbfff87df6bbc6aa1fae/source/game.cpp

Here is the modified version of the source code: https://github.com/hodgman/dod-playground/blob/f42290d0217d700dea2ed002f2f3b1dc45e8c27c/source/game.cpp

Briefly about the changes:

Removed ": public Component" from each component type.
Added a constructor to each component type.
- OOD is first and foremost about encapsulating the state of a class, but because these classes are so small and simple there is really nothing to hide: the interface is a description of the data. However, one of the main reasons encapsulation is a core pillar is that it lets us guarantee that class invariants always hold ... or, if an invariant is broken, you only need to inspect the encapsulated implementation code to find the bug. In this sample code it is worth adding constructors to enforce a simple invariant: all values must be initialized.
Renamed the overly generic “Update” methods so that their names reflect what they actually do: UpdatePosition for MoveComponent and ResolveCollisions for AvoidComponent.
Removed three hard-coded blocks of code that resembled a template / prefab (code that creates a GameObject containing specific Component types) and replaced them with three C++ classes.
Removed the "virtual void Update" anti-pattern.
Instead of components finding each other via the “service locator” pattern, the game objects explicitly link them together during construction.
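
As a small sketch of two of these changes (a constructor that enforces the "everything is initialized" invariant, and an explicit link established during construction), consider the following. Bounds and Wanderer are made-up names for this illustration and are deliberately simpler than the real components in the repository.

```cpp
#include <cassert>

// Hypothetical one-dimensional world bounds.
struct Bounds { float xMin, xMax; };

class Wanderer {
public:
    // The constructor establishes the invariant: every member has a value,
    // and the dependency on Bounds is handed over explicitly instead of
    // being looked up at run time.
    Wanderer(const Bounds& bounds, float x, float speed)
        : m_bounds(&bounds), m_x(x), m_speed(speed) {}

    // Named after what it does, rather than a generic "Update".
    void UpdatePosition(float dt) {
        m_x += m_speed * dt;
        // Bounce off the edges of the world.
        if (m_x < m_bounds->xMin) { m_x = m_bounds->xMin; m_speed = -m_speed; }
        if (m_x > m_bounds->xMax) { m_x = m_bounds->xMax; m_speed = -m_speed; }
    }

    float X() const { return m_x; }

private:
    const Bounds* m_bounds;   // not owned; must outlive this object
    float m_x;
    float m_speed;
};
```

There is no way to create a Wanderer with an uninitialized position or a missing Bounds, so an entire class of "forgot to call SetX" bugs cannot occur.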

Objects

Therefore, instead of this “virtual machine” code:

```
// create regular objects that move
for (auto i = 0; i < kObjectCount; ++i)
{
    GameObject* go = new GameObject("object");

    // position it within world bounds
    PositionComponent* pos = new PositionComponent();
    pos->x = RandomFloat(bounds->xMin, bounds->xMax);
    pos->y = RandomFloat(bounds->yMin, bounds->yMax);
    go->AddComponent(pos);

    // setup a sprite for it (random sprite index from first 5), and initial white color
    SpriteComponent* sprite = new SpriteComponent();
    sprite->colorR = 1.0f;
    sprite->colorG = 1.0f;
    sprite->colorB = 1.0f;
    sprite->spriteIndex = rand() % 5;
    sprite->scale = 1.0f;
    go->AddComponent(sprite);

    // make it move
    MoveComponent* move = new MoveComponent(0.5f, 0.7f);
    go->AddComponent(move);

    // make it avoid the bubble things
    AvoidComponent* avoid = new AvoidComponent();
    go->AddComponent(avoid);

    s_Objects.emplace_back(go);
}
```

We now have the usual C ++ code:

```
struct RegularObject
{
    PositionComponent pos;
    SpriteComponent sprite;
    MoveComponent move;
    AvoidComponent avoid;

    RegularObject(const WorldBoundsComponent& bounds)
        : move(0.5f, 0.7f)
        // position it within world bounds
        , pos(RandomFloat(bounds.xMin, bounds.xMax),
              RandomFloat(bounds.yMin, bounds.yMax))
        // setup a sprite for it (random sprite index from first 5), and initial white color
        , sprite(1.0f, 1.0f, 1.0f, rand() % 5, 1.0f)
    {
    }
};

...

// create regular objects that move
regularObject.reserve(kObjectCount);
for (auto i = 0; i < kObjectCount; ++i)
    regularObject.emplace_back(bounds);
```

Algorithms

Another major change was made to the algorithms. Remember, earlier I said that interfaces and algorithms are symbiotic and should shape each other's design? Well, the "virtual void Update" anti-pattern is the enemy here too. The original code has a main loop algorithm that consists of just this:

```
// go through all objects
for (auto go : s_Objects)
{
    // Update all their components
    go->Update(time, deltaTime);
```

You can argue that this is beautiful and simple, but IMHO it is very, very bad. It completely obfuscates both the flow of control and the flow of data within the game. If we want to be able to understand our software, maintain it, add new things to it, optimize it, and run it efficiently on several processor cores, then we need to understand both the flow of control and the flow of data. Therefore, “virtual void Update” has to go.

Instead, we now have a more explicit main loop that makes the control flow much easier to understand (the data flow in it is still obfuscated, but we will fix that in later commits).

```
// Update all positions
for (auto& go : s_game->regularObject)
{
    UpdatePosition(deltaTime, go, s_game->bounds.wb);
}
for (auto& go : s_game->avoidThis)
{
    UpdatePosition(deltaTime, go, s_game->bounds.wb);
}

// Resolve all collisions
for (auto& go : s_game->regularObject)
{
    ResolveCollisions(deltaTime, go, s_game->avoidThis);
}
```

The disadvantage of this style is that for each new type of object added to the game, we will have to add a few lines to the main loop. I will come back to this in a later post in this series.

Performance

There are still plenty of serious OOD violations here, some poor choices were made in the structure, and many optimization opportunities remain, but I will get to those in the next post of the series. Even at this stage, however, the version with the “fixed OOD” either nearly matches or beats the final “ECS” code from the end of the presentation ... And all we did was take the bad pseudo-OOP code and make it follow the principles of OOP (and delete a hundred lines of code)!

Next steps

There is much more ground that I want to cover here, including: solving the remaining OOD problems; immutable objects (functional-style programming) and the benefits they can bring to reasoning about data flows; message passing; applying some DOD reasoning to our OOD code; applying some relational wisdom to our OOD code; deleting the “entity” classes that we ended up with and using only pure components; different styles of linking components together (pointers versus handles); real-world component containers; catching up with the ECS version through more optimization; and further optimizations not mentioned in Aras's talk (such as multithreading / SIMD). The order will not necessarily be this one, and I may not get to everything listed ...

Addition

Links to this article have spread beyond game-development circles, so I will add: "ECS" (the Wikipedia article on it is bad, by the way: it conflates EC and ECS, which are not the same thing ...) is a faux-pattern that circulates within game-development communities. In essence, it is a version of the relational model in which “entities” are just IDs that denote a formless object, “components” are rows in specific tables that reference an ID, and “systems” are procedural code that can modify the components. This “pattern” has always been positioned as a solution to the overuse of inheritance, without mentioning that overusing inheritance actually violates OOP guidelines. Hence my indignation. This is not “the one true way” to write software. The post is meant to get people to actually study existing design principles.
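
To make the relational reading of ECS tangible, here is a deliberately naive sketch. Everything in it (EntityId, the two row types, World, MoveSystem) is an assumption made for illustration; real ECS libraries use far more elaborate storage, and this is not how Aras's final code is laid out.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// An "entity" is nothing more than an ID (a primary key).
using EntityId = std::uint32_t;

// "Components" are rows in per-type tables, keyed by the entity ID.
struct PositionRow { float x, y; };
struct VelocityRow { float vx, vy; };

struct World {
    std::unordered_map<EntityId, PositionRow> positions;
    std::unordered_map<EntityId, VelocityRow> velocities;
};

// A "system" is procedural code. This one is effectively an inner join of
// the two tables on the entity ID, followed by an update of the position.
void MoveSystem(World& world, float dt) {
    for (const auto& entry : world.velocities) {
        auto it = world.positions.find(entry.first);
        if (it == world.positions.end())
            continue;                     // no matching position row: skip
        it->second.x += entry.second.vx * dt;
        it->second.y += entry.second.vy * dt;
    }
}
```

Seen this way, questions such as "which table should this field live in?" are exactly the questions that database normalization already answers.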

Source: https://habr.com/ru/post/441174/
