Load testing in Skyforge, or bots as the server's orderlies. Part 1
Hi, Habr! My name is Alexander Akbashev. I am a QA engineer on the Allods Team, working on the Skyforge project. My job is to organize testing of our game's server, and that is what this article is about. In May I gave a talk at GDC, which grew into a two-part article. This is the first part.
Aerial view of Skyforge
The Skyforge technology stack is as follows: the game client is written in C++, the game server in Java, and many useful utilities and scripts in Python. We also have tools for designers written in C#, but this article will not say another word about C#. :)
What are bots and why are they needed?
During the development of Allods Online, the server team faced a serious question: how do we run integration and load tests against the server? We decided to write bots for this: standalone client applications that emulate the behavior of a real player. A lot of water has flowed under the bridge since then, and the bots used in Skyforge bear little resemblance to those first small bots. The purpose and infrastructure of the tests built on them have evolved too: bots now give us a huge number of server metrics to analyze. This is possible because bot tests are part of the project's continuous integration process.
Such tests are designed with a focus on maximum stability. That is, if something breaks during a regular run but nothing critical has happened, the test continues. In this our bot tests differ fundamentally from unit tests, which should fail at the first sneeze. At the same time, fixing broken bot tests is no lower a priority.
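The "keep going unless it is critical" policy can be illustrated with a minimal Python sketch. This is not our actual test harness: `run_tolerant`, the step list, and the `is_critical` predicate are hypothetical names invented for the example.

```python
def run_tolerant(steps, is_critical):
    """Run named steps in order, recording failures instead of stopping.

    steps       -- list of (name, callable) pairs (hypothetical test steps)
    is_critical -- predicate deciding whether an exception must abort the run
    Returns the list of (name, exception) pairs that were recorded.
    """
    failures = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Record the problem so it can be reported and fixed later.
            failures.append((name, exc))
            if is_critical(exc):
                # Only a critical failure stops the whole run.
                break
    return failures
```

A unit test would stop at the first recorded failure; here the run only stops when the predicate says the failure is critical, and everything else ends up in the report.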
What are the bots like?
Our bots are built using the so-called “white box” approach: we take the client, written in C++, strip out the interface, and add a “brain”, also written in C++, in its place. In addition, because the client and the bots share a code base, bots let us test fairly low-level things in the client itself.
The “brains” of the bots are implemented as finite-state machines: graphs of transitions from one state to another under certain conditions. Bots also have always-active components that run regardless of the current state of the bot and its “brain”. These provide, for example, bot immortality and chat activity. You can talk to bots in private or general game chat: they are always ready to support the QA team by quoting stories from bashorg.
An example of the simplest transition graph: a bot appears on the map, runs, sees a monster, and kills it. If there is loot, it picks it up; if not, it runs on.
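The simplest transition graph described above can be sketched as a small finite-state machine. The real “brains” are written in C++; the Python below is only an illustration, and the state names and the `world` dictionary keys (`monster_visible`, `monster_alive`, `loot_dropped`) are assumptions made up for the example.

```python
from enum import Enum, auto


class State(Enum):
    SPAWN = auto()   # bot has just appeared on the map
    RUN = auto()     # bot is running around looking for monsters
    FIGHT = auto()   # bot is attacking a monster
    LOOT = auto()    # bot is picking up dropped loot


def next_state(state, world):
    """Return the next state for the current state and observed world facts."""
    if state is State.SPAWN:
        return State.RUN
    if state is State.RUN:
        return State.FIGHT if world["monster_visible"] else State.RUN
    if state is State.FIGHT:
        if world["monster_alive"]:
            return State.FIGHT
        # Monster is dead: pick up loot if there is any, otherwise run on.
        return State.LOOT if world["loot_dropped"] else State.RUN
    if state is State.LOOT:
        return State.RUN
    raise ValueError(f"unknown state: {state}")


def run_brain(observations, start=State.SPAWN):
    """Feed a sequence of observations to the machine and return the state trace."""
    state = start
    trace = [state]
    for world in observations:
        state = next_state(state, world)
        trace.append(state)
    return trace
```

Keeping the transition function free of side effects makes it easy to replay a recorded sequence of observations and check which states the bot went through.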
For testing with more flexible settings we use so-called bot scripts. They let us define the desired behavior of the bots: which “brain” to use, where and under what conditions the bot appears, and the character's class, spells, abilities, and activity. The result is a kind of portrait of the user this bot represents. Such a bot can then be handed to testers to supplement load testing or to run other tests.
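To give a feel for what such a bot script describes, here is a hedged sketch of a "user portrait" as a plain data structure. The format of our real bot scripts is not shown in this article; the `BotScenario` class, its fields, and the `expand` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BotScenario:
    """Hypothetical description of one kind of bot (a 'user portrait')."""
    brain: str                  # which "brain" (state machine) to use
    spawn_map: str              # where the bot appears
    character_class: str        # class of the character
    abilities: tuple = ()       # spells and abilities the bot may use
    chat_enabled: bool = True   # whether the bot is active in chat
    count: int = 1              # how many such bots to launch


def expand(scenarios):
    """Turn scenario descriptions into one launch entry per individual bot."""
    launches = []
    for scenario in scenarios:
        for index in range(scenario.count):
            launches.append({
                "name": f"{scenario.brain}-{scenario.character_class}-{index}",
                "map": scenario.spawn_map,
                "abilities": list(scenario.abilities),
                "chat": scenario.chat_enabled,
            })
    return launches
```

For example, `expand([BotScenario("hunter", "test_map", "archer", count=3)])` yields three launch entries that differ only in name, which is convenient both for load runs and for handing a single bot to a tester.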
At the very beginning, the main purpose of bots was to generate a realistic load. However, it turned out that bots are useful in other areas as well.
For bots to be so useful, they must meet two very important requirements.
The first is realistic behavior. Bots must behave the way players do. If the bots stop matching what the game design expects, they need to be fixed. For example, a bot running through a given area must kill as many monsters as a real player would under the same conditions.
The second requirement is that bots stay up to date. Suppose you made bots that shoot a bow and kill monsters one at a time. But the evil game designers decided that instead of a bow there should be a rocket launcher that blows up several monsters at once. The bots then have to be reworked to match, otherwise the load they generate and the content they exercise will no longer be relevant.
Tasks the tests solve
When we started working with tests, the focus was on load. We tried to get the most out of the available infrastructure and to use all the capabilities of the bots. At the time we had no idea that bots could be so useful. Later we started looking for other problems that could be solved with the bot test infrastructure we had built. For example, bots help check group activities: a tester can add a bot to a group and go on a group adventure. This makes such checks much easier.
Server check
Thanks to continuous integration, we build a fresh version of the server several times a day. However, it is not known in advance whether a build is good and whether it can be used, and testers are not always able to update the client and check the new build right away. For this we now use a smoke test, which checks the latest build with a bot. The bot is fairly simple: it logs in to the server, finds a monster, and kills it. If everything went smoothly, the server is working and you can play on it.
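The smoke test boils down to a short ordered checklist. The sketch below shows that idea only; the `smoke_test` function, the step names, and the `FakeBot` stand-in are hypothetical and do not reflect the real bot API.

```python
def smoke_test(bot):
    """Run login -> find monster -> kill monster.

    Returns (True, None) if every step succeeded, otherwise
    (False, name_of_first_failed_step).
    """
    steps = [
        ("login", bot.login),
        ("find_monster", bot.find_monster),
        ("kill_monster", bot.kill_monster),
    ]
    for name, step in steps:
        if not step():
            # Later steps make no sense once an earlier one has failed.
            return False, name
    return True, None


class FakeBot:
    """Stand-in for a real bot, used only to exercise the sketch."""

    def __init__(self, fail_on=None):
        self.fail_on = fail_on  # name of the step that should fail, if any

    def login(self):
        return self.fail_on != "login"

    def find_monster(self):
        return self.fail_on != "find_monster"

    def kill_monster(self):
        return self.fail_on != "kill_monster"
```

Reporting the name of the first failed step tells whoever looks at the build which part of the chain broke, without opening the logs.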
Rehearsal of the main test
We have a night test that runs for 8 hours and collects a lot of valuable information. Unfortunately, each iteration of it is very expensive. So that we do not lose a night run to an error that can be fixed in 5 minutes, we run a separate daytime test. It uses the same version of the code and the same data as the night test, but lasts only an hour. In effect, we hold rehearsals of the night run during the day, and if necessary we have time to fix errors before the end of the working day. If the hourly test passes, the night test will pass with about 99% certainty.
Hourly tests are also used for experiments and for checking hypotheses. For example, if you want to play with JVM flags, use the hourly test: you can run it with two sets of flags and compare the results. You can compare two different algorithms or two different sets of maps in the same way.
Content Verification
Skyforge has a lot of content, with a great many objects in the game world. So when the night test runs, it is not always clear which design elements produce additional load. Nor is it always clear why everything is fine on one game server with one set of maps, but bad on another server with a different set. Is the culprit the map, the number of players, or the phase of the moon (or of the GC)? This is why we added per-map tests. They let us compare maps in a sterile environment. For example, if we take three maps and run them on the same versions of code and data, we get comparable metrics and heat maps that show where hot spots may arise. These tests are probably the most useful thing we have for profiling game content.
An example of a heat map (the scale shows a performance score measured in bots)
In addition, we use heat maps to analyze many other metrics that have coordinates: character deaths, the number of monsters nearby, the number of monsters in combat, and so on.
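At its core, a heat map (temperature map) of this kind is a count of events per grid cell. The sketch below shows that aggregation under simple assumptions: events are plain `(x, y)` world coordinates and cells are squares of a fixed size. The function names are hypothetical and this is not our production tooling.

```python
from collections import Counter


def heat_map(events, cell_size):
    """Count events per square grid cell.

    events    -- iterable of (x, y) world coordinates, e.g. character deaths
    cell_size -- side length of one grid cell in world units
    Returns a Counter mapping (cell_x, cell_y) to the number of events.
    """
    cells = Counter()
    for x, y in events:
        # Floor division keeps negative coordinates in the correct cell.
        cell = (int(x // cell_size), int(y // cell_size))
        cells[cell] += 1
    return cells


def hottest(cells, n=3):
    """Return the n cells with the most events, busiest first."""
    return cells.most_common(n)
```

The same aggregation works for any metric that comes with coordinates; only the source of `events` changes.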
The most important test
Our favorite test is the nightly bot test, which checks all of the final content. As I mentioned, it lasts more than 8 hours. This is the main load test, and I will discuss it in detail in the next article.
Conclusion
Obviously, bot tests sit at the junction of all parts of the project: the server and the client are started, and the resource system, the database, the master server, and the build agent are all involved... If even one of these elements breaks, the whole test can be considered failed. It takes a lot of effort to keep the whole system running like clockwork.
Bots are quite difficult to maintain, but extremely useful. In the second part of the article I will describe the infrastructure we have built around the bot tests, which server parameters we monitor, and what they tell us.
I hope this article about bots was interesting to the community. Thank you for your attention!
Slides and videos
Below are the videos and slides from the KRI-2013 conference.