The story of the victory at the annual competition Russian AI Cup 2017

Hello! I want to tell you about the story of my victory in the annual competition for writing game bots Russian AI Cup , in 2017. In the final, the bot won 98% of the games, which turned out to be the highest result in the finals among all the years of the championship. He also took the 1st place in the sandbox at the end of her work, at the peak passing for 4000 rating points.

This article may be interesting to participants, fans and just interested in the subject of AI and writing game bots. I hope you can learn something new for yourself. In turn, I would also like to read articles from the participants, to compare the approaches and the way of thinking.

The task of this year has already been written in detail here . In short: this is a war with a large number of units of different types: tanks, armored personnel carriers, repair vehicles, fighters, helicopters. Initially, 500 units of equipment are given, but it is possible to seize factories and produce new equipment without restrictions. I was immediately interested in this topic, because I used to like to play strategy games like Dune (still on the Sega prefix), Red Alert, Star Craft 1 & 2.
')
I have been speaking not for the first year, and not only in the Russian AI Cup, under the nickname GreenTea. Maybe someone remembers me from ants.aichallenge.org report .

About the approach this year. The approach was serious, you can even say professional :)

Do not miss the start, it is better to start immediately with the beta.
To use developments from past championships: these are all sorts of utility classes like points, vectors, segments, all sorts of handwritten mathematical functions.
Set up a local environment for testing the bot.
Take a vacation for a couple of weeks of the championship to get the most out of the process.
Work hard, donate almost all your free time. This is hardcore, yes, but what to do.

This year there were the most complaints from people, saying that “it’s very difficult”. Management is hard. It is necessary to write a lot of auxiliary code to teach the bot to do even the simplest things - this is so. For example, there was no such thing as an automatic search for a path for units. So, to indicate the end point for the movement, and no longer think, and the units themselves will go round all the obstacles. This was not. Units move in a straight line stumble on other units and stand ... Fighting stranded - it was a constant pain.

Further, the possible bot commands imitate actions that a person performs when playing strategy. Namely: selection of units in a rectangular frame, send selected units through the attack, assign / select a group, and so on. Plus, the number of actions is strictly limited. Performing actions in this form is much more difficult than directly assigning commands to each individual unit. This led to the fact that many of the participants did not come out in anything more difficult than simply collecting all the units in one pile, mix well and send to the nearest enemy. Later it evolved into the creation of a very neatly built phalanx (popularly called a sandwich), which slowly crawled on the enemy and it was very difficult to do something with it. But the essence remains the same. In the RTS, this is called a death ball, i.e. ball units that kills everything in its path. The organizers even introduced the possibility of a nuclear strike, so that such sandwiches did not live so easily - after all, in a heap, units are more vulnerable. But, as it turned out, the “right” sandwich is not particularly afraid of nuclear strikes, because repair machines have time to “cure” it.

But I would like to speak in defense of the organizers. I liked the subject of this year more than anything compared to previous years. Simply, it gives a very large scope for creativity. Judge for yourself: 500+ units of five different unique types, the ability to divide and combine different formations. Also a special thank you for simplified physics, the absence of such concepts as acceleration, rotation of units - all this is an unnecessary husk which adds complexity and routine to level ground, making it difficult to concentrate on the strategies proper.

However, the wealth of opportunities provided by the game, this is also an additional difficulty. I remember how the first few days after reading the rules, there was complete confusion in my head. How to get to the task? Only after some time, something began to line up in the subconscious.

Initially, it was clear that a sandwich is a bad job, because with the introduction of buildings that need to be captured, he simply does not have time to pack up and grab anything. Therefore, I started by dividing into groups. I decided that in each I will have 25 units. Total 20 pieces. Why such a size? This gives good mobility, it is difficult to cause great damage with a nuclear bomb. Also, 25 units in size fit well into the game cage (the field was 32 x 32 cells). A little later, the fighters were combined into one pack of 100 units, since in this form, they have the greatest impact power, and at the same time, almost no fear of nuclear strikes.

Next, we teach the group to move. It is important that when driving they do not stumble upon each other. I decided to approximate the group by a circle that contains all the units. Namely, the terms, because the collisions for circles are the easiest to count. Collisions can be considered in 2 ways:

Geometrically, comparing how much the segments of the path converge when moving groups. With this method I started.
Simulation: with a certain step, we imitate the movement of all groups, and check whether they have encountered. This method is better because takes into account the dynamics of movement. True, so that the simulation did not slow down, I had to use a step of 30 ticks.

Now the question is where to go. To do this, we use the potential field method (PP), the essence of which is that good positions should attract a group, and bad ones should push away. The size of the analyzed cell according to the PP method I took 32 by 32, because a group of 25 units is well placed in it and the weather and landscape map is divided into cells of the same size. Further, the question of how to assess the position, whether to go into it or not?
Below are all the parameters that I considered when evaluating a cell at the time of the final:

Position on the field: good to be closer to the center, bad at the edges and very bad in the corners.
Type of terrain and weather: these parameters affect the characteristics of units, so avoid bad weather and difficult terrain.
Distance to other groups: do not get too close to avoid collisions. When a collision is detected, we write the maximum penalty.
Fight with the enemy: can I destroy the enemy in this cell, or will he destroy me?
Factory capture: how close is the cage to the building that can be captured.
Intelligence: how many cells I can see while in this, given how long I have not seen them.
[for air units] Repair: it is good to fly up to repair machines and repair damage.
[for repair machines] Fix: drive up to broken flying units.
[for nuclear gunner fighter] Fly up to the biggest pile of enemies and direct a nuclear strike at it.

On the screen above, the visualization of the software for armored vehicles (orange), which was already written after the finals. This year I somehow did without visualization. I tried to figure out the constants for the formulas in my mind, I checked it in the debager, and somehow it all worked.

In general, the process of improving the bot was:

Carefully look at the games of the bot on the site or locally, and find places where the bot acts irrationally. If there are many such places, we write them down so as not to forget.
When choosing the implementation of improvement, we try to consider several ways, and choose the one that is, let's say, the “least fit”.
We are working on improving and checking with the previous version of the bot, as long as the new version does not consistently win more often. To check the stability, I took 20 different cards, drove them to the localrunner and counted the number of victories of the new version. I looked through all the fights to see for myself that the improvement works as it should.
We make a commit and save the new version as the last one with which we will test. Fill what happened on the site.
If a bug is found locally during the battle viewing, then due to the fixed seed and the lack of a random house in the engine, it can be immediately reproduced, upgraded and repaired. By the way, I have never used the repeater utility to search for bugs.

So that the game in the locale runner does not last forever it is important that the bot think quickly. Therefore, all sorts of caches and profiler should be your friends.

Well, we have potential fields. Next question is, in what order to move? Actions are limited. Which group should be like the first? The calculation is as follows:

We scan all the cells in a certain radius from the group and find the cell with the highest points in the bestScore .
We calculate the number of points in the currentScore cell where this group is already heading (if it moves), or the current cell (if it stands still).
scoreDiff = bestScore - currentScore

The group with the highest scoreDiff has the highest priority.
Thus, if we have previously assigned the group to go to the cell with a maximum of points and it goes there, then its scoreDiff will be equal to 0, and there is no need to order to go there again, which is logical. Also points are proportional to the number of units, which means that the larger the group, the higher its priority will be.

A few words about the combat system. It rather primitively simulates the outcome of the battle, with a given number of units of equipment on both sides. And if we assume there are 2 tanks with a strength percentage of 0.5 both, then the combat simulator simply summarizes all the HP and considers that it is 1 whole tank. The fight lasts rounds. Each round attacks each side, and the damage is deducted. And if there are several types of units in an opponent group, then it shoots at each type in turn. The battle ends when the damage inflicted on both sides is 0, which happens, for example, when one of the groups kills the other, or tanks remain on one side and fighters on the other, and they cannot attack each other. As you can see, the system is not quite accurate, but it is very fast, and in most cases quite adequately predicts the outcome of the battle. For the combat simulator wrote unit tests. By the way, a very rough performance test shows that the battle, where on one side is 25 tanks and 25 repair machines, and on the other 25 tanks, is calculated 100,000 times for 519 ms, or 19 errors for 1 ms, which is pretty good. Attempts to at least slightly increase the accuracy of this system led to an exorbitant increase in the calculation time, and I scored on this.

When evaluating a battle in a cage, a battle radius of 2 groups is taken. On the one hand, everything is mine that is in the radius + units of the group, on the other hand - everything that the enemy has in the radius of the battle. And it is fed to all the simulator of the battle. At the output we get a structure with information: damage minus losses, the number of units on both sides after the battle, the final round.

In addition to the outcome of the battle, the estimated cell also took into account the route of transition to this cell in a straight line. If this route is dangerously close to the cells with enemies, then the combat simulator is launched for them, and the result is summed up all the way.

And so that the bot acted more carefully in the battle, and attacked only for sure, the unfortunate outcome of the battle - negative points multiplied by 10. And only after the 10,000th tick, if the bot sees what is winning, then a special aggressive mode is activated, in which the opposite is encouraged killing the enemy, and protection is reduced to a minimum.

Initially, the bot functioned only on the PP. The first bell that they did not quite cope in some situations was that helicopters did not run well from fighter jets. Of course, this is a feature of my implementation, but I think because of the fast movement of fighters, helicopters also need to act very quickly and accurately, hiding in advance for bmp. I had to add a couple of crutches to the PP to fix it.

With the advent of buildings, especially factories, which produce units at a tremendous speed, it became clear that the PPs frankly fail to cope with fast and optimal capture. And the factories need to seize without delay, tk. The production rate of units is very high, and capturing them early can literally crush a superior number.

The PP is still bad because it is necessary to select all sorts of formulas and magic constants so that everything works. All fields are summarized as a result, which means they influence each other. By adding a powerful field to the capture of factories, this could lead to the fact that the units will strive to capture even in the face of danger, which is completely inappropriate.

It is very easy to break everything by inaccurately correcting some coefficient. It seemed much easier to write a separate strategy, which would be responsible for the seizure of factories, and would have a globally higher priority than the PP.

Then there were a few days of thinking - inventing some new architecture. I wanted her to allow beautifully, without crutches, without cluttering up the code to embed everything that does not fit into the PP. I remember how I walked around the apartment for about an hour in circles, and went through different ways of organizing code in my head.

So, it was:
Bundle of if
PP
Bundle of if

It became:
Strategy 1
Strategy 2
...
Strategy n

Each strategy is inherited from the class BattleStrategy, has the type enum BattleStrategyType, must implement the method abstract boolean apply (Move move);
All strategies exist in the 1st instance, are contained in the list and are invoked sequentially, from higher priority to lower priority. If the apply method returns true, then the strategy has made a move, and then we don’t consider the strategy further. If false, then the current strategy does not consider it necessary to apply at this tick, and you can try the next one. Next, I cite the entire list of strategies at the time of the final, from more to lower priority.

AvoidNuclearStrike - avoiding nuclear strikes.
CreateGroups - the creation of groups of free units in factories, when certain conditions are triggered.
CreateScouts - the creation of reconnaissance fighter aircraft, which can serve as a nuclear bomb gunner.
DoNuclearStrike - nuclear strike.
FactoryProduction - ordering the production of units of the desired type at the factory.
JoinGroups - combining 2 groups into one if the total number of groups has increased too much.
Pack - consolidation of units in a group, so that its radius decreases, this makes the battle more effective.
TurnToEnemy - turn the group in broad part towards the enemy.
JoinCollidedGroups - join groups that accidentally bump into each other and get confused.
CoptersStrategy - the strategy of escaping helicopters for bmp in case of danger from enemy fighters.
FightersStrategy - strategy chase fighters for enemy helicopters, as well as the protection of tanks and repair machines from helicopters.
FactoryDefence - protects factories from being captured when the enemy approaches them too closely.
FacilityCapture - capture of factories, with the possibility of reassigning capture targets if the situation on the battlefield changes
KillProduction - the destruction of defenseless enemy units, which are produced in the factory.
PotentialField - That is the oldest PCB, which has practically not changed since the 1st round. As you can see, is the least priority strategy.

In the meantime, all these strategies were written, before the second round, I calmly watched as my bot spreads opponents to small groups without any problems in the “with buildings” mode. And it seemed that no one even close could oppose anything and in the same way it would reach the final. Until ud1 appeared. He did not divide the detachments so small, but used only 5 pieces, 100 for each type of unit. I remember how in a series of fights with me - I won everything. I was shocked. As so, began to understand. It turned out that the organizers cut down the capture speed of the factories, and my small groups seized them too slowly. I redid the capture strategy of factories so that at the same time it could capture 2 groups at once, and the situation against ud1 seemed to be normalized for a while. But after a few days, twenty-five again! After a careful study of the repetitions, I came to the conclusion that my groups were too afraid of his large groups, constantly retreating, factories already captured, and the factories do not have enough to intercept him, because there already tanks are built ... Just the same evening, I decided not to share so small. And he made 3 big packs of 100 units each from tanks, bmp and repair machines. Initially, I was stopped from such a move by the fear that large groups would be highly vulnerable to a nuclear strike. But, as practice has shown, the seizure and retention of factories at the initial stage affects the outcome of the battle more. Plus, in the factories, too, began to build tanks, as a more versatile unit. The helicopters, though more powerful against the ground, are slower built and too vulnerable to fighters. Tests with the previous version of your bot showed the complete domination of the new version.

In parallel with writing strategies, I gradually improved the architecture. One of the innovations is the preservation of information about the current implementation of the strategy within the group object. For example, when seizing a factory, you can save it to which factory. But the most useful thing I keep there is a strategy trigger.

public interface GroupStrategyTrigger { boolean isShouldApply(); boolean apply(Move move); }

The isShouldApply () method is called by the strategy for all groups that are currently implementing this strategy and who have triggers. If isShouldApply () returns true, then apply is called. If apply returns true, then the strategy has made its turn from the trigger, and the turn ends. If after execution of apply the trigger has not changed, then it is deleted. A trigger can return false from apply () and serve purely as a switch for the group's strategy, when some condition occurs so that it can again be considered lower-priority strategies. This all allows you to perform complex chains of actions, stringing triggers like beads. For example, to make a movement by a group is not so easy, first you need to make sure that it is selected.

 selectAndCallTrigger(movedGroup, move, new SelectGroupActionStrategyTrigger() { @Override public boolean apply(Move move) { performMoveAction(movedGroup, target, target, move); } });

The selectAndCallTrigger () method, which is commonly used, checks whether a group is selected. If yes, the trigger is executed immediately, and if not, the CLEAR_AND_SELECT action is performed on this tick, and the transferred trigger is on the next when the time comes to make a move, and management will reach this strategy. If, after selecting a group, a higher priority strategy intercepts the control and also makes a selection of another group, then the SelectGroupActionStrategyTrigger trigger of the current strategy is deleted, but this does not happen very often. This is just the simplest example. In the union of groups, for example, 3 nested triggers are used. The advantage of this approach is that all the logic related to the strategy is located in one separate file, there are no heaps of if-s and it looks like a sequence of actions separated by conditions in isShouldApply (). And as a result, it allows you to quickly sausage new strategies without breaking or affecting the rest.

The disadvantage of this architecture is that all strategies have a fixed fixed priority, although the priority in some situations is not so obvious. For example, what is more important, to defend your factory, or to attack the enemy? I have protection in priority, but it may turn out that the group to be used for protection will capture more quickly. For this situation, I had to write a crutch that allows me to look from the protection strategy to the capture strategy, and not to touch the capture if it is more promising. But in general, this situation is rather an exception.

In order not to fall on a dangerous enemy, performing maneuvers in strategies, and not crashing into one's own, part of the calculations were used, which were written for the PP. For example, the following function was commonly used:

 public boolean isCanSafelyMoveToPoint(Group g, Point target) { prepareCachesForGroup(g); return !isHasCollisions(g, target) && !isGroupInDanger(g) && calcBattleScoreOnPath(g, target) >= 0; }

We go for example to seize the factory, and every move we check all the groups that move using the trigger - “is the path to the goal free and safe?” If not, then finish the strategy for the group.

After the introduction of the fog of war, almost did not alter the bot. He added only memorization of the positions of enemy units, where he had seen them before. In scout fighters, he added an additional priority to “enlighten” cells with hidden enemy units. If intelligence showed that there are no more hidden units in the cage, then we completely remove them as if they were killed.

Other Tricks

In order that the bot did not immediately spend all the actions, and then did not sag without commands, it was done that the allotted actions were spent evenly over 60 ticks.

When the enemy has 100 ticks for the next nuclear strike, each next action is performed 1 tick later than they could. This helps to “reserve” a certain number of actions before a nuclear strike to move the units apart.

Movement on the course of the enemy. If the PP orders to go into the cage, and there are enemy units in it, which the group can attack, then the movement is not in the center of the cage. According to the average vector of movement of enemy units and the distance to them, a point is predicted along the course of its movement, in which they can be intercepted. And the group is sent immediately to this corrected point.

In combat, it is important that the units stand as closely as possible pressed against each other, without voids. This gives a more concentrated fire. The weather and the landscape affect the speed of the units, and if nothing is done the dense initial formation of the group gradually splits, stretches ... To determine the density, you can divide the area of the circle containing the group by the total area of the units. Result to 3 - I considered tolerant. From 3 to 5 - sparse. Greater than 5 - very thin.

You can fight this in two ways:

Periodically make the SCALE team inside the group. Sometimes it works badly, because the units are lined up in long sausages that block movement.

Then after compression, you can perform a spin around the center of the group, and then compress again. So repeat several times. But even this does not always help, since units tend to get stuck even during rotation.
Initially, while the group is compressed, always move it with the minimum speed on the current path of movement. Those. for example, if the path of fighters runs only through clear-weather cells, then the speed is not limited, if there is at least one cell with a thunderstorm, then immediately when moving, the speed is set for thunderstorm weather. With this approach, the group is almost guaranteed not to fall apart. I used this approach for air units. But not for the earth, because the maximum speed is important for the earth, for quick capture of buildings. Aviation of the building does not capture speed and it is already enough.

When choosing to capture buildings, priority is given to the factories. It seemed to me - the command centers do not have too much influence on the battle. Moreover, ground combat units have an even greater priority in capturing factories. Thus, if we assume there are several command centers near the launch site, and there are several factories closer to the center, tanks and BMP will go to seize factories immediately, the more vulnerable repairmen will remain to seize command centers and arrive later.

Each turn, a building capture strategy builds a “optimal” capture plan. “Optimal” is in quotes, because it is very rough and approximate optimality, and it could be improved for a long time. Then the strategy checks with the current execution, in descending order of priority of capture. If a deviation from the “optimal” plan is found, then the implementation is immediately corrected. Thus, the capture strategy is sensitive to the changing situation on the battlefield.

With the capture of enemy factories, a group of capture becomes clinging to the corner, where the produced enemy units appear. This does not allow new technology to accumulate.

In general, new units, while they have not yet gathered in the group, are a convenient target. Therefore, there is a separate strategy for the murder of such free units. Its essence is to become a group on the production line. It is very convenient to set free fighters in such a way, blocking the enemy’s production of his fighters and helicopters.

What is missing this bot to achieve perfection?

No synchronous attacks in multiple groups. Suppose if I have 2 groups of 30 tanks each, and the enemy has 1 group of 40 tanks between them, then they will each be afraid. And ideally, they should have simultaneously attacked from different sides. It would seem quite rational behavior for a person, but never realized how easy it is to program. I didn’t want to spend too much time on this maneuver, so I put it off until later, but my hands didn’t make it.

Insufficient optimal capture of factories. Essentially, I have a greedy algorithm. It would be possible to reduce to something like the unopened task of a traveling salesman, and find a plan to capture all the factories, and not just the closest ones. But this would give some kind of improvement only on maps where there are many factories, but this is not such a frequent case, so I did not bother.

Bad micro in battle due to the primitive battle analyzer. This was clearly demonstrated by the Milanin bot, which is 2 heads taller than everyone else in terms of peace. He, controlling phalanxes of 20 tanks and 20 repair machines, sometimes manages to shoot out superior forces without loss, skillfully moving and rotating these phalanxes with tanks towards the enemy. Unfortunately, Milanin has a very slow initial card capture due to the assembly of those very mixed phalanxes. Therefore, even the most top-notch bot can crush it with purely the number of units produced in the more quickly captured factories.

Conclusion

As a result, in my opinion, I got a pretty well-knit bot, with the absence of frankly weak points. When I watch replays, I do not notice too critical errors. Something beyond the genius in it. Before the start of the finals I was not sure that I would win, because in test games with other tops, such as Adler, oreshnik, Milanin - the results were unstable. But, as it turned out, it was very unexpected for me - the first place by a large margin.

The source code of the bot is available here .

Source: https://habr.com/ru/post/345458/

All Articles

The story of the victory at the annual competition Russian AI Cup 2017

More articles: