In December, the Russian AI Cup 2016 was completed - the annual artificial intelligence programming championship organized by us. Championship for clarity, clarity and simplicity is held in a game format.
This year, participants created an algorithm - a game strategy for a MOBA game. The resulting bot fought with others of the same, and the best of them won the round. Thus, a series of rounds turned out a tournament that takes place in several stages.
The alignment of the main prizes at the end of the championship looked like this:
Two of the participants have already posted articles on Habr based on RAIC2016 - 1 , 2 .
In addition, there is a special thread on the championship forum, where participants shared their experience in writing strategies and told about the key points of their algorithms. Some of the algorithms turned out to be very interesting, and we are in a hurry to share them with you. We also selected several understandable articles for you explaining the key principles on which strategies are built.
It is very difficult to consider a game strategy without knowing the rules of the game. If you have never heard of the Russian AI Cup, then we recommend to see our first article about it. Or look at the short rules under the spoiler.
To better understand the mechanics of the process, read the brief rules of the championship. You can view the full version on our website .
The game world is two-dimensional , and all units in it are round. The game area is bounded by a square, the upper left corner of which has coordinates (0.0, 0.0), and the side length is 4000.0. No live unit can leave the game area.
Time in the game is discrete and is measured in ticks. At the beginning of each tick, the game receives from strategies the desired actions of wizards in this tick and updates the state of wizards in accordance with these desires and limitations of the world. Then the calculation of the change in the world and objects in it for this tick occurs. The process is repeated again - with updated data. The maximum duration of any game is 20,000 ticks, however the game can be terminated early if the team goal of one of the factions is achieved or if the strategies of all the participants have “fallen”. The "fallen" strategy can no longer control the wizard.
Detection of units on the map is limited by the fog of war . The participant's strategy will only receive data about those units that are within the range of view of the wizard himself or any other unit from his faction.
In the world of Code Wizards, there are six classes of units , some of which, in turn, are divided into types: wizards; projectiles (magic missile, ice arrow, fireball and dart); bonuses (gain, acceleration and shield); buildings (faction base and guard tower): minions (orc-woodcutter and fetish with darts); trees
Wizards, buildings, minions, and trees are living units . Their main characteristics are the current and maximum amount of vital energy. In the general case, when the vital energy drops to zero, the unit is considered dead and is removed from the game world. Wizards - the only living units with the ability to regenerate health. For each tick, they automatically restore a certain amount of vital energy. The regeneration rate is a real number, as a rule, less than one. A wizard is considered lost if the integral part of his vital energy drops to zero.
Every 750 ticks, the base of each faction generates three detachments of minions : one per track. Each squad consists of three orcs and one fetish. The detachment immediately rushes along its path to the base of the opposite faction, attacking all opponents on the way. Wizards use minions as cannon fodder, try to keep themselves in a safe zone and attack the enemy from a distance.
With a certain probability, neutral minions can appear in forest areas. Usually they are not aggressive, but when they damage one of them, all neutral minions in the vicinity rush to the offender, attacking anyone who stands in their way.
Every 2500 ticks on the card may receive a bonus . If there is already at least one bonus on the card, then a new one will not appear. The bonus is created at a randomly selected point of the two possible: (1200, 1200) or (2800, 2800). If any part of the bonus appearance area is already occupied by the wizard, then the simulator will try to create a bonus at the second point. If unsuccessful, the creation of a bonus will be postponed until the end of the next interval.
The collision of live units between themselves and with the borders of the card is not allowed by the game simulator. If the distance from the center of the live unit to the center of the projectile is less than or equal to the sum of their radii, then the live unit takes damage and the projectile is removed from the game world. In doing so, the fireball explodes and causes damage to all living units nearby. If the distance from the center of the wizard to the center of the bonus is less than or equal to the sum of their radii, then the wizard for 2400 ticks acquires a magic status depending on the type of bonus.
And then - interesting algorithms from the words of the participants themselves:
In my strategy, like many, positioning in combat works on potential fields . This is the simplest thing that could come to mind. What it looks like:
(4x acceleration)
At each point of the map are attracting (+) and repelling (-) fields. Linear functions depending on the distance to the point / segment. Here are the main ones:
(-) enemy wizards
(+) enemy wizards that need to be finished
(-) enemy towers
(+) enemy towers that need to finish
(+) (-) enemy minions to retreat when too close and run up for melee
(-) allied units: just in case there is room for maneuver
(+) place behind the enemy throne
(-) respawn minion points
(+) Point indicating the direction to the bonus: so as not to run towards him at breakneck speed, but to take into account dangerous zones in battle
(-) forest (just repulsive triangles, so as not to go there once again)
(-) map walls and corners
(-) places where they can clamp between the wall and the tower
At each tick, an offset was chosen to the point where the potential is greater. The best I’ve come up with to avoid getting into local maxima is to take into account not only the next tick, but 15 ticks ahead (in the chosen direction). The further the prediction, the less it affected (exponentially decayed).
Dijkstra's algorithm was used to find the path, the code of which moves from year to year. The entire map is marked up with an 80 × 80 grid. You can navigate in eight directions (adjacent diagonally and laterally). The fins on which there are trees were fined in proportion to the health of these trees. Further, after the path is built, it is “smoothed out” to remove the effects of the partition into the grid, and a list of trees to be cut is built.
It also took into account the penalty for moving to another Lane, entering deep into the forest, the number of enemies along the way, the final position (as it is advisable to stand on the line behind your minions, and not on the side of the trees).
For the first round, a more or less finished version of the dodge was used. Moved 20 directions of departure, and then, whether at the same time to turn. Then it seemed that everything was taken into account, but later I rewrote everything two times and almost every day I found bugs and shortcomings. If several projectiles or Fireballs are flying at the same time, then a departure was chosen with minimization of the total damage.
The main branch of pumping became Fireball. I went through the angle from which he would be released, and determined what range he should fly. Also, the Allies were given info about the firebolt just released. Theoretically, this could be done in a normal way, but I did it through “dancing”: I rounded move.Turn to five characters, the other three decimal places were at my disposal.
The second branch of the rocked Range, but for the finals made only RangeBonusPassive1, then Haste.
For the finals to the last did not want to use the scheme 0-5-0. Began to try with the scheme 2-2-1. At the beginning of the final week, it even won consistently (for everyone except Commandos, Antmsu, Recar: 50 to 50 with them), but I understood that this was not for long. Since 0-5-0 did not go, I decided to develop what is. As a result, in the final, I played the version from 1-3-1 with rebuilding “according to the situation”.
→ Code
1st round. I went only to the center due to the fact that it is more convenient to take bonuses plus the line is wider. Since the strategy was implemented through potential fields, in order to avoid lagging between bonuses, a weak attraction force for each of them began to act 700–800 ticks before their appearance.
Depending on the location of the minions, the enemy and allied wizards, as well as the enemy tower and its cooldown (I did not take into account these dependencies directly), my wizard chose one of the parties to take the bonus. And for about 300–400 ticks, the weaker bonus turned off.
Near the bonus itself, when there was my wizard and another allied one (if there were more enemies, I still went for the bonus to the last and often died, since the bonus was considered a higher priority than everything else), I used a small “killer feature” ": I got up at the point between the bonus and the allied wizard, and when he tried to get around me to get closer to the bonus before his appearance, I moved along a smaller radius to be back between the allied wizard and the bonus. If there were more allies, then it was impossible to do so.
Also in the first round, he implemented the dodgers, at that time they worked well: most of the wizards fired exactly at the center, and it was easy to run off during the flight of the projectile to the side. He began to run at the moment of the shot, but not when he saw the enemy magic missile.
In fact, the more advantageous point for shooting is shifted by about 7-8 pixels from the center towards the turn of the wizard (due to the difference in the speed of movement in different directions). Later, in order to choose the best point to shoot (within one wizard), I simply went through the corners, starting from the center of +/– a couple of degrees, just for the case when the enemy wizard is sideways at the moment of the shot.
At some point, many began to dodge in the top 30 sandboxes, and I also noticed that many are starting to dodge the magic missile at the moment of the shot. Initially I wanted to collect statistics, who runs a tick earlier, to dodge, and who does not (it depends on whether it is worth shifting the shooting point to move.speed). But he came up with a simpler solution, which was the chip giving such an advantage in micro. At that moment, when I could shoot at an enemy wizard, I did not shoot, but waited for one tick. In most cases, this allowed me to guess the direction of movement in the next moment and get into the evading wizards from a little more distance.
It is also worth saying that from the very beginning I wrote my own strategy visualizer, without which debugging would take much longer.
2nd round. The Fireball spell had an indisputable advantage in terms of causing the amount of damage per unit of time, and besides, you can dodge it only if you stand very far away. Also from the first round, I implemented a melee with minions, so this branch was just perfect compared to others.
Accordingly, the enemy main building was able to destroy, on average, a little faster than the enemies did. At the time when I just learned to use this spell, in 35 games out of 42 I won (in 10 * 1 + mode). Of course, then this branch began to be chosen by many, and there was almost no advantage.
For the rest, he began to go for the bonuses more carefully, to die less often on them, plus at the beginning of the game he chose the line, where he had the least number of allies, in order to gain experience as quickly as possible. The micro at that time was not yet finalized, so I did not choose another development branch with the numerical advantage of the enemy on the line, although this would have brought a little more points in the round.
Most of the time of the second round I spent on simulating all the units on the map to predict whether it is worth going to defend the base or whether it is more profitable to convey the enemy, and also to determine who will shoot the tower at.
The simulation worked quickly and more or less accurately, but not enough, and in the end I took advantage of it only to throw the Fireball with a lead in the minions for inflicting maximum damage.
Inaccuracies appeared due to the fact that I did not finish the wizards' behavior. And besides, the balance turned out to be such that it was rarely beneficial to go to defend oneself. Also, this simulation could be used to predict whether it is worthwhile to attack or retreat in a mass battle. In fact, this is solved quite well with the help of the less resource-intensive method — the Influence map, which takes into account the effective number of lives, the cooldowns of shots, the auras, the characteristics of the characters, but I managed to implement this method even to the final.
The final. Right up to the last day, I didn’t want to go to the 5th by the midpoint, so I tried 1-3-1. Against the active rush I tried to minimize the time of passage on the side (here the melee branch was the most profitable), and it turned out to be done in about 6 thousand ticks.
Literally one day before the final, we managed to correct several shortcomings that were present in the strategy from the very beginning: get up in one line in a mass brawl, do not go for bonuses with all wizards, move correctly at increased speed, take into account the effect of speed auras on enemy units to make less misses when fired, and do not chop off extra trees.
These corrections allowed to fight well 5 Ă— 5. Plus, the micro improved slightly due to different coefficients. I also noticed one feature: the wider the construction, the better on average it was possible to conduct battles, but most likely this was due to the fact that the command selection of the target was not realized and one enemy wizard my five were afraid just like one ( kept at a distance of approximately enemyCastRange + 20).
“Artificially” I increased the level of aggression depending on the number of enemy wizards on the line with me. Before the second part of the final, I couldn’t manage to get into micro NighTurs, so if several conditions were met by one of the wizards, I would go to another line in the hope that he would destroy the enemy’s building faster than my opponent’s five wards would push the four of them along the center line.
But in the end it worked with a bug, plus luck was not on my side, because before the final with the same versions against NighTurs the score was 5: 0 in my favor, but during the final - 1: 4 in favor of NighTurs (after the final - 2: 0 again in my favor).
After the finals there was no joyful feeling of the winner: it seemed that the strategy was far from ideal, and the gap from the second and third place was so small that the victory was largely due to luck.
And the last “killer feature”: during the years of studying at the university I played a lot of DotA and was not bad at it :) Of course, I joked about the feature, because the playing skills gave only heightened interest at the beginning of the competition.
The basis of the algorithm is potential fields: we look at 32 points around + below us, we are moving in the best direction. Initially, the points were not taken at the same distance, but at the maximum step, but I turned in the same direction (and not the direction of the gradient), and the bot very reluctantly walked diagonally. As a result, I began to always walk in the direction of the antigradient, although now I already think that it was only necessary to turn there, and go to the best available point for the maximum step.
In order not to fall into a local minimum when walking (he turned off in battle), he added "hills" to his past positions. The idea is taken from the video:
In principle, I had four types of fields: the current “goal” (waypoint, bonus), “obstacles” - the field grew quickly around them, but almost did not grow at a distance, “danger” and “target”. The last field turned off the “target” field and was a tricky formula that took into account the position of the target and the point of retreat (so as not to run into the forest).
Danger was calculated for all units differently, but took into account my speed, their cd, attack range, current minion targets, current tower targets, magicians, and turn time for me.
T1 - the point of retreat, T2 - the goal to which it is drawn, plus its danger
Like many, for optimization I “chopped off” the world, leaving only significant obstacles and objects.
I walked along the waypoints + Dijkstra's algorithm, but upgraded them to points that had the "influence" of the teams and the "owner". To capture a point, it was necessary to hold the influence over it for a certain time with a certain force.
Dodges: the most interesting and problematic. I had three versions.
First version. On potential fields, he added a repulsive danger field to every future position of a searchlight. The main problem: if two projectiles fly at you, then the magician gets up in the middle and catches both.
The second version. I modeled a departure in 32 directions + to stand still until I’ve got my head up / put myself up / put up, took where I get less damage (for the dies, the damage was too high). The remaining directions served the function of walking along potential fields - and lived so happily until the end of the second round.
Main problem: does not evaluate not yet released projectors. More danger fields from magicians took into account the caste of hedge, but did not consider that if I am very fast, I can come closer and still dodge everything. I later used this version of dodging to assess whether to shoot at an enemy or not. If the damage on it before the addition of the "probabilistic" MM to the world is less than after, then I shot.
Then he began to add more projectors from Allied wizards, ready to shoot the next 5 ticks. This slightly increased the accuracy and cunning of shooting in groups. I fired not at the center, but at a point from which to run in all directions the same time:
public static V2d getShootPoint(V2d aimPos, V2d aimAngle, double projectileRadius) { return aimAngle.copy().mul((projectileRadius + 35.0d) / 7.0d).add(aimPos); }
Third version. In the collection of projectors added those that can be released by magicians in the next 50 ticks. For each direction, they generated their own, so that the magician would always shoot at the most inconvenient point.
Went around like this:
I added damage from “probabilistic” projectiles with exponential attenuation over the distance that the projectile would fly over, so that the magician would run away from the too close mages, but did not think: “What difference does it make that they fall on the border one pixel from the safe zone?” .
This is the most difficult and the most crutch part of the project, because you had to edit a lot, finish, remove extra fearfulness, check when you can turn for a shot, and when you need to dodge 100% (if I couldn’t dodge real project files, then I didn’t even rip ). Do crutches on the fact that the magician may slightly run up.
Here is the third version allowed to remove the field of danger of the wizard, come closer to it and still dodge the shots (even those that have not yet been made).
This is probably the basis of the algorithm. Of course, there were a lot of ticks, I fell in time with the third failures, a lot of errors and bugs in the constants, some bugs in the constants made me rise almost instantly 10-20 places higher (in the top 100).
Conventionally, I will divide the algorithm into two parts.
The first part of the algorithm is obstacle avoidance. The second part uses the first, but solves problems with cutting down trees and estimating the length of the path (the first was able to bypass, but could not estimate the distance to the point).
First:
In fact, this is not a search for a path - this is a quality bypass of obstacles. The algorithm is built around groups of related objects and tangents to them.
If you watch the video, you can see that there are a lot of purple lines between the trees. These lines are not present in the algorithm, but they were created to visualize groups. At the very beginning of the video and at the 40th second, where the magician goes into the forest, it is very clearly seen that we have two large groups in the “left” forests, the magician between them and passes at the 40th second.
We have a set of objects and a set of groups, which is initially empty.
We take an object from a variety of objects and check if it falls into one group (it is close to one of the group objects) - then we add it there. If more than one, then we merge these groups and add a new object to the new group.
Next, we determine where to move in the current tick. At the entrance we have a lot of groups, a mage and a point where we want to go. I would call this algorithm this way - throwing rays, although in reality, all the same, these are not rays, but segments: they have a maximum length.
First, we throw the beam to the point where we want to go. We find the intersection with the group (object from the group), which is closest to us. We pass through all the objects of the group, consider both internal tangents to each object of the group and find those that will give the maximum angle of deviation from the current beam - there will be two (one to the left, the other to the right).
Further, according to the magic logic (in the first variant of the algorithm, the function of choosing from two tangents was difficult, because choosing the one that gives the smallest deflection angle is very bad: when you come close to the edge of the group, this angle often becomes maximum) choose one of the two. The touch point (taking into account, however, the radius of the magician) will be our new point to go. After that, we again throw the beam, but at a new point, after removing the group with which we have already crossed.
The first algorithm had several flaws. I did not like the function, which tangent to take from two; I could not estimate the distance to the point; did not cut trees.
The second algorithm used the same function to bypass obstacles, but before that he considered an approximate path and removed trees from it.
First, I took the Lee algorithm (wave) and built a wave from myself on the whole map (the map is represented by a 125 Ă— 125 grid). Buildings and minions were considered impassable, but the trees simply had an overpriced passage price depending on their hp and distance from the center (the tree can fall on several cells at the same time, and it would be illogical to mark all these cells with the same weight).
Those who are familiar with the Lee algorithm will be indignant: this algorithm is able to find a way, but not take into account the cost of the transition, and they will be right. For this reason:
a) I considered the wave to the whole map;
b) the algorithm was slightly crossed with Dijkstra. In fact, it can even be called Dykstra, but not Lee, where the graph is represented as a 2d-array.
In general, it is difficult to say that this is Lee or Dijkstra, for the detour I used is from Dijkstra, but with optimization, and I kept the data as for Lee.
As a result, we have a map, we can find from any point the best way to the mage. Since we sometimes need to look at several paths for a tick, this also had an advantage over a *, which for each path needs to be recalculated again. And for full optimization, I also update the card itself every 60 ticks. If a cell changed over 60 ticks (which always happened several times), then I used a quick update: the previous cell overestimated the weight, and the new one underestimated it.
The next stage is to cross it with part one ... Everything is simple, take our path, cross it with a circle with a radius of 300 (I don’t remember why I chose this one, at first it was 600); the intersection point is the point to which we will consider obstacle avoidance. The solution helped to simplify the old bypass algorithm, since now it was possible to take the minimum angle of displacement from the direction vector: the vector itself was short and already indicated where to go.
The last stage is felling the forest. At the entrance there is a path represented as a segment of small length (built on the basis of a 125 Ă— 125 grid), and a multitude of trees. Task: to find trees that can be quickly cut down, which cannot be passed around and next to which (or through which) the path passes. The algorithm is simple, but there is a bug in it - sometimes it cuts too much. When I did it, I already understood all the meaninglessness of what was written and did not catch up with him to the ideal.
Algorithm: go along the path and find the tree that is closest to our current segment. We are convinced that this tree actually suits us (it cannot be bypassed). To do this, figuratively divide the playing field into two parts: on one side of the path and on the other. If there is such a tree that belongs to the other side of the path, and if it is impossible to pass between these two trees, then the current tree must be demolished.
There is a problem: in places where the path passes close to the middle of a tree, the tree ends up in the wrong half of the way. As a result, simply did not solve this problem, because it is not fatal.
Why does all this work? When a path is built, it passes in such a way that the radius of the tree affects the price non-linearly. Therefore, trees that are easily cut down are estimated as if there are almost none (very small price), and large trees are comparable to almost impassable points. And as the weight of the tree fades with time, in most cases the path goes between the trees, but closer to the smaller ones. Due to this, a small tree is chosen as a log house. Traversing now goes not globally, but locally (radius 300), and those trees that need to be cut down, I delete when searching for groups (so that the obstacle was calculated without them), so the first part of the algorithm shows itself even more: it does not need go around obstacles for miles, she passes them here and now.
Almost the final version of the movement can be viewed on the video:
But in real life is not so good. Sometimes the point where to run changes right in the middle of the forest, sometimes the creeps are chasing you and you have to choose: either to chop down the forest, or to beat the creep ... and many more minor factors.
→ Link to code
The path search is in the Algorithms / A_PathFinder files. Obstacle avoidance - in Algorithms / A_Move. Building groups is in Environment / E_World.
Various 2D mathematics - in Common / C_Math.
And a couple of words about the other side of the Olympiad - about time:
Total: 170 hours.
On the search path I killed: 49 hours.
To maintain its architecture in a more or less normal state (about 70 classes - 140 C ++ files): 28 hours.
Turns off: 26 hours.
Everything else: 67 hours - edits of bugs, adjusting coefficients and other more boring things (line selection, danger map, understanding when to attack and when to run away).
Concept: it was based on a set of states in which a strategy can move depending on the conditions and the current situation. Such states can be: movement to a bonus, attack, finishing with a tower, dodging from an attack, etc. Each state can affect other states — in most cases, override (cancel) the effect of the state. Since the code is actually linear, all the states are simply checked each time and those that have the conditions are activated. For example, if we are standing on the line and it is time to take a bonus, then in addition to the time, various conditions are checked: is the upcoming front coming, are we having advantages, are we within the zone where we can withdraw for the bonus, or bring the base.
Some mistakes: it is possible that because of this amount of properties, there was really not enough time for detailed implementation, so almost until the last moment the movement was very clumsy: the key points from the quick start strategy + the ability to move back, but then it was fixed.
Also, for some reason I did not think that damage could be inflated further than the distance of the caste, and much further. Almost before the start of the second round, many did not take into account this fact, and everything was OK, but then my stratum began to lose, and I began to notice that I could attack from a greater distance, and also somewhere on the forum it sounded too.
Movement: the main point is the recursive function, which allowed calculating to a depth of 6 ticks on a 7 Ă— 6 grid and a threshold of 25%. The essence is this: we go through all the steps to the maximum distance around us on the grid, seven steps along the long side (forward and back) and six along the short side (left and right) and evaluate this position: can we stand on it, and if so, do we impose different fines and bonuses - this is exactly the place where there is a use of PP. Next on the threshold we screen out this set of points according to the evaluation function and repeat the recursion for the remaining ones. The evaluation functions are linear depending on the distance, and the attack of the opponents, the cooldown before the attack, the radius of the trees were taken into account - and even an attempt was made to take the Allied minions into account so as not to get stuck. Also, the movements of all other units are also simulated for 6 ticks, but in the direction and with the same speeds that were recorded before the recursive procedure was started. The trees were also fine. If there were no enemy units around, the miscalculation was simplified to save time.
Some wrote above that recursion is a long process due to nesting and multiple distance calculations. Through the profiler, I saw that the hypot (x, y) function took all the time in this procedure and, having run the net, I found that it works much longer (about 40 times) than the simple sqrt (x x + y y). In this case, where the coordinates lie in the range 0 ... 5000, the hypot function is not needed at all (I suggest the reader to look for himself why hypot is better than sqrt). And in the end I decided to abandon the extraction of the root, where it is possible, and thus translated everything to the sum of squares. After all, even the distance between units can be estimated without calculating the root: the sum of squares is sufficient. Multiplications and additions are performed faster than root extraction; as a result, it became possible to count to 6–7 ticks with a sufficiently detailed grid (or tree), and at the last level there could be 3-6 points.
The only negative - I could be in a local minimum between the trees and often stuck. The reason is that the allowable distance to the trees themselves is too close, as well as a small number of ticks of the recursive function. I did not struggle with this, but decided to make sure that my magician would destroy trees that might prevent him in the future.
After changing the balance after the first round, I kept around the top 30-50. For 5 Ă— 5 battles, I also tried many tactics, but I stopped at five in the center, and this was both better and faster. But again - there was a lack of a more accurate assessment of the opponent. The penalties were too rough for the top 20.
Strangely enough, for some reason I decided that fire was not the best option, because if you do not give the enemy fire on the line and speed up the attack speed and speed of attack, then you become positive. I tried different tactics, and through speed + frieze, and through fire + range. All of them are in the code. For 5 Ă— 5, I wore one mage for a range + speed, the rest for fire.
Link to the code (for some reason, in this project and only in this I decided to write Russian comments; there is quite a lot of commerce, and I think you can figure it out very easily, wrote in C ++).
Now we are conducting a series of ML championships on our ML Boot Camp platform. In it there is a place for everyone: from a beginner to a professional. For beginners, we offer training articles on machine learning, and the pros - a serious test of strength by our task. Join us , the next championship in a couple of weeks!
The latest news about all the championships organized by the Mail.Ru Group can be found in the official telegram group
Source: https://habr.com/ru/post/325050/
All Articles