📜 ⬆️ ⬇️

Russian AI Cup 2017 - the story of second place

Hello! In this article I would like to tell you about my participation in the Russian AI Cup CodeWars game writing bots competition, in which I managed to take 2nd place, and how and what was done for this.




Announcement and first impressions


I must confess that the published announcement of the competition did not please me very much. The limit on the number of actions, the selection frame, automatic shooting units ... Everything looked so that there will be no search for ways, no aimed shooting, and certainly no microcontrol. But microcontrol - this is what so fascinated me in the games tops competition last year. I myself then did not achieve much success in this area, but I hoped to catch up this year. But to miss such an interesting competition, of course, it was impossible.
')
The positive moment of the early announcement was that other participants picked up and shared a set of feature articles on the development of strategies for RTS. Well, it was already possible to ponder the principles of managing groups of troops.

Beginning of work


So, initially we were available 5 types of troops, 100 units each: tanks - powerful, but slow units, BMP - more mobile units, suitable for the destruction of aircraft, BREM - repair vehicles, defenseless to enemy attacks, helicopters - powerful weapons against tanks, but extremely vulnerable to fighters and fighters - fast, but dangerous only for helicopters. Among the available actions were selection, deselection, addition to and exclusion from the group, as well as rotation and compression. Only 12 actions in 60 ticks were available for the strategy.

After the opening of the competition, for several days I could not begin to develop a strategy, since it was not very clear which side to approach it from.

In order to start somehow, I slightly modified the quick start (Smart guy) provided by the organizers. The starting strategy consisted of the following: once in 300 ticks for each type of technology, the center of the formation was obtained from the averaged coordinates of all units, one preferred type of target technology was selected, the group was allocated and received a direction vector towards the target. BREM went just to the center of the map.

I expanded the list of goals for each type of technology; in the absence of targets, added escape from a dangerous type formation. This first version looked very sad. The groups stretched, collided with each other, got stuck, parts of the groups broke off and went somewhere to the horizon. Enemy groups were also sometimes torn apart and my units tried to attack the empty space between them. However, one had only to try to start improving the strategy, as new ideas of its development began to appear.

By this time, tornado-like strategies were popular in the sandbox - all types of equipment were attached to one center and twisted for compaction. And then a more sophisticated technique appeared, which the participants dubbed the “sandwich” - the troops were already mixed not anyhow, but in alternating, even rows. As a result, a very powerful system was obtained, since the dissimilar equipment covered each other, and the BREM eliminated damage. Such "sandwiches" in a few tics absorbed the "tornado", not to mention my stupid homogeneous detachments. Such a turn made me very sad. It seemed to be useless to fight with “sandwiches” by separate units, but I didn’t want to play “tag” with the starting random arrangement of the units and hardcode the formation of a “sandwich”. As a result, I again abandoned my strategy for a couple of days.

And here came the good news - since the organizers also did not like the dominance of “sandwiches”, they added the possibility of a nuclear strike, which could hit a large number of units at once, while being at a relatively safe distance. This inspired me, and the “beginnings” of the sandwich were relieved to be removed. The irony, however, is that the tactical nuclear strike ultimately turned out to be in the hands of the “sandwiches” rather than their opponents, but this was not immediately apparent.

Since I planned to manage groups of troops, it was necessary to detect them qualitatively, for which clustering was used. DBSCAN seemed to me the most suitable, since it recognizes an arbitrary number of clusters of any shape. The division into groups was carried out for each kind of equipment separately as follows:


Immediately, I note that I have never used the purpose of groups, only the selection frame.

Each calculated tick of the group manager compared the groups obtained after clustering with the groups obtained in the previous steps and based on the position and at least partial coincidence of the equipment in the groups, established a correspondence between the old and the new version, which made it possible to track the group's history.


Clusters painted in different colors

In my opinion, the management of groups turned out quite successful. Groups could break or unite as much as they wanted - new formations immediately received control without any additional effort on my part. After a while, even collisions of disparate groups were successfully resolved on their own.

Now it was possible to fasten the group to the starting strategy. Initially, the control equipment looked like this:

delayedMoves.add(move -> { move.setAction(ActionType.CLEAR_AND_SELECT); move.setRight(world.getWidth()); move.setBottom(world.getHeight()); move.setVehicleType(vehicleType); }); delayedMoves.add(move -> { move.setAction(ActionType.MOVE); move.setX(targetX - x); move.setY(targetY - y); }); 

It was not very convenient, so I had to develop a system for working with actions. Creating a team on the move I got this :

 Action.moveTo(group, targetX, targetY); 

where the Action object contained grouped actions, which at that time were exactly two cases for all cases (selection and the action itself). In this case, the usual coordinates on the map instead of the motion vector were used as parameters.

ActionManager worked with actions. At an early stage, it represented a queue with a priority, where each tick, with an accessible action, placed commands for each of the groups. The manager deleted the teams of the disappeared groups, replaced the old goals in the queue for more relevant ones. Also for each group, the last performed action was stored and, if the newly arriving action had a similar direction angle and a close target with the old action, the new action was not added to the queue. This was supposed to help save the available moves. This manager came out pretty primitive; he could not perform complex chains of commands or deferred actions. Therefore, in the future, more complex actions had to be processed at a higher level.

So, I had 1100 lines of code, clustering, groups, convenient management, managers and ... the logic of the smart guy and a complete misunderstanding of what to do with all this.

Potential Fields


Initially, I assumed that I would define the actions of my groups as influence vectors — for example, to determine the final direction of movement, it would be necessary to add the vector to the group that is planned to be destroyed and the repulsion vectors from their groups and dangerous enemy groups. But it seemed to me too complicated and I decided to try fashionable potential fields (PP).

To calculate the charges of potential fields, it was necessary to understand which groups are dangerous and which ones are worth the attack. For example, five helicopters should be afraid of three fighters, but at the same time 25 helicopters will destroy one fighter, receiving minimal damage. The outcome of the battle could have been simulated, but I wanted to start off with a simple calculation. Having written up several sheets and having had time to despair, I finally decided to revise the literature read earlier.

The necessary information was found in this dissertation . In this paper, among other things, various models of predicting outcomes of battles in RTS are considered. In the authors' experiment, the Target-Selection Lanchester's Square Law Model showed the best results, giving 94.45% ± 0.34 correct predictions. I was quite satisfied with this result, besides, the calculations approximately coincided with my “paper” simulation. Compared to the actual game situation, the model was a simplification, since it did not take into account the formation density, firing range, cooldowns, etc., but I decided that at the initial stage this can be ignored. In fact, this calculation remained so basic and unchanged until the end of the competition. I think that this successful find is one of the key elements of my strategy.

Information about the potential fields in the network is not very much, even less was read by me, so I got some kind of execution of the fields, which I will try to tell in more detail.

The articles about potential fields usually depict fields calculated for the entire playing field. Similar screenshots were shown by some competitors. But, since there were practically no static objects on the map (I did not take into account the weather and terrain), which could have been calculated in advance, and the situation on the map was quite dynamic, I felt that it makes sense to count the field for movement in the very near future, that is, in the immediate vicinity of the calculation group. Calculation of the impact on the entire field was used only for visualization during debugging.


the estimated field of the fighters attracted by enemy helicopters on the right (red is stronger than yellow)

To calculate the best direction of movement of the group, a 16x16 grid was used. How close is the calculation of influence? On the one hand, in order to save actions, we wanted to select the farthest destination point possible, but on the other hand, if you select the cell with the greatest potential at a relatively long distance, then the probability that the path to the best cell will pass through the cells that are harmful to us . I decided to calculate the values ​​of charges on three cells in each direction from the center of the group. The charge for the enemy group was set by the estimated difference of the killed units, so the group, in the battle with which the losses of my troops were calculated more than the enemy, generated a negative charge.
The decrease of charge was considered according to the law:

charge=scoreMath.pow(0.8,Math.sqrt(dist));


where score is the calculated difference of points; dist - distance in the grid dimension. At the same time, a positive charge influenced any distance, and a negative charge had an effect limited by the attack distance and the group radius. Negative limited influence fields also generated map edges and their own groups to avoid collisions. Initially, the values ​​of charges from all sources were summed.

This version had a lot of flaws - groups stuck together (because of what I refused to divide groups into 4 parts, as I thought temporarily), smeared on the walls (borders of the game world) or vice versa so afraid of the walls that they could not finish off single enemy units. But, finally, 4 days before the start of the first round, I had a strategy that consistently won the Smart guy!

Nuclear strike


For the time remaining before the round, it was necessary to add a bomb, which looked quite effective and effective weapon.

The strike was originally made quite simple: the enemy groups were sorted by size (for this, we had to do another clustering, without distinction of types), and for each group, the size of which was not less than a quarter of the largest group, my group was searched that could strike at the center of the enemy group so that only the minimum number of my units fall into the strike zone. The strike group stopped and was controlled until the moment the bomb hit, so that the group did not receive any other commands and remained motionless.

To evade the enemy strike, all the units falling into the strike zone were allocated and SCALE was made for them, that is, all units were directed away from the center of the strike. Since this action was performed outside the groups and it was not very clear which groups would be affected, and how the expansion would affect their composition, no other actions were performed for anyone during the evasion time. The duration of the "raspryg" was calculated taking into account the cooldown and after the strike was carried out with the same duration.

It didn’t work perfectly because quite often the cooldown was interfering with actions, as well as numerous bugs that periodically had to be corrected almost until the end of the competition.

40 minutes before the start of the first round, I discovered an unpleasant bug: by the time when the queue approached the selection action, the saved coordinates of the group were already irrelevant, which led to breaks in the group. I had to urgently correct and in a panic send the last package 3 minutes before the start of the round.

Despite the fact that at the beginning of round 1, the bot showed a steady growth in the sandbox, I was not very sure about it and seriously feared that I would not get through to the second round. However, in the round, the bot performed well, receiving 92.9% of victories and taking 35th place. A random role was played here by the random, since there were almost no top opponents, but still I was convinced that I was moving in the right direction.

gif is a typical first round game

groups are attacking the "tornado"

Preparation for round 2


Due to the difference in speeds on the terrain, the groups stretched and broke when moving. To fix this, I, like many, added periodic compression . She also corrected unpleasant behavior - a single unit that had broken away from the group began to push it off strongly, knocking her off the right path. Having decided that since it is of little use from single units, it is better for groups of one type instead of repulsion to make pulling for the case if one of the groups is less than 10 units or two times less than the other. The charge was equal to the size of the larger group. Now rasterska quickly glued to the main group.

The calculation of charges from the centers of the groups did not work very well, since the groups could be of very different forms, but I wanted to get more accuracy in order to better look for weak spots in the “tornado” and “sandwiches”. In order to bring the field to the detachment form, the group was further divided into 16x16 cells . Now the charge was calculated not from the center, but from the nearest cell of the group.

I also tried the idea of ​​generating charges from each cell of an enemy combined group, calculating the result of the battle relative to the part of the units of my group that are in a commensurate cell, but the solution turned out to be completely unviable and I never returned to it.

I noticed that sometimes in the fight against “sandwiches” groups assess the situation too optimistically. For example, helicopters attracted light and rich booty in the form of BREM and tanks, which outweighed the negative impact of the BMPs implicated there. As a result, quite often my groups fearlessly rushed straight to the center of the "sandwich", significantly increasing the opponent's score. I suppose that such problems are usually solved by a balance of coefficients, but I made it easier: if a cell experienced only positive influences from enemy groups, then only the maximum of them was taken into account, and in the presence of negative influence, the minimum was always taken. As a result, my groups ceased to “burn”, but this did not help much, as they now circled around a dense “sandwich”, not finding weak points in it while he methodically bombed my troops. Unlike the “sandwich”, they did not have the opportunity to be treated, as my BREM did not participate in the battles, but usually stayed in the corner. The “sandwich” bombs didn’t do much harm, because, firstly, he could be cured, and secondly, my blow always hit the center of the group, so that in a couple of blows he burned a hole in it and continued to beat point without causing damage to the enemy. We had to urgently refine the impact to find the point with the maximum damage, again taking into account the allocation of units in the group of cells. But the fight against well-made "sandwiches" is not much help. I tried to keep the troops farther to protect from bombs, tried to bring closer, for more aggressive attacks, but still the players with dense formations such as azt-yur and Savidiy mostly won. In the end, having seen that even having already become a leader by that time, GreenTea sometimes loses games without buildings to dense formations, I decided to give up the fight against “sandwiches” and start playing games according to the rules of the second round.


Picture of calculated potentials for a group of tanks (top left). Negative fields (blue) overlap positive (red)

To capture buildings, I simply added to the calculation of the fields the attraction to the buildings.It immediately became clear that the charges tied to the points themselves do not justify, since it is much more profitable to get 100 points just by standing a little on the building than chasing the whole map with the enemy group, risking losing your troops. Therefore, the factory was assigned a charge of 250, and the control center - 150 points. Also, to make buildings even more attractive, the field attenuation was made softer - points * (Math.pow (0.9, dist)). The distribution of buildings between the groups was done in the most primitive way — if within a certain radius of the building there was a friendly group closer to the target, then this building was excluded from the goals of the current group. But already such a simple capture of buildings, even without construction, gave an excellent advantage on points in games with “sandwiches” that were slow and cumbersome and could not quickly capture buildings.

Gradually, I refused to add charges from different sources. Thanks to this, it was easy to solve such problems as collisions of groups, the repulsive fields of which were absorbed by the strong field of the factory, the attack of dangerous groups under an attempt to seize the factories; hangs in the center of nothing, where fields simply “successfully” formed, etc. Instead of the sum of charges, I chose the largest (or the smallest, if the cell experienced the effect of a negative charge).

For the first experiments on the construction of new equipment, I launched the construction of BMPs in groups of 50 in all factories. The results of the first tests impressed me so much that I immediately put this defect in the sandbox. Remember, I wrote about the attraction to her? So, thanks to him, the groups began to merge, and the more the group became, the more it attracted. As a result, I grew up some hellish crab from the BMP, destroying everything in its path.



But of course I didn’t need giant crabs, but I needed mobile groups, so I still cut the pull “for my own”. Construction, too, should have been more reasonable, taking into account the situation on the field.

The construction was implemented by the following logic:
At the beginning of each turn, a " list of orders was formed "- types of equipment that were required to build. For example:


Further, each of the factories receiving this list of orders " decided " whether it was ready to accept the order, checking the completeness of the current construction and the presence of hazards around. The construction manager also monitored groups under construction and marked them so that they did not try to leave the factory before the group reached a given size.

With these changes, the bot was already confidently holding up in the top 10, so it was even possible to take a break a couple of days before the round. In the second round, the bot showed a good result, without losing a single battle out of 59.

Preparation for the final and final


In the games of the final type, a new condition was added - the fog of war. What to do with it was not very clear, so I focused on correcting long-known problems. For example, I wanted to make evasion from a bomb, as well as a nuclear strike, more effective, independent of the limit of actions. For this, I did a calculation of the used actions and, if for 60 ticks before a possible strike the limit of actions was less than the required amount for the current action plus the action on the nuclear strike (evasion of the strike), the action was postponed.

After this refinement, the bot began to time out. My clustering was performed only when there were available moves, and now the moves were available much more often. I hardly managed to achieve acceleration with small optimizations, and less often I did not want to do clustering (to tell the truth, I wanted to do it more often). The solution was very simple. Since groups are formed and disbanded not so often, I left clustering once every 5 ticks, and replaced it at other time with a simple update of the transport list.

In the process of these improvements, I noticed that the actions of the groups are performed more often than the idea should have been. It turned out that saving action did not work, probably, since the addition of PP. After corrections, tanks began to receive actions every 250 ticks and more. At the same time to save the calculated field PP was increased from 3 to 4 cells.

Only at the finish line did I get to work with aviation. All this time, my helicopters died at the very beginning of the battle, which was very bad, since the saved helicopters could sometimes decide the outcome of the whole game. To save the helicopters, a strong charge was added to the field , located at a point behind the BMP group closest to the helicopters. Thanks to this, the helicopters successfully hid behind the infantry fighting vehicles from the enemy, "running" around them. In addition to this, didso that the priority target for helicopters would be tanks (and not BREM), and for fighters - helicopters (and not fighters). Since at the start of the games in the fog the aircraft had no strike targets, the fighters were sent to accompany the tanks, and the helicopters - the BMP.

gif - helicopters hiding from fighters behind a group of infantry fighting vehicles

— ; — ;

. , , - ; , . , . , .

— , , . - 12 7. , .

Last but not least, before the finals, I added the counting of enemy technology, information about which was now unavailable due to fog.

Taking into account the new units was quite simple - it was necessary only to monitor the type and progress of construction in the factory. But taking into account the dead units was much more difficult. The problem was that both for the dead and for the units that went into the fog came exactly the same information with durability = 0. In order to understand whether a unit could die, I checked whether there were damage groups capable of causing damage within a certain radius, and were there units in these groups located in the approximate impact radius. Also, if a nuclear strike was performed on the last tick, it was checked whether the potential dead person was in his radius. As a result, a fairly good record was released, for example, in one of the games on the 10,000 tick, the calculated and real numbers were:

ARRV: 20 FIGHTER: 32 HELICOPTER: 3 IFV: 346 TANK: 99
ARRV: 19 FIGHTER: 32 HELICOPTER: 1 IFV: 327 TANK: 99

Some of the discrepancy is probably due to the fact that some of the units went into the fog and died from the bomb already there. Thanks to the calculation, it was possible to use all the adaptive construction settings made earlier.

On the eve of the finals, the bot entered the top three in the sandbox. In this case, strongly lost GreenTea , Leos , mixei4 , unstable played with tyamgin , Adler and Milanin. Therefore, although it is obvious that the bot claimed the place at 10, I did not expect much more. However, after the first half of the final, the bot took 4th place, so there was some incentive to improve it during the break. I doubted for some time, because with equality of points, the one who sent the strategy earlier has an advantage, and the bot points on the ground 2-6 were very close. But, since obvious errors and nonoptimalities were found when watching games, it was decided to make changes.

In addition to fixing minor bugs, the following was done:


I also really wanted to make the capture of factories more optimal, since in some games I obviously lost the advantage. But I tried to do it already in deep night and I could not get a clear decision.

Perhaps thanks to the corrections made, and maybe the games were more successful for me, but in the second part of the final my bot, albeit with a small margin, came in second place.

Despite the large amount of time spent writing the bot, not all of our plans were implemented in the code. So, I would like to make the seizure of factories more reasonable, to be able to attack two groups at the same target at a time, guess the location of troops hidden in the fog, etc.

Thank you for reading to the end. As you can see, the history of my bot shows that even relatively simple approaches can be quite effective. See you at the Russian AI Cup in 2018! Come, it will be interesting!

Source: https://habr.com/ru/post/345858/


All Articles