
AI Challenge: Ants, or how to liven up the "ants"

In this post I will describe how to write a pretty good bot for the Google AI Challenge. Notably, no complex AI technology is needed, and the basic implementation fits into about a thousand lines of C++ code. The methods themselves can be treated as a generic framework, on top of which you can build a bot that accounts for strategic features and plays even better. In any case, it is a good "quick start" for those who have not yet succeeded.

1. Some experience from participating in the AI Challenge

Judging by the comments on other AI Challenge topics, many people have questions about the "local" rating system. Hence a few remarks.

1. A bot's rating reflects its success in games against other bots, i.e. the Skill figure is a relative value, and after each upload the value on the server is reset to zero (since it is not known whether the new version is better than the previous one).
2. Each successful game adds a certain amount to Skill, depending on your placing (remember, the goal of the game is not to collect food or grow a huge colony, but to capture enemy anthills) and on the skill of the other players; the larger the game number, the smaller this increment. Thus an accidental loss (early on) to a weak player will slow down the growth of Skill for a long time.
3. Approximate stabilization of the rating in the current conditions takes about 5-7 days (matches are held quite rarely).

It follows from the above that it is already too late to "iteratively" tune the algorithm against other players; it is therefore highly desirable to test the bot locally, and not to spend too much time on the Starter Package.
The time limit for one turn of the algorithm is 500 ms, adjustable via the turntime parameter. However, it is highly desirable to finish at least 50 ms under this threshold, otherwise there is a high probability of getting a timeout due to input/output buffering (all game data is transferred through pipes), and sometimes funny things happen:



The memory limit is 2 GB (the 32-bit platform limit); about 1.5 GB is guaranteed (or rather, "promised") not to be pushed out to the swap file. This is quite a lot for maps of 200×200 cells, and it directly suggests caching intermediate results, which is exactly what we will do to get a good algorithm for finding paths to objects.

2. Algorithm for finding the path

It is hard to argue about how good this or that strategy is while the ants cannot even walk through the maze on their own. In my experience, a solid pathfinding implementation beats a good strategy if you must pick one ("exclusive OR"); better, of course, is both.

Suppose we already know where to go (more on that a little later) and which ant should do it. It remains to compute the initial direction of motion (N, E, S, W) and issue the command. For this we need the shortest path to the target object (in theory we could cover the whole map with random moves, but in practice the enemy would destroy us long before that).

If the map contained no impassable "water" areas (dark blue cells), the task would be trivial; consider the red line connecting the ant and its target. Since an ant moves one cell at a time in one of four directions, the distance to the object (ignoring obstacles) coincides with the Manhattan distance. The path will always occupy the same number of cells as long as we step with increments in the right direction: subtract the ant's coordinates (x0, y0) from the target's (x1, y1) and you get |x1-x0| steps with increment sign(x1-x0) along X, and |y1-y0| steps with increment sign(y1-y0) along Y.
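The subtraction above can be sketched in a few lines; the names here are illustrative, not from the actual bot:

```cpp
#include <cstdlib>

// Sign of an integer: -1, 0, or +1.
static int sgn(int v) { return (v > 0) - (v < 0); }

// Per-axis increments and step counts for an obstacle-free path
// from the ant (x0, y0) to the target (x1, y1).
struct SimpleStep {
    int dx, dy;          // increments: sign(x1-x0), sign(y1-y0)
    int stepsX, stepsY;  // |x1-x0| and |y1-y0| moves on each axis
};

SimpleStep simpleSteps(int x0, int y0, int x1, int y1) {
    SimpleStep s;
    s.dx = sgn(x1 - x0);
    s.dy = sgn(y1 - y0);
    s.stepsX = std::abs(x1 - x0);
    s.stepsY = std::abs(y1 - y0);
    return s;
}
```

The total stepsX + stepsY is exactly the Manhattan distance, regardless of how the X and Y steps are interleaved.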

This is a clear illustration that the path length is in any case the same (the number of cells is identical for all three lines). So we note this heuristic as useful and, first of all, do not disdain to check whether the target can be reached simply by moving with the correct increments (call this function getSimplePath(), for example). Why? Because it sometimes saves quite substantial time. Of course, it will not cope when water cells lie on the path, much less when the initial direction does not match the final one because of obstacles (the case of the greenish line in the first picture).
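A minimal sketch of the getSimplePath() idea: walk from the ant to the target using only the correct increments (here, greedily along X first, then Y; the axis order and the `water[y][x]` layout are assumptions of this sketch) and give up at the first water cell:

```cpp
#include <vector>

// Returns true if the target is reachable by stepping only with the correct
// increments, without crossing water; false means we must fall back to the
// full pathfinding below.
bool getSimplePath(const std::vector<std::vector<bool>>& water,
                   int x0, int y0, int x1, int y1) {
    int x = x0, y = y0;
    while (x != x1 || y != y1) {
        if (x != x1) x += (x1 > x) ? 1 : -1;  // step along X first
        else         y += (y1 > y) ? 1 : -1;  // then along Y
        if (water[y][x]) return false;        // obstacle on the simple route
    }
    return true;
}
```

Note this checks only one of the many equivalent monotone routes; a stricter variant could try the Y-first order as well before giving up.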

What about obstacles? Here the wave algorithm helps: take a matrix the size of the field, fix the target cell (writing 0 there) and fill the matrix outwards in accordance with the image on the right (visiting only cells that are not known to contain "water"). The values are the number of moves needed to reach the target cell. As soon as the wave reaches the cell our ant stands on, we can trace a path to the goal by moving through cells whose step values decrease. The algorithm runs in O(cols * rows) time.
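The wave fill is a breadth-first search; a self-contained sketch (the real map wraps around at the edges, which this bounds-checked version ignores for simplicity):

```cpp
#include <vector>
#include <queue>
#include <utility>

// Fills a distance matrix for target cell (tx, ty): dist[y][x] is the number
// of moves needed to reach the target, or -1 for water/unreachable cells.
// Runs in O(cols * rows).
std::vector<std::vector<int>>
waveFill(const std::vector<std::vector<bool>>& water, int tx, int ty) {
    int rows = (int)water.size(), cols = (int)water[0].size();
    std::vector<std::vector<int>> dist(rows, std::vector<int>(cols, -1));
    std::queue<std::pair<int,int>> q;
    dist[ty][tx] = 0;
    q.push({tx, ty});
    const int dx[4] = {0, 1, 0, -1}, dy[4] = {-1, 0, 1, 0};  // N, E, S, W
    while (!q.empty()) {
        int x = q.front().first, y = q.front().second;
        q.pop();
        for (int d = 0; d < 4; ++d) {
            int nx = x + dx[d], ny = y + dy[d];
            if (nx < 0 || ny < 0 || nx >= cols || ny >= rows) continue;
            if (water[ny][nx] || dist[ny][nx] != -1) continue;
            dist[ny][nx] = dist[y][x] + 1;
            q.push({nx, ny});
        }
    }
    return dist;
}
```

To actually move, the ant simply steps to any neighbour whose dist value is one less than its own.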

Is that a lot or a little? In our case |cols|, |rows| <= 200, routes are needed for up to 1000 ants (real games never have more), and each ant can have up to a hundred potential targets. In total 200^2 * 1000 * 100, i.e. 4 billion iterations, mostly memory writes. Common sense says this will not fit into 500 ms.

Why choose such an algorithm at all, then? First, note that in theory, without any intermediate information, no path can be paved in fewer operations; one can always construct an example costing O(cols * rows) steps (a "spiral" over the whole field). Our task is therefore to build a method that works well by exploiting intermediate information (the state of the field changes little from one move to the next). Besides, there is no point worrying that it is slow on the first few moves, since there are few ants then and everything still fits in the time limit.

What will we memorize? The whole step matrix of the wave algorithm, one per target cell:
char* WaveMaps[MAX_X][MAX_Y];             // one step matrix per target cell
WaveMaps[x][y] = new char[MAX_X * MAX_Y]; // allocated lazily, on first use

Recall that WaveMaps[x][y] is computed for the target object at coordinates (x, y); in effect it holds all possible routes to that object. Clearly, water cells will never be targets (there is no point, and you cannot step there anyway), and most empty cells (no food, ants, or anthills) are of no interest either, so they are simply not initialized.

Thus, within one move we can compute the paths from every ant to a given point and determine which ant gets there faster (the first query computes the map, subsequent ones hit the cache).

What happens next? During the game the ants wander around the field, discovering new cells: initially we know only a small part of the maze, but as we move we find food, water, enemy ants, and anthills. What interests us here is what to do with the cached maps when a new "water" cell is discovered.

First, how do we detect water? It is enough to keep a local copy of the field state and, on every new move, check whether the received game state differs from the saved one. Say we found water at position (x1, y1). Then we need to recompute every map in which position (x1, y1) is filled with a step number (rather than the default value).
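The invalidation step might look like this; the layout follows the WaveMaps snippet above, and NOT_FILLED is an assumed sentinel for "wave never reached this cell":

```cpp
#include <cstring>

const int MAX_X = 200, MAX_Y = 200;
const char NOT_FILLED = -1;            // assumed default value of a map cell
char* WaveMaps[MAX_X][MAX_Y] = {};     // null until first requested

// Drop every cached wave map whose matrix assigned a step number to the
// newly discovered water cell (x1, y1); it will be rebuilt lazily on the
// next path query.
void onWaterDiscovered(int x1, int y1) {
    for (int x = 0; x < MAX_X; ++x)
        for (int y = 0; y < MAX_Y; ++y) {
            char* m = WaveMaps[x][y];
            if (m && m[y1 * MAX_X + x1] != NOT_FILLED) {
                delete[] m;
                WaveMaps[x][y] = nullptr;
            }
        }
}
```

Dropping rather than patching the map is the simple choice: a new water cell can lengthen arbitrarily many routes, so partial repair is rarely worth it.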

Since all the maps ship with the tools package, we can estimate what percentage of matrix re-initializations (recomputations) to expect. The result: 100%. Not great at all, but only because we built maps covering the whole playing field, and there is no need for that: all routes can be divided into "long" and "short" ones. Short routes require exact cell-by-cell determination, while long ones are composed of short blocks.

Thus a reasonable idea is to keep a hierarchical reachability map. In my implementation two levels were enough (100 ms for planning all routes for all ants in the worst case).

The algorithm above is then modified as follows:
1. While filling each wave-algorithm map, we check that the distance from the starting point to the current one does not exceed MAP_RANGE (once it does, we stop);
2. We build a complete wave map for a low-resolution field (MAX_X / SCALE, MAX_Y / SCALE). This map helps find the way between points farther apart than MAP_RANGE.


When a path is requested between sufficiently distant points (x1, y1) and (x0, y0), we first consult the low-resolution map, obtain an intermediate point (x2, y2), and combine the paths (x1, y1) -> (x2, y2) and (x2, y2) -> (x0, y0). Recursion, yes.
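A sketch of the waypoint choice; SCALE and MAP_RANGE values are assumptions, and the coarse step here uses plain sign increments as a stand-in for the real low-resolution wave map:

```cpp
#include <cstdlib>

const int SCALE = 5;       // low-resolution factor (assumed)
const int MAP_RANGE = 40;  // detailed-map radius (assumed)

static int sgn(int v) { return (v > 0) - (v < 0); }

struct Point { int x, y; };

// Intermediate waypoint for a query from (x0, y0) to (x1, y1): near targets
// are answered by the detailed wave map directly; for far ones we step one
// cell on the coarse grid toward the target and return that cell's centre,
// then recurse on the two halves.
Point waypoint(int x0, int y0, int x1, int y1) {
    if (std::abs(x1 - x0) + std::abs(y1 - y0) <= MAP_RANGE)
        return {x1, y1};                 // short path: no waypoint needed
    int cx = x0 / SCALE + sgn(x1 / SCALE - x0 / SCALE);
    int cy = y0 / SCALE + sgn(y1 / SCALE - y0 / SCALE);
    return {cx * SCALE + SCALE / 2, cy * SCALE + SCALE / 2};
}
```

In the real bot the coarse step would be read off the low-resolution wave map, so obstacles larger than one coarse cell are routed around as well.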

The MAP_RANGE parameter determines, on the one hand, how often the wave-algorithm maps are fully recomputed and, on the other, the final accuracy of very long paths. With MAP_RANGE > 2 * viewRadius there are practically no errors, and the time (routes for all ants) hardly ever exceeds 100 ms.

Let me remind you that this trick with MAP_RANGE and the low-resolution map exists to make result caching effective, updating the cache as rarely as possible (once every cell within the MAP_RANGE radius has been explored, updates stop altogether).

As for memory: in the worst case we fill 200^2 * (200/2)^2 bytes for the full maps plus (200/SCALE)^4 bytes for the low-resolution one, which stays under the 500 MB threshold for any reasonable SCALE (greater than 5).

3. Goals

A good way to win is to make no meaningless moves and to miss no meaningful opportunities. Let's figure out what that means.

The targets may be:
1. Enemy anthills
2. Food (only if you still have your own anthill)
3. Our own anthills (to defend them)
4. Our own ants (to protect them or reinforce an attack) - support
5. Enemy ants
6. Unexplored cells

There are no other objects in the game, and hence no other targets.

I listed the goals in priority order (call it programList), from most to least important (but still necessary). Capturing anthills is the point of the game, so all else being equal it is better to try to capture an anthill (with some reservations that follow) than to chase food. We would not like to lose our own anthills, or our own ants. If you can kill enemy ants outright (without losses on your side), do so. And if there is absolutely nothing to do, you can wander the field.

To find the target objects, we walk the whole known field, checking each cell against the criteria of each goal:

foreach (int program in ProgramList)
    foreach (Location loc in map)
        if (MatchingProgram(loc, program))
            ProgramList[program].push_back(loc);

where MatchingProgram determines whether a given cell suits a program:
1. if program == enemy anthills, the cell must contain an enemy anthill;
2. food: the cell contains food;
3. own anthills: the cell contains our anthill and there is an indirect threat to it (an enemy ant is closer than any of ours);
4. own ants: the cell contains our ant and there is an indirect threat to it (enemy ants within viewRadius);
5. enemy ants: the cell contains an enemy ant, and its enemies number at least as many as its "allies";
6. unexplored cells: cells where we have not yet been (more on this below).

Next, each goal must be assigned an ant that will carry it out. For every goal we scan the list of ants and choose the nearest one whose previous target (set on the last move) is either worse than the current one (i.e. comes later in the list), or no worse but farther away (with a margin of one step). On the one hand this eliminates ants jerking back and forth between equivalent goals (a frequent bug among participants); on the other, it lets us use updated information (went for food, ended up at an enemy anthill).
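The assignment rule can be sketched as follows; all names are illustrative, and Manhattan distance stands in for the real lookup in the cached wave maps:

```cpp
#include <vector>
#include <climits>
#include <cstdlib>

struct Ant {
    int x = 0, y = 0;
    int program = INT_MAX;     // priority of current goal; lower = more important
    int targetDist = INT_MAX;  // distance to that goal
};

int distanceTo(const Ant& a, int tx, int ty) {
    return std::abs(tx - a.x) + std::abs(ty - a.y);  // stand-in for wave maps
}

// Pick the ant for target (tx, ty) of priority `program`: the nearest ant
// whose previous goal is less important, or equally important but more than
// one step farther away (the margin damps jerking between equivalent goals).
Ant* pickAnt(std::vector<Ant>& ants, int tx, int ty, int program) {
    Ant* best = nullptr;
    int bestDist = INT_MAX;
    for (Ant& a : ants) {
        int d = distanceTo(a, tx, ty);
        bool lessImportant = a.program > program;
        bool sameButFarther = (a.program == program) && (a.targetDist > d + 1);
        if ((lessImportant || sameButFarther) && d < bestDist) {
            best = &a;
            bestDist = d;
        }
    }
    return best;
}
```

An ant already busy with a more important goal is never stolen, and an ant with the same priority is only retargeted when the new goal is clearly (by more than one step) closer.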

4. Blocking

As you know, two or more ants that land on the same cell die. We want to prevent this senseless loss of personnel.

The trivial approach is to check that the cell is not already assigned to another ant; that is how most of the simplest bots work. But this does not rule out blocking, where ants prevent each other from reaching their goals; as often happens, the area around the base gets senselessly crowded and some ants may make no progress at all.

Therefore we must distinguish ants that have already moved this turn from those that have not. When an ant tries to step onto an occupied cell, two options arise:
1. Choose a different direction for the current ant;
2. Move the one in the way (the blocker) to another cell.

programAnt(MyAnt* ant):
    if (ant->alreadyMoved) return;
repeat:
    direction = getDirectionByProgramList(program);
    if (moveResult(ant, direction) == mrBlockerNotMoved)
    {
        programAnt(blocker);   // let the one in the way move first
        program++;             // then fall back to the next-best program
        goto repeat;
    }

It is important that the "interfering" ants move not just anywhere, but according to the best program available to them. Thus programming one ant may end up shifting several neighbouring ants along a chain; all that is needed is to avoid mutual recursion with the same parameters.
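A minimal, self-contained sketch of the blocker-pushing idea: each ant has a desired next cell; if that cell is occupied by an ant that has not moved yet, we first try to move the blocker, and a `visited` set breaks mutual recursion. In the real bot the desired cell comes from the program list, and a failed move falls through to the next-best program:

```cpp
#include <vector>
#include <set>
#include <map>
#include <utility>

struct SimAnt {
    int x, y;        // current cell
    int wx, wy;      // desired next cell
    bool moved = false;
};

bool tryMove(std::vector<SimAnt>& ants, int i,
             std::map<std::pair<int,int>, int>& occupied,
             std::set<int>& visited) {
    SimAnt& a = ants[i];
    if (a.moved || !visited.insert(i).second) return false;
    auto want = std::make_pair(a.wx, a.wy);
    auto it = occupied.find(want);
    if (it != occupied.end() && !tryMove(ants, it->second, occupied, visited))
        return false;                     // blocker could not clear the cell
    auto self = std::make_pair(a.x, a.y);
    if (occupied.count(self) && occupied[self] == i) occupied.erase(self);
    a.x = a.wx; a.y = a.wy;
    occupied[want] = i;
    a.moved = true;
    return true;
}
```

A single call can ripple through a whole column of ants: each one clears its cell only after persuading the ant in front of it to move.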

Another cause of useless ant deaths is wandering into an area where the opponents outnumber us. Detecting such situations takes real work: you will have to write a battleResolution.

This is not a problem in itself; however, if you use the C++ starter package you will have to work with state.grid rather than enemyAnts, because ants in that list are indistinguishable by type (unless this has been fixed already, of course). Anyway...

5. Why the starter package is not so good.

1. Ants received in myAnts do not persist their parameters. I already mentioned that it is a good idea to keep each ant's latest program and path, but with State.h as shipped this cannot be done: the myAnts vector is recreated on every turn. The right way to store per-ant parameters is to create a map of the form MyAnt* antMap[MAX_X][MAX_Y] and update it manually (which is not hard to do in makeMove for a specific ant).
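A sketch of that persistent map; MyAnt objects live across turns, and the grid of pointers is refreshed manually as ants move (the struct fields beyond coordinates are illustrative):

```cpp
const int MAX_X = 200, MAX_Y = 200;

struct MyAnt {
    int x, y;
    int program;     // last assigned goal, kept between turns
};

MyAnt* antMap[MAX_X][MAX_Y] = {};    // null where there is no ant

// Call when issuing a move order: keeps the grid consistent with the ant.
void moveAnt(MyAnt* ant, int nx, int ny) {
    antMap[ant->x][ant->y] = nullptr;  // leave the old cell
    ant->x = nx;
    ant->y = ny;
    antMap[nx][ny] = ant;              // occupy the new one
}
```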

2. The updateVisibility function is unacceptably slow. For a thousand ants it may take even longer than laying paths, because for every ant a great many cells are checked against the distance inequality. It is much faster (which is what I did) to store, for each cell, a list of the cells directly visible (within viewRadius) and attackable (within attackRadius), initialized when the cell is first explored. updateVisibility can then mark the cells visible to each ant using the ready-made list instead of brute-force enumeration.
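Building that list of offsets is a one-time cost; a sketch (the game hands out squared radii, here called viewRadius2 by assumption):

```cpp
#include <vector>
#include <utility>

// All (dx, dy) offsets within the squared radius. Computed once; marking
// visibility per ant then iterates this short list instead of the whole map.
std::vector<std::pair<int,int>> visionOffsets(int viewRadius2) {
    std::vector<std::pair<int,int>> offs;
    int r = 0;
    while (r * r <= viewRadius2) ++r;   // half-size of the bounding box
    for (int dx = -r; dx <= r; ++dx)
        for (int dy = -r; dy <= r; ++dy)
            if (dx * dx + dy * dy <= viewRadius2)
                offs.push_back({dx, dy});
    return offs;
}
```

Per turn, updateVisibility becomes a loop over ants times the offset count, rather than ants times the whole field.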

3. std::vector<std::vector<Square>> grid. Here we lose speed and, in some cases, even use more memory than a plain Square grid[MAX_X][MAX_Y]. The speed loss is no joke, since the structure caches very poorly, being scattered across different memory addresses.
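A cache-friendly alternative is one contiguous allocation indexed arithmetically; a sketch (the Square fields are illustrative):

```cpp
#include <vector>

struct Square {
    bool isWater = false;
    int lastSeen = -1;
};

// One contiguous block instead of a vector of vectors: neighbouring rows sit
// next to each other in memory, so sweeps over the field stay in cache.
struct Grid {
    int rows, cols;
    std::vector<Square> cells;
    Grid(int r, int c) : rows(r), cols(c), cells(r * c) {}
    Square& at(int row, int col) { return cells[row * cols + col]; }
};
```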

6. Research and control of territories

Now that we know how to reach a specific object, how to implicitly avoid losing ants, and how to implicitly organize attacks on enemy ants (on top of the obvious collecting of food and seizing of anthills), it remains to learn how to master territory.

Exploration means discovering static objects (anthills, water) in places we have not been.

Control means checking whether something useful (food) or harmful (enemy ants) has appeared.

Both boil down to the same thing: cells we have not visited for a long time. And this notion is easily formalized.

After goals (programs one through five) are assigned, a number of ants still do not know where to go; they should be sent for a walk, namely toward cells that nobody currently sees (visibility == 0). Among those "invisible" cells, priority goes to the ones never seen at all (lastSeen == NEVER): exploration. Of the rest, prefer those that have gone unseen the longest: control. Among cells equal in lastSeen, choose the nearest.
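A sketch of that selection rule. Here lastSeen is assumed to be the turn number when the cell was last visible, NEVER marks cells never seen, and nearest-cell tie-breaking via the wave maps is left to the caller:

```cpp
#include <vector>

const int NEVER = -1;

struct Cell {
    int visibility;   // how many of our ants currently see this cell
    int lastSeen;     // turn when it was last visible, or NEVER
};

// Index of the best walk target: prefer never-seen cells (exploration), then
// the cell unseen for the longest time (control); -1 if everything is visible.
int pickExploreTarget(const std::vector<Cell>& cells) {
    int best = -1;
    for (int i = 0; i < (int)cells.size(); ++i) {
        const Cell& c = cells[i];
        if (c.visibility != 0) continue;          // somebody already sees it
        if (best == -1) { best = i; continue; }
        const Cell& b = cells[best];
        bool cNever = (c.lastSeen == NEVER), bNever = (b.lastSeen == NEVER);
        if (cNever && !bNever) best = i;                     // exploration first
        else if (cNever == bNever && !cNever && c.lastSeen < b.lastSeen)
            best = i;                                        // oldest wins
    }
    return best;
}
```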

Such an algorithm quickly sweeps the whole field, while never letting the territory where food appears stay out of sight for long.

7. Reason to ponder

Now our ants are quite viable and know what to do; however, the algorithm above does not guarantee victory.

Firstly, because no always-winning algorithm can be constructed for this game. A simple example: in the first moves all opponents attack you from different sides, arranged so as not to attack each other. No arrangement of even a large number of ants around the base will protect it.

Secondly, the order of goals I defined is correct (or rather, makes sense), but it does not account for the opponents' strategy. In some cases you can quickly reach an enemy anthill and capture it (because it is unguarded); in others that is a sure road to suicide. The priority of the goals should depend on the enemy's level: against a weak opponent it is even useful to "exchange" ants, while against strong ones it is sometimes better to run and wait until the other players have almost completely "exchanged" theirs. On the first moves we know nothing about the opponents at all, and surviving to mid-game against equal players is luck (or the absence of bad luck). Thus the very priority of goals implicitly encodes a strategy (for example, I said nothing about group attacks, yet they happen thanks to goal 4, support), and there is plenty of room to work on it.

I hope the above helps someone solve a few problems, or at least shows that the subject of the competition is not all that complicated. Good luck!

Source: https://habr.com/ru/post/132776/

