Creating a bot for participation in the AI mini cup 2018 based on a recurrent neural network (part 3)

Final part.

In the previous chapters ( part 1 , part 2 , part about GPU ) we touched on the conditions of the competition, the neural network, the genetic algorithm, so we continue.

But before continuing, a link to GitHub with the source code of the program in c ++ and to support the title of the article - an executable file under Windows OS in the bin folder, which is quite similar to screeen saver. In the basement of the article organized the "Hall of Fame" of past championships.

So, we stopped on the program architecture, which consists of two separate parts (programs), the part containing only the neural network and the communication protocol with the game server of the competition organizers (the gaming bot itself) and the main part consisting of the following blocks:

Briefly about each of the parts.

Physical engine. The basis is the physics calculation module from official sources, converted under the GPU, and calculations of the bot sensors, fitness functions, bot collisions are added. In the source, I removed the viruses and bot attempts to share, the division is an unstable part of the program. Therefore, I did not share the mistakes.

Neural network. We talked about it in some detail last time, including about the implementation in the code, so I will assume that much is also clear here, especially since there were no special questions from readers.

Genetic algorithm. There are gaps in the coverage of its implementation. Now most likely I will repeat, but it is easier to repeat once again.

From the slide presentation a lot came to mind. Therefore, the main question remains: how to move evolution? For this, the Fitness function was invented. The main task of the fitness function is the selection of genotypes for transmission from the current population of bots to the next.

How the choice of fitness function happens is most likely clear. Crossing issue remained.
There are several basic methods of crossing, more details on the link . But the basic principle is the random exchange of genes between the genotypes of the parents. There are usually two parents who are chosen; there are also several methods for choosing parents from the population; here you can read more about the link . Although the choice of parents is made randomly, but the probability of choosing a specific parent increases in proportion to its fitness function.

Next, the mutation function is applied to the finished genotype.
In our case, from time to time, the algorithm changes an arbitrary selected gene to a random gene, in other words, one of the weights of the neural network changes to a random one.

We received a new population and further evolution continues to the desired result. There are several points, the first: the more bots in the population, the faster the genetic algorithm will find the optimal solution or converge to the solution (the optimal ratio of weights in the neural network). For example, if a bot has 1000 genes, the search space expands significantly with a population size of 3,000 bots, compared to a population of 300 bots. But another problem arises, if you release all 3000 bots to the arena designed for 4-8 bots, then most likely the bots on it will be physically cramped and if they learn something, then definitely not playing Agario. Therefore, there are two main ways to avoid arena overpopulation: first, select several bots from the population alternately and hold their competition in the arena, and so many times until we accumulate game statistics for each bot. The second one, which the author went to, is launching in parallel several arenas, say 50-300, everything depends on the power of a specific computer and in parallel the entire bots population participate in contests. The population can be divided into subpopulations with their fitness functions and indifiers (corresponds to the number of arenas), and then exchange the genotypes between the subpopulations. Or imagine all as one large population of bots playing in different arenas. The author tried both options, but in the source version of the version with a single population of bots. The parameter in the Depth program is the arena number.

That told the basic principles of building a program. Who wants to see live welcome link on GitHub .

if you fail to start, then a short video will brighten up this moment:

Announcement: In August, mail.ru organizes another AI mini cup, the official announcement has not yet seen the truth. But according to the information from the group's telegrams, the physics engine is again the basis of the competition, something on the subject of machines. Therefore, briefly about the principles of creating a bot, those who will be interested of course:

Here, like our national football team, it is desirable to leave the group, the final is a separate song. Let the finalists write about victories, but we have the main participation.
The most understandable and simple:

Write in the body of the bot code more different conditions, for example if (fossa) -> jump, etc. Simple conditions are effective at the beginning of the championship, then the complexity of bots will grow and the advantage of conditional solutions will go away.

And so is the most promising method in many strategic games:

The method of PP ( or potential fields ). The idea is beautiful, searching for a high or low in a dynamic environment. The grid of the selected dimension is built on the whole playing field or only on the area around the bot, it all depends on the bot's scope. Further, at each nodal point of this grid, we consider the distances to the objects of interest to us or, alternatively, the potentials immediately, they can be positive (attraction, this is what interests the bot) and negative (danger to the bot). Accordingly, the potentials at the point are summed and we obtain the total potentials at each grid point. Potential field. The bot can only choose the lowest or greatest potential, depending on the task at the moment. Basically, the potential fields are 2D implementations, although 3D will look cool, but there will be a lot of resources for calculations.

The most difficult:

The two main implementations are the Brute Force method ( Brute force) or the Monte Carlo method . Each of them is a topic for a separate article, but according to the feeling, it is these methods that will lead you to the final. If tezisno, the bot can not only look at the current time or the past, but if you want, you can look into the future, here there is a funnel (cone) of possible outcomes and the further the bot wants to see the more options appear. For example, at the time point of Tik Zol, the bot decides to go in one of eight directions, in Tick + 1 step in each of the eight new coordinates it has the opportunity to go again in eight directions, etc. must take into account the possible movement of the enemy during this time. Every possible outcome of the calculation is charged with possible benefit or harm. Based on which the bot makes a move in one of the directions. And then such a calculation into the depth of the future time goes every tick. The depth of the views of the moves is determined by the possible resources of the computer. Therefore, when computer resources are low, the problem arises of optimizing these calculations.

About the source, if there is interest in editing the comments to the code, while laid out as it was.
In the source code, the input of the neuron is given by the signals of the current Tick and the previous Tick, it has become more interesting, thanks to the reader: tongohiti for the idea.

Those who remember the thesis on the initial distribution of weights in the neural network, an interesting topic is the Initialization of Xavire.

Thanks for attention. Meet me at the AI competitions.

Articles on the topic of the participants, but first lyrical digression :

She sat on the floor
And a pile of letters sorted out,
And, as the cooled ash,
I took them in my hands and threw them.

She took the familiar sheets
And looking at them wonderfully,
How souls are viewed from on high
On them the thrown body ...

Oh, how much life was here,
Irreversibly experienced!
Oh, how many sad minutes,
Love and joy killed! ..

My strategy for the Russian AI Cup 2017
History of participation (and almost victory) in the annual competition Russian AI Cup 2016
The history of victory at the annual competition Russian AI Cup 2015
Russian AI Cup 2014: winning strategy
Gold Medal at the Russian AI Cup 2013 - how it was
The path to the silver medal at the Russian AI Cup 2012

Source: https://habr.com/ru/post/418549/

All Articles

Creating a bot for participation in the AI ​​mini cup 2018 based on a recurrent neural network (part 3)

More articles:

Creating a bot for participation in the AI mini cup 2018 based on a recurrent neural network (part 3)