📜 ⬆️ ⬇️

Genetic algorithm on the example of the Robocode bot



When this article was written, a habra search by the phrase “Genetic Algorithm” produced a noble void. However, the insufficient level * cut out by censorship * postponed the date of publication, and only now after this shameful tedious begging on my part, this article got the opportunity to show itself to the world. During this period of time, at least three (so many I came across) articles on a similar topic managed to be published, and, quite likely, you will read something from the text below not for the first time. To such people, I suggest not frowning the noses from another attempt of an inexperienced youth to explain the GA scientifically popularly, but to go to the next exhibit to the second part, which describes the creation of a bot based on the GA for the Robocode programmer game. This, according to the latest intelligence, has not yet met on Habré.

Part one. The life and work of the genetic algorithm.


Let's start from afar. There is a set of tasks that need to be addressed. Our goal is to find actions that can transform a given (initial task conditions) into an answer (target state).
')
If the situation is simple, and the solution of such a problem can be clearly calculated from the conditions with the help of these your matans, then it's nice, here and without our wisdom everything is fine, we are fucked, we all diverge . For example, when solving a quadratic equation, the answer (the values ​​of x1, x2) is obtained from the initial condition (coefficients a, b, c) by applying the formula we all learned in school. And what to do in the sad case when there is no necessary formula in the textbook? You can try using brainstorming to solve one of the problems. Analytically. Numerical methods. The power of desperate busting functions. After a while, a dreamy student "will be heard" if only it is solved itself. Yeah, that’s where we get out from behind the curtains. So, the goal is to write a program that would find a function (program) that receives input data at the input and returns good digits. The power of metaprogramming to fight!



Hmm, how are we going to achieve such a goal? Bring a sacrifice to the gods of recursion around the campfire: we will write a program that will write a program that would find the function (program) ... No, this is not the second time. Better we take the example from nature, casting our eyes on such phenomena as the mechanism of evolution, natural selection. Everything is like in life: our programs will live, mate, give birth and die under the yoke of more adapted individuals, transferring their best qualities to their descendants. It sounds crazy, but it's worth a closer look.

The god of our world of programs is our task. Programs must believe in her, mate for her, put in her honor candles in the church and live with the sole purpose of finding the meaning of life the solution to this problem. Most adapted to the environment (approaching the solution of the problem) becomes an alpha male, survives and gives a strong offspring. A loser who has spent his whole life playing online games has not known success in solving a problem, has very little chance to give offspring. The gene pool will be cleared of the contribution of these pimply comrades, and the entire program society will move towards a bright future for the problem solved. Well, in general, it is already clear, now we need to deal with the nuances: first, how do you imagine the pairing of programs for yourself? secondly, from where we will take the first generation of programs? thirdly, on what grounds will we determine the fitness of individuals and how will it affect the crossing? Fourthly, it is necessary to determine the conditions of the end of the algorithm, when to stop all this orgy.

The art of mating programs


I think many of us sometimes have a burning desire to apply violent sexual acts to programs. Here we have to warn in advance that we have such interspecific deviations are not encouraged. We have everything as the Catholic Church bequeathed: the program with the program, only after marriage ... and partners do not change, even if that languid guy bought you a cocktail at the bar. Although no, I'm lying, harem type polygamy is flourishing. Yes, and yet, despite the use below of such words as “father” or “son,” our programs are hermaphrodite. Well, incest too ... Ugh, and I also spoke about the church * facepalm *. Okay, more on that later.

The question of crossing programs is not so simple. Randomly exchanging functions, strings or variables will lead to a bold stream of scary words addressed to you from the compiler / interpreter, and not a new program. That is, you need to find a way to cross the program correctly . Smart uncles found a way out. And the smart boys and girls who studied the structures of compilers, too, have already guessed. Yes, yes, this is a syntax tree .

Immediately I will die away: our beard is not very thick, so we will use the simplest types of programs. Those interested can go to the valley of the infinite wealth of programming, and here we are all simple - the program consists of expressions, in turn, consisting of simple functions with some arity, variables and constants. Each expression counts one of the values ​​returned by the program.

For example: some individual program square from two expressions, trying (not very successfully) to solve a quadratic equation:
function square(a, b, c){ x1 = min(sin(b)*(a+1), 0); x2 = 3 + exp(log(b*a)); return {x1, x2}; } 

With the presentation decided, now we need to deal with storage. Since there are still many dances around these programs, including transferring them from one part of the system to another (which, generally speaking, in my case were generally written in different languages), the storage of our individual in the form of a tree is not very comfortable. For presentation in a more convenient way (ideally, a set of lines over some final alphabet), our individual-program-set-of-trees will have to learn to encode / decode.

It seems like a tree, but it seems not

So, it is necessary to present the tree as a string. Here we will rescue the power of karva-trees. First of all, you should decide on a set of functions, variables and constants that can be caught in the tree. Variables and constants correspond to the leaves of the tree and will be called terminals, functions - to the rest (internal) nodes of the tree, referred to as non-terminals. It is also worth paying attention to the fact that functions may have a different number of arguments, therefore such knowledge (“arity”, - the word quietly ran through the lips of experts) we will even need. The result is an encoding table, for example, like this:
No terminalsTerminals
No0one23fourfive6
badgen+*if2ab
arityone22four000
value- {a1}{a1} + {a2}{a1} * {a2}if {a1}> {a2}? {a3}: {a4}2ab


Here n, +, *, if are functions; 2 - constant; a and b are variables. In real problems, the table is half-weighted, with such a set and a quadratic equation cannot be solved. You should also bear in mind the fact that in order to avoid division by zero and other apocalypse scenarios, all functions must be defined on the whole set of real numbers (well, or what kind of set you use there in the task). And then you have to sit on guard, catch logarithms from zero and then figure out what to do with it. We are not proud people, we will go the easy way, eliminating such options.

So, using such a table, it is not a problem to drive functions from a tree to a row and back. For example, this line came to us for decoding:
2 1 2 3 0 4 5 2 1 5 0 6 6 6 4 4 4 1 3 2

According to the table, we identify each element, remembering also about arity:


Now, using arity, we place links to the function arguments:

I ask you to pay attention to the fact that the last 3 elements of the list were not needed by anyone, and their values ​​do not affect the result of the function. This was due to the fact that the number of involved list items, the number of tree nodes is constantly floating, depending on their arity. So it is better to collect in reserve than to suffer with an incorrect tree.

Now, if we pull it up by the first element, then an expression tree will hang in our hand:


The value of the function can be calculated recursively through the tree, it turns out to be this:


I have dad's eyes like that

We return to the hottest - to the crossing. The operation of crossing programs, we set the following conditions: first, two crossing individuals give two descendants (that is, the population size is constant); secondly, as a result of crossing, the descendants should, to a certain extent, have the characteristics of both parents (that is, an apple should not roll very far from an apple tree). We now know how the program will be presented - this is a set of rows or trees. Accordingly, they can be crossed as strings or as trees.

Crossing trees is an exchange of randomly selected branches. Crossing lines can be implemented in several ways: single-point recombination (piecewise gluing), double-wall recombination, element-by-element exchange, and others. They can be described with long complex subordinate sentences with part-time turnovers, but it’s enough to look at:



It is worth noting that the gluing sites in recombination are chosen randomly, as well as in the element-by-element crossing, the exchange takes place with a certain probability. Crossing trees in terms of heredity looks more promising, but it is more difficult.

Hey, this girl is with me!


We dealt with the most intimate part of the process (many have already felt through this article how poor the personal life of the author is). Now from the relationship between a pair of individuals move on to the social foundations.

Individuals are divided into generations. The new generation consists of children of individuals of the previous generation. It turns out that there is a current generation of sons and daughters, a generation of fathers and mothers, grandmothers and grandfathers, great-grandmothers, and so on down to the zero generation — the progenitors of all proud people. Every individual of a new generation after birth tries to solve a problem, some divine fitness function evaluates its actions, and depending on its assessments of the activities of a youngster, an individual gets some chances to reproduce, that is, to fall into the class of the best representatives of the generation chosen to continue the race. Our world is harsh and cruel, and according to all the canons of dystopia (or according to the ideas of the Fuhrer, as you wish), unfit parents retired to anything after the fulfillment of their offspring birth mission go on a journey on gazenvagen, freeing living space for a couple of their children. Children follow in the footsteps of their parents, and so on from generation to generation.

The very function of fitness (or fitness function), which gives quotas for mating, should adequately assess the ability of an individual to solve a problem, and give a numerical expression of this fitness (the greater the value - the better the fitness). For example, in the case of the quadratic equation itself, this can be a measure of how close the value of the left side of the equation is to zero with the substituted x1, x2 values ​​calculated by an individual program.

The fitness function gives each individual a certain number of generations showing its usefulness and fitness. This value will influence the selection procedure (selection): the greater this value for an individual, the more likely it is to find a pair to cross (and not even one). In practice, after calculating the fitness for all individuals of the generation, we normalize these values ​​(so that the sum of fitness of individuals is 1) and lots are drawn for each of the kissing places (random number from 0 to 1), which determines the lucky one. The alpha male can get a few places, the loser will not get anything and will remain alone with the worn calendar 1994 with Pamela . This method of selection is called “selection by the roulette method”, and schematically it looks something like this:


There are other ways of breeding, but they all adhere to the general rule: the more an individual's fitness, the more she should participate in crossing. Also in the process you can include the option of elitism, when the best representative of the generation receives a merit for the service to the Fatherland in the form of additional years of life: it passes to the next generation without changes, although it can simultaneously make children. This allows us not to lose a very good solution, which can be destroyed in the process of crossing.

Immediately mention the mutation. This operation randomly changes the fragment of an individual with a certain small probability, which makes it possible to vary the gene pool. A useful thing, suddenly such a mutation of lactose will help break down! And if not, and one more hand is superfluous - then you will suffer with it until the end of your days, the offspring is still not enough to give chances.

Creations of the World and the Apocalypse


We found out how to go from generation to generation, now the question is: “what was the root cause, how did it all start?” Unlike in your world, we don’t need to come up with tricks like “big bang” or “7 days” to explain such things. Here the answer is very clear - it all started with the zero generation, which was created randomly. Yes, just generate random lines / trees. The only requirement is the correctness of the individual, and no one cares how bad it is; selection will do its job.

Our world exists as long as we need. We either set the bar for fitness that satisfies us, and when a fairly steep individual appears, stop the process, or check how much the individuals of the generation differ from each other. It is logical that if the whole generation consists of identical twins, further mating will not give anything new to the gene pool, and it is naive to hope for one mutation. You can also set a time limit.

Hey, you! Haroshsh soar brain! What is the result?


Let's pause in this fascinating verbiage and look back (well, that is, up). To sum up, the genetic algorithm looks like this:

We are learning to represent the solution of the problem in the form of an individual of a genetic algorithm — a list of fixed length over some alphabet. After that, we select the fitness function, which could evaluate the individuals, and randomly generate the zero generation. Here begins the cycle of free love: the fitness of individuals of a generation is calculated, couples are formed from these data (losers are thrown out, and alpha males are not limited to one pair), the remaining mates give birth to a couple of children (which also mutated) and impose hands on themselves. This continues until the elect is found, or changes cease to please us, or we are tired of this whole thing. Well, how can I do without the scheme:


Part two. The role of the genetic algorithm in the image of the Robocode bot.


Something the first part was delayed, we are all tired, so we will not repeat. We also omit some implementation features.
You can find out what Robocode is here: habrahabr.ru/blogs/programmers_games/59784 (the pictures are lost though). In short, this is a programmer game originally created for learning the peculiarities of the Java language, which allows participants to create their own bot robots and arrange battles between them. Each participant writes Java code that controls a small tank, and fights with other similar tanks.

We are faced with the following task: the development of an automated bot-tank control system using a genetic algorithm. The robot must be created and modified automatically, i.e. in the course of their evolution, “adapt” to a specific and pre-selected opponent in 1-for-1 battles.

How to present the solution to the problem as an individual


First, we define the capabilities of the tank. The list of basic actions that a robot can perform during a fight is limited to four points: turn the gun, turn the body, shoot, move. The fifth action, the rotation of the radar, we excluded from consideration by implementing it is trivial - a constant rotation (thus, the tank will always have relevant information about the position of the enemy).

Obviously, for successful combat, these actions must be performed not chaotically, but depend on the situation (state) on the battlefield: on the position of the tanks, their speeds, energy and other parameters. Thus, the process of controlling a tank is reduced to performing the above-described actions based on the state of combat. The law that determines the behavior of a tank (its actions) on the basis of the situation on the battlefield, we will call the control function, and it will be an individual of our genetic algorithm.

Since the control function must return 4 values ​​(the energy of the shot, the angle of rotation of the tower, the angle of rotation of the hull, the movement of the tank), then, as explained in the last part, it will consist of four expressions, i.e. of four rows / trees.

To compile a coding table, it is necessary to decide on a set of basic functions, variables and constants.

Functions:
+ (x, y) = x + y
++ (x, y, z) = x + y + z
n (x) = -x
* (x, y) = x * y
** (x, y) = x * y * z
min (x, y) = x> y? y: x
s (x) = 1 / (1 + exp (-x))
if (x, y, z, w) = x> y? z: w

Variables:
x, y - coordinates of the opponent’s tank relative to our tank;
dr - the distance that remains to "get" to our tank;
tr is the angle left to turn by our tank;
w is the distance from our tank to the edge of the field;
dh is the angle between the direction to the opponent's tank and the gun of our tank;
GH - the angle of rotation of the gun of our tank;
h is the direction of movement of the opponent's tank;
d is the distance between our tank and the opponent's tank;
e is the energy of the opponent's tank;
E is the energy of our tank.

Well and constants: 0.5, 0, 1, 2, 10

Fitness function


We describe how the fitness function was chosen. The results of the battle "Robocode" forms on the basis of many nuances. This is not only the number of victories, but also all kinds of points for activity, for survival, for hitting an opponent, etc. As a result, “Robocode” ranks the robots by the parameter “total scores”, which takes into account all the subtleties described above. We will use it when calculating the fitness of an individual: the final fitness will be equal to the percentage of our tank’s points of the sum of points of both tanks, and takes a value from 0 to 100. Accordingly, if the fitness value is more than 50, then our robot scored more points than the rival is therefore stronger than him. Note that according to such a counting system, the first place is not always the one who won in most rounds of the battle. Well, here we throw up our hands with the phrase about the scooter: the creators have defined the criteria, we follow them.

Generally speaking, the calculation of individual fitness includes a series of battles! Those. such a seemingly insignificant point, as a miscalculation of fitness, consists of such dances with a tambourine:
1) Our system stores the encoded chromosomes of an individual in the file chromosome.dat;
2) For each individual, the “Robocode” environment is launched, which organizes the duel. At the entrance to it, we submit a file format .battle, describing the conditions of the battle - a list of fighting tanks, the size of the field, the number of rounds, and more;
3) For the battle, Robocode loads the tanks, our shell robot reads the chromosome.dat file with the coded behavior, interprets it into a set of actions and fights according to them;
4) At the end of the fight, the Robocode environment records the result of the battle in the file results.txt and completes its work;
5) Our system selects this file, the parsit and extracts from it the total score of our tank and the opponent. By simple arithmetic we obtain the value of fitness.

How are our them, huh?


Let's sum up our design bureau. Our system consists of two parts (programs). The first one, based on the genetic algorithm, collects an individual and stores it as a set of strings, and the second (robot code) interprets it (processing into an expression tree) and controls the tank (calculating the value of expression trees for a given variable, that is, the current state) by recursive walking. fight). The first program is written in the SI language, the second - in the Java language.

When implementing the genetic algorithm, the number of individuals in the population was chosen to be 51 (25 pairs + one elite individual). One step of evolution (population change) takes about a dozen minutes, therefore, in total, the case is delayed for several hours.

As a result, we will demonstrate the results of creating an opponent to the Walls and Crazy robots:


In the first case, we stopped the process after reaching one of the individuals of fitness 70, in the second we had enough that the average fitness of individuals of a generation exceeds 50.

After contemplation, rinse eyes with alcohol.


If someone is not afraid to cry with bloody tears in convulsions from contemplating bydlokodinga (especially the hair will begin to move from the robot's code - we have mutual hatred with java), then I attach the sources of this project. It was developed and tested under Linux (last ubunt) with Robocode 1.7.2.2. The robot name is Hulk (because the victim is a mutation). He will tell you the rest himself (using readme).

Finally, I apologize for the scientific ignorance and puns, I was dropped from the bedside table as a child. I myself am always afraid that they will catch me, put me on a steamer and let me down the Volga. I look back.

upd: the method described above is the easiest and most primitive. In order for a built robot to be truly competitive for real robocode battles, continuous file work is necessary: ​​careful selection of basic elements (functions, variables and constants), probabilities, methods of crossing, selection and mutation ... with an appropriate time to evaluate each option. Therefore, in the comments it is reasonable to find suggestions for improving the bot, which anyone can do.

Source: https://habr.com/ru/post/111811/


All Articles