
Foreword
Hello, Habr! I would like to offer you a simple applied lesson on genetic algorithms. If you are familiar with them and working with them, then reading it is a waste of time. This lesson is for those who want to start using, but do not know how. It is assumed that you are already familiar with the meaning of genetic algorithms, imagine a little how they work.
The essence of genetic algorithms
Some may skip this part. A lot of where it was said, I just repeat for order. Like any multivariate optimization algorithm, genetic algorithms are designed to minimize the objective function and give the minimum set of input arguments. A pleasant feature of genetic algorithms is the theoretical possibility of obtaining a global minimum of the function, read "get the best possible answer." How this is achieved, I will explain a little later. Genetic algorithms, in their essence, are nothing but an approximate simulation of the evolution process. They operate with the concepts of "gene" and "population". A “gene” is a set of input arguments, and a “population” is a set of genes. In classical genetic algorithms to get the gene from the function arguments, they are simply written in a row in binary code. We get such a huge binary snake. As in the natural process of evolution, there is a natural selection. Those genes for which the value of the function is the most are eliminated, the best ones go to the next generation without changes (elite). Others intersect with other genes (cross-over). The way of crossing can be a miscellaneous. In the simplest case, each gene from an arbitrary pair is cut into two parts and the end of the second is attached to the beginning of the first gene, and the end of the first is attached to the beginning of the second. The proportions of the section, of course, should be equal for each of this pair. There is one more, usually not numerous, group of mutant genes, in which one or several bits change randomly. It is they and the cross-over that provide a global minimum, even if we are in a local one.
Formulation of the problem
In the VC social network there is a rather interesting game called “Turn on the light”. It was she who aroused the desire to write an article, since the tasks that the user solves in it are just great for genetic algorithms. In three words, the essence of the game is as follows: it is necessary to switch the power station and the houses with cables. On each cell of the playing field there is one of the four types of cables: L-shaped, T-shaped, I-shaped, half I-shaped to connect the house (denoted by H). The player is free to rotate them in increments of 90 degrees. In general, it is better to try first to grasp.
Have tried? Go ahead. It is necessary to build the objective function of the state of the playing field. Input data is the angle of rotation of each of the cable fragments. It is easy to understand that you can encode it with 2 bits. The whole field at the beginning of the game 5x5 kletok. Therefore, the length of the gene in the binary code will be 5x5x2 = 50 bits. The objective function will take the array of zeros and ones as input (there is such an opportunity in Matlab).
How to evaluate the quality of the state of the playing field? I suggest using the number of connections between adjacent cells. The score of the playing field is the sum of the cell ratings. Each connection of a cell with a neighbor is a score. Neighbors in cells 4: upper, lower, left, right.
Features of the implementation
I would not want to paste the source code and disassemble them line by line, as some people like to do it on Habré. For this, the git repository will also come down:
github.com/player999/turnonthelight . Comments in the readme should help. Now I advise you to open the source and read them along with the article. I deliberately made them quite simple. Just touch on the important points of implementation. The input of the program gets the initial state of the playing field in a text file approximately as follows:
H1 H0 I1 I1 I3
I1 H2 T1 T1 I3
L3 L2 H1 L0 H1
H0 L1 T3 I0 T3
L0 I0 L0 L0 L0
Here the letter is a cell type, and the number is an orientation from 0 to 270 degrees in increments of 90 degrees clockwise. This field is read using the scheme_config function, which is called from the main run script. Then you can apply the genetic algorithm to it by calling the function solve_scheme. In the run script, the target function is denoted as fun, which is a wrapper for the eval_bits function. eval_bits accepts two binary strings that describe the playing field. The first is cell types, the second is their orientation. Naturally, during the operation of the genetic algorithm, the first parameter should be constant. The player can not change the types of cells. These strings are built up of the array of cells of the playing field with the function scheme2string. eval_bits is a wrapper for eval_scheme. This is the core of the objective function, and eval_bits simply converts the arguments into a matrix that describes the playing field.
I hope it is clear how the objective function works. Now to the very minimization. First you need to get a structure with the options of the genetic algorithm. There are many of them and for every taste. But indicated only the most important.
options = gaoptimset('PopulationType','bitstring','PopulationSize',1000,'Generations',100);
PopulationType - the representation of the arguments. In our case, the bit string. And even though the word "string" appears here, in fact it is an array of double ones and zeros.
Population Size - how many genes will be on each iteration. The more genes, the better the algorithm will work, but at the same time it will be slower. For each vedb gene, one needs to calculate the objective function.
Generations - the number of iterations of the genetic algorithm. This is one of the possible exit conditions. Another, for example, is a way out when the change in the objective function stops.
This is followed by a call to the optimization function:
opt = ga(fun,length(pos),[],[],[],[],[],[],[],options);
Arrays with restrictions are omitted here, since they are not relevant for this task. length (pos) represents the number of arguments. More details about these features can be found in the Matlab Help, it is gorgeous. I also recommend playing with the Optimization Toolbox settings (optimtool command), all the same settings clothed in a graphical interface for convenience.
Next, the best gene is transformed using string2scheme back into the playing field and the play_field is drawn by the draw_scheme function.
')
findings
Although the algorithm also requires adjustment, but it confidently proved that it can search for solutions to the puzzle. I hope the article has helped you a little, or at least entertained. Here is the result of the work, for example.
