Wandering through the expanses of the Internet, I became interested in such a thing as genetic programming . If in a nutshell, it is the automatic creation of programs that perform a particular goal, in accordance with the principle of natural selection. That is, first a generation of “creatures” programs is created at random, which are sorted according to different criteria (proximity to the goal), then some of them mutate (also randomly), some die out and some are replaced by new random creatures.
Thus, the most worthy creatures continue their work and give descendants, while the weakest are eliminated in the selection process. Several experiments and their results - under the cut.
The first tests. Passage through the maze
As a test of the pen, I decided to teach the creatures (we will call them that because the term “program” is too loud for this case) to look for a way in the maze, as in the picture above. At once I will make a reservation that in this case the goal was not to derive a creature that would find a way in an arbitrary maze with the help of genetic engineering, but to find a way in one particular maze, and so that creatures had no senses. After some thought, it was decided to use a simple set of commands that can be easily mutated and executed by the creature. So, our “program” or “genome” is just a set of commands in which direction the creature should move. In addition, a loop was added that will be able to repeat the program section several times. The genome may be, for example: ')
Here the FOR (IxT) command means that the following I instructions will be executed T times. A creature with such a genome will pass 2 cells to the right, then 2 cells down and again 2 cells to the right.
Now we will deal with our being, which is essentially an interpreter of the genome. Each of our creatures will have a “name”, or an identifier, by which we will recognize it later, a “gene”, according to which our creature will act, the Step method, by which the creature will perform the next gene (at the same time we will check if creature through the walls), the Fitness function, which will determine how close the creature is to the finish cell, and the Mutate method, which will change part of the genome. About mutations a little later, but for now let's see how our cute box performs our test program:
Fitness, or compliance with our requirements will be considered as the sum of the differences of coordinates of the finish and the creature. So, the creature with the shortest distance to the target will be the most appropriate.
Evolution
Creatures will evolve generation after generation, with 100 creatures in each. Within each generation we will repeat the same steps:
Check creatures for compliance (calculation of Fitness);
Sort creatures by increasing the distance to the target;
Depending on the position in the list: - either we will not change the creature, with the assumption that it is already good enough, and there is no need to spoil it; - either we change the genome: we delete, add, change part of the genes; - or we replace it with a new random creature, while the old one “dies out” as unworthy of the continuation of the race.
By the method of typing, we will choose this attitude: we do not change the top 10, the next 60 mutate, the other 30 die out. We try, on the 292 generation we get the result:
As you can see, the creature often marks time in one place, which, in general, is not very good. Let's try to take into account the length of the genome. But we will put it on the second place in importance after the distance to the finish, so that it does not turn out that the short programs of 2-3 genes that are at the start turned out to be better compared to those that are closer to the goal, but have a long genome. Method, again, spear, try the formula:
Now the search for a solution goes a little longer. At 335 generation, we get exactly the same result, but with the following genome:
Actually, there is still much that can be improved, but the problem is solved, with an acceptable quality and with acceptable labor costs. I intentionally do not cite my code due to its flaws. Therefore, with a clear conscience, you can go to the topic in the title.
ELTRUT problem
In 2012, a graduate of the University of California, Benjamin Bush was engaged in genetic algorithms and, in particular, the so-called ELTRUT-problem. The essence of the problem is this: there is a bitmap image, find a sequence of commands for the logo bug , which will draw a similar image. Here is a video where everything is fairly accessible explained:
In short, Benjamin proposes this solution: discard the simple linear genome in favor of the two-dimensional one. Why? Because the two-dimensional genome is redundant, so that not every mutation will provoke a radical change in the genome. Benjamin also offers an interesting form of representation of the genome: each pixel of the image is associated with a gene that points in which direction and how far the turtle will go, if it gets to this point, and whether it needs to draw a line. Visually, the genome can be displayed like this:
Here the excess genome is highlighted in brown. It is seen that in almost 80% of mutations there will be no change in the behavior of the bug. In addition, when solving this problem, it is proposed to use “sexual reproduction”, or crossing-over : two creatures of the previous generation exchange genes to get descendants. Next comes the three-dimensional genome and the algorithm optimization, but I didn’t want to do optimization, and the three dimensions are a bit too much to familiarize with the genetic algorithms, so I decided to dwell on the implementation of the simplified algorithm.
So, similar to the previous task, in each generation we will have 100 creatures. In each generation we will:
Cross the 50 best creatures and replace the 50 worst with their descendants;
Change one gene to each creature, except the top 10;
If the creature has already completed its task, then it will not mutate.
How will we determine the degree of conformity of creatures? For each pixel correctly drawn we will add a certain amount of points to the creature, we will penalize for each incorrectly drawn pixel. If there, where pixel is not painted over, and it should not be painted over, do not change the account. Total approximate scoring algorithm:
We will immediately solve another problem (it should be solved by the three-dimensional genome): if the bug gets on the pixel where it already was, then the program will loop. Therefore, we will make it so that stepping on the visited pixel, the turtle stops. At the same time, we will not chase the poor reptile until our blue spots.
So, we try. Image 5x5, not very difficult, can we get the result?
For 61 generation, the program found an algorithm for the bug:
Not bad, but how will the program cope with a somewhat large image, for example, 8x8?
The video is greatly accelerated (the previous one was also accelerated, but slightly). 2000 generations did not give the correct result. One could try to continue, but as can be seen from the schedule, for about 1000 generations no changes occurred, so this is unlikely to solve the problem. Well, the dislike for optimization did the trick: it took about 2 hours to image 8x8 on my computer. So, the experiment is recognized as partially failed.
Thanks for your time, I hope this article will seem interesting to someone.