
Recently, more and more "talk" goes about new-fangled algorithms, such as neural networks and the genetic algorithm. Today I will talk about genetic algorithms, but this time let's try to do without abstruse definitions and complex terms.
As one of the great scientists said: “If you cannot explain your theory to your wife, your theory is worth nothing!” So let's try to figure everything out in order.
Pinch of history
As Wikipedia says: "The founding father of genetic algorithms is John Holland, who invented using genetics for his own purposes in 1975." For reference, the same year Altair 8800 appeared, and no, this is not a terrorist, but the first personal computer. By that time, John was already 46 years old.
Where is it used
Since the algorithm is self-learning, the range of applications is extremely wide:
- Tasks for graphs
- Layout tasks
- Scheduling
- Creation of "Artificial Intelligence"
Operating principle
The genetic algorithm is primarily an evolutionary algorithm, in other words, the main feature of the algorithm is the crossing (combination). As it is easy to guess, the idea of the algorithm is brazenly taken from nature, since it will not be sued for it. So, by going through and most importantly the selection turns out the right "combination".
The algorithm is divided into three stages:
- Crossbreeding
- Selection (selection)
- Formation of a new generation
If the result does not suit us, these steps are repeated until the result starts to satisfy us or one of the following conditions occurs:
- The number of generations (cycles) will reach a pre-selected maximum.
- Time for mutation exhausted
More on steps
Creating a new population . At this step, an initial population is created, which, quite possibly, will not be kosher, but it is likely that the algorithm will correct this problem. The main thing is that they conform to the "format" and be "adapted to breeding."
Reproduction . Well, everything is like that of people, it takes two parents to get a descendant. The main thing is that a descendant (child) can inherit their features from their parents. In this case, everyone breeds, not only the survivors (this phrase is especially absurd, but since we are all in a spherical vacuum, then everything is possible), otherwise one alpha male will stand out, the genes of which will block all the others, but for us this is not acceptable in principle .
Mutations . Mutations are similar to reproduction, from mutants choose a certain number of individuals and change them in accordance with pre-defined operations.
Selection Here the sweetest begins, we begin to choose from the population the proportion of those who "go on." At the same time, we define the share of “survivors” after our selection in advance with our hands, indicating as a parameter. Sadly, the rest of the individuals must die.
Practice
You have successfully listened to the "fairy tale" about the miracle algorithm, and it is quite possible that you were waiting for when we begin to exploit it at last, I want to please you, the time has come.
Let's look at the example of my favorite Diophantine equations (Equations with integer roots).
Our equation: a + 2b + 3c + 4d = 30You probably already suspect that the roots of this equation lie on the interval [1; 30], so we take 5
random values a, b, c, d. (The limit of 30 is taken specifically to simplify the task)
And so, we have the first generation:
- (1,28,15,3)
- (14,9,2,4)
- (13,5,7,3)
- (23,8,16,19)
- (9,13,5,2)
In order to calculate the survival rates, we substitute each solution in the expression. The distance from the received value to 30 will be the desired value.
- | 114-30 | = 84
- | 54-30 | = 24
- | 56-30 | = 26
- | 163-30 | = 133
- | 58-30 | = 28
Smaller values are closer to 30, respectively, they are more desirable. It turns out that larger values will have a lower survival rate. To create a system, we calculate the probability of choosing each (chromosome). But the solution is to take the sum of the inverse of the coefficients, and from this calculate the percentages. (
PS 0.135266 - the sum of inverse coefficients )
- (1/84) /0.135266 = 8.80%
- (1/24) /0.135266 = 30.8%
- (1/26) /0.135266 = 28.4%
- (1/133) /0.135266 = 5.56%
- (1/28) /0.135266 = 26.4%
Next, we will choose five pairs of parents who will have exactly one child. We will give the chance to the case exactly five times, each time the chance to become a parent will be the same and will be equal to the chance for survival.
3-1, 5-2, 3-5, 2-5, 5-3
As mentioned earlier, the descendant contains information about the genes of the father and mother. This can be achieved in various ways, but in this case a “crossover” will be used. (| = dividing line)
- H.-father: a1 | b1, c1, d1 H.-mother: a2 | b2, c2, d2 X.-descendant: a1, b2, c2, d2 or a2, b1, c1, d1
- H.-father: a1, b1 | c1, d1 H.-mother: a2, b2 | c2, d2 X.-descendant: a1, b1, c2, d2 or a2, b2, c1, d1
- H.-father: a1, b1, c1 | d1 H.-mother: a2, b2, c2 | d2 X.-descendant: a1, b1, c1, d2 or a2, b2, c2, d1
There are so many ways to transfer information to a descendant, and a cross over is only one of many. The location of the separator can be completely arbitrary, as the father or mother will be to the left of the dash.
And now we will do the same with the descendants:
- H.-father: (13 | 5,7,3) H.-mother: (1 | 28,15,3) H.-descendant: (13,28,15,3)
- H.-father: (9,13 | 5,2) H.-mother: (14.9 | 2,4) H.-descendant: (9,13,2,4)
- H.-father: (13,5,7 | 3) H.-mother: (9,13,5 | 2) H.-descendant: (13,5,7,2)
- H.-father: (14 | 9,2,4) H.-mother: (9 | 13,5,2) H.-descendant: (14,13,5,2)
- H.-father: (13,5 | 7, 3) H.-mother: (9,13 | 5, 2) H.-descendant: (13,5,5,2)
Now we calculate the survival rates of the offspring.
- (13,28,15,3) - | 126-30 | = 96 (9,13,2,4) - | 57-30 | = 27
(13,5,7,2) - | 57-30 | = 22
(14,13,5,2) - | 63-30 | = 33
(13,5,5,2) - | 46-30 | = 16
Sadly, the average fitness of the descendants was 38.8, and for parents this coefficient was 59.4. At this very moment it is more expedient to use a mutation, for this we replace one or more values with a random number from 1 to 30.
The algorithm will work until such time as the survival rate is zero. Those. will be a solution to the equation.
Systems with a larger population (for example, 50 instead of 5 converge to the desired level (0) more quickly and stably.
Code
This is where the simplicity ends and the wonderful C ++ begins ...
A class in C ++ requires 5 values during initialization: 4 coefficients and the result. For the example above, it would look like this: CDiophantine dp (1,2,3,4,30);
Then, to solve the equation, call the Solve () function, which will return the allele containing the solution. Call GetGene () to get a gene with the correct values a, b, c, d. The standard procedure main.cpp using this class can be:
')
#include "<iostream.h>" #include "diophantine.h" void main() { CDiophantine dp(1,2,3,4,30); int ans; ans = dp.Solve(); if (ans == -1) { cout << "No solution found." << endl; } else { gene gn = dp.GetGene(ans); cout << "The solution set to a+2b+3c+4d=30 is:\n"; cout << "a = " << gn.alleles[0] << "." << endl; cout << "b = " << gn.alleles[1] << "." << endl; cout << "c = " << gn.alleles[2] << "." << endl; cout << "d = " << gn.alleles[3] << "." << endl; } }
The CDiophantine class itself:
#include <stdlib.h> #include <time.h> #define MAXPOP 25 struct gene { int alleles[4]; int fitness; float likelihood; // Test for equality. operator==(gene gn) { for (int i=0;i<4;i++) { if (gn.alleles[i] != alleles[i]) return false; } return true; } }; class CDiophantine { public: CDiophantine(int, int, int, int, int);// Constructor with coefficients for a,b,c,d. int Solve();// Solve the equation. // Returns a given gene. gene GetGene(int i) { return population[i];} protected: int ca,cb,cc,cd;// The coefficients. int result; gene population[MAXPOP];// Population. int Fitness(gene &);// Fitness function. void GenerateLikelihoods(); // Generate likelihoods. float MultInv();// Creates the multiplicative inverse. int CreateFitnesses(); void CreateNewPopulation(); int GetIndex(float val); gene Breed(int p1, int p2); }; CDiophantine::CDiophantine(int a, int b, int c, int d, int res) : ca(a), cb(b), cc(c), cd(d), result(res) {} int CDiophantine::Solve() { int fitness = -1; // Generate initial population. srand((unsigned)time(NULL)); for(int i=0;i<MAXPOP;i++) {// Fill the population with numbers between for (int j=0;j<4;j++) {// 0 and the result. population[i].alleles[j] = rand() % (result + 1); } } if (fitness = CreateFitnesses()) { return fitness; } int iterations = 0;// Keep record of the iterations. while (fitness != 0 || iterations < 50) {// Repeat until solution found, or over 50 iterations. GenerateLikelihoods();// Create the likelihoods. CreateNewPopulation(); if (fitness = CreateFitnesses()) { return fitness; } iterations++; } return -1; } int CDiophantine::Fitness(gene &gn) { int total = ca * gn.alleles[0] + cb * gn.alleles[1] + cc * gn.alleles[2] + cd * gn.alleles[3]; return gn.fitness = abs(total - result); } int CDiophantine::CreateFitnesses() { float avgfit = 0; int fitness = 0; for(int i=0;i<MAXPOP;i++) { fitness = Fitness(population[i]); avgfit += fitness; if (fitness == 0) { return i; } } return 0; } float CDiophantine::MultInv() { float sum = 0; for(int i=0;i<MAXPOP;i++) { sum += 1/((float)population[i].fitness); } return sum; } void CDiophantine::GenerateLikelihoods() { float multinv = MultInv(); float last = 0; for(int i=0;i<MAXPOP;i++) { population[i].likelihood = last = last + ((1/((float)population[i].fitness) / multinv) * 100); } } int CDiophantine::GetIndex(float val) { float last = 0; for(int i=0;i<MAXPOP;i++) { if (last <= val && val <= population[i].likelihood) return i; else last = population[i].likelihood; } return 4; } gene CDiophantine::Breed(int p1, int p2) { int crossover = rand() % 3+1;// Create the crossover point (not first). int first = rand() % 100;// Which parent comes first? gene child = population[p1];// Child is all first parent initially. int initial = 0, final = 3;// The crossover boundaries. if (first < 50) initial = crossover; // If first parent first. start from crossover. else final = crossover+1;// Else end at crossover. for(int i=initial;i<final;i++) {// Crossover! child.alleles[i] = population[p2].alleles[i]; if (rand() % 101 < 5) child.alleles[i] = rand() % (result + 1); } return child;// Return the kid... } void CDiophantine::CreateNewPopulation() { gene temppop[MAXPOP]; for(int i=0;i<MAXPOP;i++) { int parent1 = 0, parent2 = 0, iterations = 0; while(parent1 == parent2 || population[parent1] == population[parent2]) { parent1 = GetIndex((float)(rand() % 101)); parent2 = GetIndex((float)(rand() % 101)); if (++iterations > 25) break; } temppop[i] = Breed(parent1, parent2);// Create a child. } for(i=0;i<MAXPOP;i++) population[i] = temppop[i]; }
The article is based on Wikipedia and the site algolist.manual.ru .