Most recently, I wrote
an article in which, without explanation, I showed what the method of grammatical evolution is capable of. I fully agree that it is impossible to do this, but as I wanted to show the results of an interesting method. I thought "what would be better: translate the source or give my own explanation." Laziness got the better.
If someone is interested in evolutionary methods and the task of symbolic regression (and not only), then I ask you to read.
Backusa Naur Form
First you need to say what context-independent grammar is in the form of Backus-Naur (abbreviated BNF). About formal languages ​​and grammars on Habré there is already quite interesting
article . I highly recommend reading. But our goal is to understand what the BNF is and learn how to use this form.
')
Wikipedia gives a completely adequate definition:
BNF is a formal syntax description system in which certain syntactic categories are sequentially defined through other categories. BNF is used to describe context-free formal grammars.
BNF has terminal and non-terminal symbols. Terminals - constants. We set them up and that's it, but with non-terminal symbols everything is much more interesting: they can be substituted into each other by the rules.
Consider an example. We have the following formal system for describing the syntax of context-free grammar:
In our example, many non-terminal characters
And the set of terminal symbols is represented as
The set S will contain one element.
This element will be the entry point for building and working with rules.
Now, having one non-terminal element, we can make constructions that will correspond to our grammar.
Watching the chains
1) <sntnce> => <adverb> <verb> => quickly <verb> => quickly ran2) <sntnce> => <noun> <verb> => Peter <verb> => Peter fellThe question arises: on what basis do these substitutions occur? I will answer this question a little later.
Genetic algorithm
It seems to me that this algorithm is canonical in evolutionary circles. It is simple to implement and works well. Consideration of this algorithm is necessary in order to understand what the mechanism will be as a “engine” for the method of grammatical evolution. But (!!) in its place can be any other, convenient for you, evolutionary algorithm.
So, GA uses the behavior of nature. In fact, nothing new was invented. This algorithm has been running for millions of years. Thanks to him. After all, if not for him, then we would not be.
GA consists of several stages.
(1) Creating an initial population (creating a chromosome for each individual)(2) Calculating the fitness function of each individual (it is this that shows who is best suited in a given population)(3) Selection of the best representatives for the education of further offspring.(4) Crossover(5) Mutation(6) After (4) we get children, some of which go through (5). At the exit we get offspring(7) The selection of fathers and children in the new generation(8) return to step (2) if the values ​​given by the children do not suit usChromosome is a coded representation of the information we need. In the source uses a binary representation. Those. if we need to find 4 parameters (each of them in the range from 0 to 15), then each parameter will require 4 bits (zero or one). And the chromosome itself will be 16. Everything is pretty primitive.
Important : you can use the decimal representation. I will do so for grammatical evolution.
Now a little about the operators in the GA and all sorts of fitness functions.
Fitness function - the functionality that we need to optimize. It varies from task to task. If there is a question in minimizing the functional, then for selection parents are needed who possess as little fitness function as possible.
The crossover is a cool thing. In genetic programming, by the way, this operator is almost the only one to produce offspring with the best qualities. The bottom line is that we take two parents (or rather their genotype). Divide it in half. And change places. I'll show you now.
There are two lists:
This was an example of a point crossover. There are other variations on this topic, but we will not discuss them.
Mutation is the process of accidentally replacing one or another gene.
Frequently used method of selection in the new generation is elite. We simply take the n best individuals from the list of children + parents. And then we supplement the population randomly to the desired amount.
Important: the size of the chromosome can be both fixed and arbitrary. The same applies to the size of the population.
Grammatical evolution
And now about the most important thing. What kind of method is it and what is it eaten with.
The very essence is that you have a task that needs to be solved. You build grammar in the form of Backus-Naur. Create an initial population in which each individual will have his chromosome, describing what rules when, where, to substitute. The importance lies in the fact that thanks to these rules you get a function that you can use for calculations. The value of this function with parameters set in advance by it (or not) can act as a functional (fitness function). The better the functional, the better the function, and hence the individual with his chromosome.
More about the chromosome.
Suppose we have the following grammar
<e> :: = <e> <op> <e> | <val>
<val> :: = x | 3 | 2
<op> :: = + | - | * | /
Next we have this chromosome
Initially, the symbolic representation of the function contains one non-terminal symbol: H = <e> (as a rule).
We take the first element from chromo: 2. We count how many rules are in the transformations in <e>: 2. Divide 2% 2 (modulo !!) = 0. So, instead of <e>, we substitute <e> <op> <e>. Thus H = <e> <op> <e>. Two from chromo throwing out. She is no longer needed.
The next step is one. and look again at <e>. 1% 2 (the number of substitution rules) = 1. So instead of <e> we substitute <val>. We get H = <val> <op> <e>.
If you make these simple manipulations further, you get such a chain
Next, we use this function to calculate the functionality and determine how much an individual with this phenotypic (so called function) and genotype (chromosome) suits us.
It's all. Yes, there are several substitution options: in depth (considered), in width, the so-called

- substitution. But this is material for a separate article.
And now an example.
There is a time interval
![t \ in [-5,5]](https://tex.s2cms.ru/svg/%20t%20%5Cin%20%5B-5%2C5%5D)
. There is a function
%20%3D%201%2Bx%2Bx%5E2%2Bx%5E3%20)
. It is necessary to find a function that would give the minimal quadratic error - the functional of the given problem.
Look at the code
main module
The foo function generates values ​​that we will compare with those that will be obtained from the functions created by each individual.
Chromosome length = 15;
Population length = 100;
The number of iterations (evolution) = 20;
Parser module
The grammar dictionary contains rules for grammar. Further, the substitution algorithm, which I described above. After the while block is needed to complete the function. There are cases when the chromosome is over, and non-terminal characters remain. Here is the last cycle and is needed in order to replace all non-terminals with terminals.
Important : not the fact that the final function will turn out to be valid from the point of view of semantics (it can divide by zero and all that).
GE module
The usual implementation of the genetic algorithm.
The GA function is the entry point to the evolutionary cycle.
In general, the functions speak for themselves, and the implementation of each function is not very long. I note that the selection for the crossover is not standard. I just bother my parents and choose the first half of the mixed pile. For this task it is not very harmful, but it is better not to do so. There are a dozen (or even more) methods that allow you to select the best candidates for the crossover.
Each individual has three fields: genotype, phenotype, value of fitness function.
Eventually
getting function
which is identical
I note that the creation of an optimal grammar is not a trivial and very important task.
I hope it has now become clear what kind of “grammatical evolution” is for the method, and how to use it to solve problems. An interesting fact is that symbolic regression is applicable not only for functions. We can create instructions for robots, as well as build the structure of the neural network and adjust the parameters in it. We do not need a back propagation method for learning the network. Along with it, you can refuse from the functions of activations with a continuous first derivative. This is the so-called "neuroevolution programming".
Once again I give a link to the
source . If anyone wants to further understand the method.