EvoJ is a convenient framework for genetic algorithms.

Hello colleagues!

There are often articles on the topic of genetic algorithms, and allow me to deposit my five kopecks.

For a couple of years now, I have been developing a Java framework dedicated to GA for my hobby. When I first started working with GA, the biggest inconvenience was the need to vectorize the variable components of the solution, so in my framework I tried to make the vectorization of variables transparent to the programmer, putting all the dirty work on the framework’s shoulders. In addition, since GA is very well parallelized, I tried to make the transition to multi-threading no less easy.

To understand the basic idea of the EvoJ, consider the following example. For simplicity, we assume that we need to find the minimum of the function Z(x, y)=12 * x^2 + 8 * x + 9 * y^2. Naturally, this task can be more effectively solved without resorting to the GA, but first let's consider it, and then analyze the example revealing all the power of EvoJ.
')
So, first of all we need to create a Java interface containing the variables we need. We have two variables: X and Y; we will create an interface containing getters for these variables.

 public interface Solution { @Range(min = "-10", max = "10") double getX(); @Range(min = "-10", max = "10") double getY(); }

Please note that it is not necessary to declare setters, EvoJ will be able to change variables without them. However, if you want to implement your own mutation strategy, you will have to declare the setters - otherwise you will not be able to change the variables yourself.

Pay attention to the @Range annotation, it sets the range of values that a variable can accept. Variables are triggered by random values from a specified range. However, as a result of mutation, they can potentially go beyond the specified range. This can be prevented by using the strict=”true” parameter, which prevents the variable from accepting an invalid value, even if you try to set it using the setter method.

Another point to pay attention here is that all parameters of all annotations in EvoJ are strings, this allows you to directly specify the value of the parameter, and specify the name property instead of a specific value so that you do not strictly specify the values of the annotation parameters in the compile time.

So, we have an interface with variables, let's write a fitness function. The Fittness function in EvoJ is represented by the interface:

 public interface SolutionRating<T> { Comparable calcRating(T solution); }

Here, the parameter T is our interface with variables. The larger the return value (according to the Comparable contract), the more Comparable solution is considered. You can return null , it is considered the smallest value of the fitness function.

It is recommended to implement this interface indirectly, through helper-classes, which take on some service functions: the destruction of old solutions (if the maximum lifetime of the solution is set), the caching of the function value for solutions that were not deselected on the previous iteration of the GA.

Fittness function for our case will be as follows:

 public class Rating extends AbstractSimpleRating<Solution> { public static double calcFunction(Solution solution) { double x = solution.getX(); double y = solution.getY(); return 12 * x * x + 8 * x + 9 * y * y; } @Override public Comparable doCalcRating(Solution solution) { double fn = calcFunction(solution); if (Double.isNaN(fn)) { return null; } else { return -fn; } } }

Everything is pretty obvious here, we just take and consider our function using variables from our first interface. Since we are looking for a minimum, and the class contract assumes that the best solutions should have a higher rating, we return the value of the function times -1.

In addition, we eliminate the left-hand solutions (if NaN is obtained) by returning null .

Well, we have two main components, let's make it all work.

 DefaultPoolFactory pf = new DefaultPoolFactory(); GenePool<Solution> pool = pf.createPool(200, Solution.class, null); DefaultHandler handler = new DefaultHandler(new Rating(), null, null, null); handler.iterate(pool, 200); Solution solution = pool.getBestSolution();

Let's analyze the code above in the lines:

 DefaultPoolFactory pf = new DefaultPoolFactory();

This factory will give us many initial solutions, if you remember, our solution is represented by an interface, not a class, so we cannot create an instance ourselves.

 GenePool<Solution> pool = pf.createPool(200, Solution.class, null);

Here a population of 200 random solutions is created (gene pool). Pay attention to the last parameter - there should be a Map<String, String> for with values of perpertei, if you did not explicitly specify the annotation parameters.

 DefaultHandler handler = new DefaultHandler(new Rating(), null, null, null);

Handler is the embodiment of a genetic algorithm, it is a combination of strategies for mutation, crossing, selection and rating calculation. Of these, it is necessary to independently implement only the rating calculation - for everything else, there are standard implementations that will be used if, instead of the corresponding parameter, null passed.

 handler.iterate(pool, 200);

This line performs 200 iterations of the GA over our population. Since the task is rather primitive, 200 iterations are quite enough to find the minimum of our function. (The search for a solution for more complex problems may require several hours and millions of iterations, fortunately the selection of a solution can be accelerated using multithreading, but more on that later)

 Solution solution = pool.getBestSolution();

Here we get the best solution found. On an instance of this interface, you can freely call getX()/getY() to get the values of the variables you are getX()/getY() for.
If you are not satisfied with the solution, you can continue with the GA iteration until the desired quality of the solution is achieved.

Thus, to solve the problem with the help of EvoJ:

Create an interface with variables
Implement a fitness interface
Create a population of solutions and implement the necessary amount of GA iteration over them using the code above

Well, it was a warm-up, now let's see what EvoJ really can.

Suppose we want to approximate a full-color image with a limited set of polygons. The solution of such a problem cannot be described by a single interface with simple variables; here, the whole power of EvoJ is manifested. We can declare the following set of interfaces:

 public interface PolyPicture { @ListParams(length = "50") List<Polygon> getPolygons(); } public interface Polygon { Colour getColor(); @ListParams(length = "8") List<Point> getPoints(); } public interface Point { @Range(min = "-5", max = "325", strict = "true") float getX(); @Range(min = "-5", max = "245", strict = "true") float getY(); void setX(float x); void setY(float y); } public interface Colour { @MutationRange("0.2") @Range(min = "0", max = "1", strict = "true") float getRed(); @MutationRange("0.2") @Range(min = "0", max = "1", strict = "true") float getGreen(); @MutationRange("0.2") @Range(min = "0", max = "1", strict = "true") float getBlue(); void setRed(float red); void setGreen(float green); void setBlue(float blue); }

At the top, the interface is the whole picture, it consists of a single property - a list of polygons. Each polygon is a color plus a list of points. A point is two coordinates, a color is three components.

Of the new annotations, it is worth mentioning @ListParams - specifying the length of the list and some other properties, and @MutationRange - specifying the maximum mutation radius of a variable, i.e. how much the maximum variable can change in one act of mutation.
Also note that the variables describing the color components have a fixed range of values ( strict=”true” ).

Next, we need to implement a fitness function, as we did last time by expanding the helper class. Below, to save space, an incomplete class code is given (the original code is designed for multithreading and contains many unobvious optimizations that make it difficult to read).

 public class PictureRating extends AbstractSimpleRating<PolyPicture> { public PictureRating(BufferedImage originalImage){....} @Override protected Comparable doCalcRating(PolyPicture picture) { BufferedImage testImage = drawPicture(picture); int[] originalPix = getPixels(originalImage); int[] testPix = getPixels(testImage); return -calcError(originalPix, testPix); } }

Here we draw the picture described by our interfaces, take its pixels and compare it with the pixels of the original image. As a result, we return the accumulated deviation with a minus sign (more error - worse solution, less error - better solution).

Next, we will write code similar to the example where we were looking for the minimum of the function.

 DefaultPoolFactory factory = new DefaultPoolFactory(); GenePool<PolyPicture> pool = factory.createPool(40, PolyPicture.class, null); DemoRating rating = new DemoRating(PICTURE_TO_APPROXIMATE); MultithreadedHandler handler = new MultithreadedHandler(2, rating, null, null, null); handler.iterate(pool, 1000); handler.shutdown();

There is one difference - MultithreadedHandler . It differs from a simple handler in that it iterates the GA over a solution population in several threads (the number of threads is specified by the first constructor parameter). Thus, if your processor has several cores, you can really speed up the search for a solution just by increasing the number of threads. For example, the solution of acceptable quality for the considered problem of image approximation on Core i7 with HyperThreading enabled in 8 threads is in 15-20 minutes.
Multithreaded code in EvoJ is written to minimize blocking, which allows to obtain almost linear performance gains with an increase in the number of threads (for maximum performance, the number of individuals in a population must be a multiple of the number of threads, otherwise some threads will not have enough tasks).

You should also pay attention to the line

 handler.shutdown();

It is needed to complete pending executor threads.

So, this is all I wanted to tell you about EvoJ. As a summary, here are the main features of the EvoJ:

no need to independently vectorize variables, you can immediately write code in terms of the domain domain
standard implementations of the main GA strategies (selection, crossing, mutation)
declarative mutation control (annotations)
multithreading out of the box
small size (120 kb) and no external dependencies (yes, you only need one jar file)
extensibility at any point (you can write your own implementation of any strategy for GA - selection, crossing and mutation)

If you are interested, you can add EvoJ to your Maven project as follows:

First you need to add a link to the repository (I have not yet laid out to the central one)

 <repositories> <repository> <id>EvojHome</id> <name>EvoJ</name> <url>http://evoj-frmw.appspot.com/repository</url> </repository> </repositories>

then direct dependency

 <dependency> <groupId>net.sourceforge.evoj</groupId> <artifactId>evoj</artifactId> <version>2.1</version> </dependency>

Those who are not friends with Maven can download EvoJ directly from the project site http://evoj-frmw.appspot.com/ . There is also a copy of this tutorial in English, as well as detailed technical guidance and the results of using EvoJ for various tasks (approximation of images, training of neural networks).

Thanks for attention. I am pleased to answer your questions.

UPD: Below are examples of what comes out of an approximation.

Source image	Matched image

Source: https://habr.com/ru/post/138276/

All Articles

EvoJ is a convenient framework for genetic algorithms.

More articles: