Dear friends, Good day!
Optimization algorithms of this kind, such as genetic algorithms, have been discussed here more than once. However, there are other ways to search for an optimal solution in problems with many degrees of freedom. One of them, and arguably one of the most effective, is simulated annealing, which has not yet been described here. I decided to correct this omission and introduce you to this remarkable algorithm in the simplest possible terms, along with an example of its use on a small problem.
I understand why genetic algorithms are so popular: they are vivid and therefore easy to grasp. It really is easy, and quite engaging, to picture the solution of a problem as the evolution of a population of living beings with certain properties. The annealing algorithm, too, has a prototype in the real world (as the name suggests), but it comes from physics rather than biology. The process it simulates resembles the way a substance forms a minimal-energy crystalline structure as it cools and solidifies: at high temperature the particles move randomly, then gradually slow down and freeze in the places of lowest energy. In the mathematical problem, the role of the particles is played by the parameters, and the role of energy by the function that measures how good a given set of parameters is.
To begin, a reminder of the kind of problem we are attacking with such exotic methods. There is a class of problems whose exact solution is extremely hard to find because of the large number of variable parameters. In general, we have a function F whose minimum (or maximum) we need to find, and a set of parameters x1..xn, each of which can vary independently within its own bounds. Both the parameters and the function may be discrete or continuous. Such a function typically depends on its inputs in a complicated way and has many local minima, while we are after the global one. There is always the option of exhaustive search, but it works only in the discrete case and only for a very small input space. This is where optimization algorithms come in.
To keep things concrete, let me state right away the problem I will solve in this article. Imagine a grid board of size m by l, with cells of side one. On the board we need to place n points so that the length of the shortest polyline connecting them is as large as possible: a typical minimax problem. In this case, the function F computes the length of the shortest polyline for a given arrangement of points, and its parameters are the x and y coordinates of the points, i.e. 2 * n parameters in total.
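To make the objective concrete, here is a tiny hand-check (the class name MinPathDemo and the fixed points are illustrative, not part of the article's code): for three points forming a 3-4-5 right triangle, the shortest open path visiting all of them passes through the corner point, giving length 3 + 4 = 7.

```java
// Hypothetical illustration of the function F for three fixed points:
// the minimal polyline is the shortest open path visiting all of them.
public class MinPathDemo {
    static double d(int[] p, int[] q) {
        return Math.hypot(p[0] - q[0], p[1] - q[1]);
    }

    public static double minPolyline(int[][] pts) {
        // brute force over the 3! = 6 orderings of three points
        int[][] orders = {{0,1,2},{0,2,1},{1,0,2},{1,2,0},{2,0,1},{2,1,0}};
        double best = Double.MAX_VALUE;
        for (int[] o : orders) {
            double len = d(pts[o[0]], pts[o[1]]) + d(pts[o[1]], pts[o[2]]);
            best = Math.min(best, len);
        }
        return best;
    }
}
```

For the points (0,0), (3,0), (0,4) the six orderings give lengths 7, 8, and 9 (each path and its reverse have the same length), and the minimum of 7 is achieved by visiting (0,0) in the middle.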
The algorithm itself
So, we have a function that characterizes the system and a set of coordinates on which it is defined. First we need to set the initial state of the system; any random state will do. Then, at each step k, we:
- compare the current value of F with the best found so far; if the current value is better, update the global best
- randomly generate a new state; its probability distribution should depend on the current state and the current temperature
- compute the value of the function at the generated point
- decide whether to accept the generated state as the current one; the probability of acceptance should depend on the difference between the function values of the generated and current states and, of course, on the temperature (the higher the temperature, the more likely we are to accept a state worse than the current one)
- if the new state is rejected, generate another one and repeat from the third item; if it is accepted, proceed to the next iteration, lowering the temperature (though more often the transition to the next step is performed in any case, to avoid looping for too long)
The process stops when a certain temperature is reached: the substance has "cooled" at a point of minimal energy.
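The steps above can be sketched in a few lines of Java. This is only a minimal generic illustration, not the article's implementation: the names (Annealing, energy, anneal), the Gaussian neighbor step, the geometric cooling schedule, and the classic exp(-dE/T) acceptance rule are all assumptions chosen for brevity.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Minimal sketch of the general scheme: "energy" plays the role of F (minimized),
// the neighbor is a random Gaussian step scaled by the current temperature.
public class Annealing {
    public static double anneal(double start, DoubleUnaryOperator energy,
                                double t0, double tEnd, double cooling, long seed) {
        Random r = new Random(seed);
        double current = start;
        double best = start;
        double t = t0;
        while (t > tEnd) {
            // generate a new state near the current one
            double candidate = current + r.nextGaussian() * t;
            double dE = energy.applyAsDouble(candidate) - energy.applyAsDouble(current);
            // always accept improvements; accept worse states with probability exp(-dE/t)
            if (dE < 0 || r.nextDouble() < Math.exp(-dE / t)) current = candidate;
            // keep track of the best state seen so far
            if (energy.applyAsDouble(current) < energy.applyAsDouble(best)) best = current;
            t *= cooling; // geometric cooling, one of many possible schedules
        }
        return best;
    }
}
```

For instance, minimizing (x - 3)^2 starting from x = 10 drives the search toward the global minimum at 3; the tracked best, by construction, is never worse than the starting point.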
Why such a complicated scheme with probabilistic transitions from point to point? Why not simply move strictly from higher energy to lower? The answer lies in the local minima already mentioned, in which the search can get stuck. To escape them and find the global minimum, the energy of the system must occasionally be allowed to increase, while the overall tendency toward lower energy is preserved. This is the essence of simulated annealing.
Back to the points on the board
Now that the general scheme of the algorithm is clear, let us return to our problem. We will implement it in Java. First, we describe a board with points on it.
Board:
public class Board {
    private Point[] points;
    private int N;
    private int width;
    private int height;

    public Board(int n, int w, int h) {
        points = new Point[n];
        N = n;
        width = w;
        height = h;
    }

    // copy constructor: deep-copies the points so the boards are independent
    public Board(Board board) {
        points = new Point[board.points.length];
        for (int i = 0; i < points.length; i++)
            points[i] = new Point(board.points[i].getX(), board.points[i].getY());
        this.N = board.N;
        this.width = board.width;
        this.height = board.height;
    }

    public int getPointsCount() { return points.length; }
    public int getWidth() { return width; }
    public int getHeight() { return height; }
    public Point getPoint(int index) { return points[index]; }
    public void setPoint(int index, Point p) { points[index] = p; }
}
And the point:
public class Point {
    private int x;
    private int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Euclidean distance to another point
    public double dist(Point p) {
        return Math.sqrt((x - p.x) * (x - p.x) + (y - p.y) * (y - p.y));
    }

    // shift the point, staying inside the [0, xBorder) x [0, yBorder) rectangle
    public void move(int dx, int dy, int xBorder, int yBorder) {
        if (((x + dx) < xBorder) && ((x + dx) >= 0)) x += dx;
        if (((y + dy) < yBorder) && ((y + dy) >= 0)) y += dy;
    }

    public int getX() { return x; }
    public int getY() { return y; }
}
Since the order of the points is important for calculating the function we are looking for, we introduce their ordered set:
public class Polyline {
    public Point p[];
    public int N;

    public Polyline(int Num) {
        N = Num;
        p = new Point[N];
    }

    // total length of the polyline p[0] -> p[1] -> ... -> p[N-1]
    public double dist() {
        double d = 0;
        for (int i = 0; i < (N - 1); i++) {
            d += p[i].dist(p[i + 1]);
        }
        return d;
    }
}
Now we need the function F, which finds the shortest polyline for a given arrangement of points and computes its length.
public class MinPolyline {
    private int N;
    private double bestDist;
    private Polyline bestMinPolyl;
    private Board board;

    // F: over all orderings of the board's points, find the shortest polyline
    public Polyline F(Board b) {
        N = b.getPointsCount();
        board = b;
        int[] perm = new int[N];
        boolean used[] = new boolean[N];
        for (int i = 0; i < N; i++) {
            perm[i] = i + 1;
            used[i] = false;
        }
        bestDist = Double.MAX_VALUE;
        permute(0, perm, used);
        return bestMinPolyl;
    }

    // recursively enumerate all permutations of the points
    void permute(int first_index, int[] perm, boolean[] used) {
        if (first_index == N) {
            Polyline polyl = new Polyline(N);
            for (int i = 0; i < N; i++) {
                polyl.p[i] = board.getPoint(perm[i] - 1);
            }
            if (bestDist > polyl.dist()) {
                bestDist = polyl.dist();
                bestMinPolyl = polyl;
            }
            return;
        }
        for (int i = 1; i < N + 1; i++) {
            if (used[i - 1]) continue;
            perm[first_index] = i;
            used[i - 1] = true;
            permute(first_index + 1, perm, used);
            used[i - 1] = false;
        }
    }
}
Well, all the preparations have been made. Now we can begin to implement the algorithm itself.
Algorithm implementation
Returning to the general scheme, we find in it two distributions that depend on the coordinates and the temperature. They need to be defined somehow. In addition, we need a law for changing the temperature from iteration to iteration. Choosing these three functions fixes a specific variant of the annealing method. There are several modifications of the algorithm, differing in their distributions and cooling laws, each with its pros and cons: speed, the guarantee of finding the global minimum, implementation complexity.
For my task I chose a modification called Very Fast Annealing. To begin with, let us define the probability of accepting a new state:
h = 1 / (1 + exp(-ΔF / T)), where ΔF is the difference between the new value of the function and the best one found
or in code:
h = 1 / (1 + Math.exp(-(minPolyl.F(board).dist()- maxDist) / T));
Thus, the probability of accepting a position with a worse function value (higher energy) is lower the lower the temperature and the greater the gap between the current value and the best one. Next we establish the cooling law. A distinctive feature of Very Fast Annealing is that all coordinates are varied independently, and each coordinate even has its own temperature:
Ti(k) = T0i · exp(-ci · k^(1/D))
where D is the number of coordinates (i.e., 2 * n) and k is the step number. For simplicity, however, we will use a single shared temperature:
T = To * Math.exp(-decrement * Math.pow(i, (double)(1) / (2*(double)N)));
As you can see, the temperature decreases exponentially, so the procedure terminates quickly.
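A quick sanity check of this cooling law may be helpful. The class and method names below (CoolingSchedule, temperature) and the sample parameter values are illustrative, not part of the article's code; the formula itself mirrors the line above with D = 2N.

```java
// Illustrative check of the cooling law T(k) = T0 * exp(-c * k^(1/(2N)))
public class CoolingSchedule {
    public static double temperature(double t0, double decrement, int step, int n) {
        return t0 * Math.exp(-decrement * Math.pow(step, 1.0 / (2.0 * n)));
    }

    public static void main(String[] args) {
        // sample parameters: T0 = 100, decrement = 2, n = 5 points (D = 10)
        for (int k = 1; k <= 15000; k *= 10) {
            System.out.printf("k=%d T=%.4f%n", k, temperature(100.0, 2.0, k, 5));
        }
    }
}
```

The temperature falls monotonically with the step number, but because the exponent of k is only 1/(2N), the decay flattens out for large k; the decrement factor controls how aggressive the cooling is.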
Finally, the distribution for the new coordinates. The relative change of a single coordinate is characterized by the following value:
z = T · ((1 + 1/T)^(2α − 1) − 1)
where alpha is a value uniformly distributed on the interval [0, 1]. If the new coordinate falls outside its allowed range (in our case, off the board), the value is drawn again.
z = (Math.pow((1 + 1/T), (2 * alpha - 1)) - 1) * T;
x = board.getPoint(moveNum).getX() + (int)(z * border);
Now we have all the necessary distributions and functions; it remains to put everything together. The inputs are the board dimensions, the number of points, the initial and final temperatures, and the damping factor of the temperature exponent.
import java.util.Random;

public class AnnealingOptimizer {
    public int N;
    public int width;
    public int height;
    private Random r = new Random();

    public AnnealingOptimizer(int n, int width, int height) {
        N = n;
        this.width = width;
        this.height = height;
    }

    public Board optimize(double To, double Tend, double decrement) {
        Board board = new Board(N, width, height);
        double maxDist = 0;
        double T = To;
        double h;
        MinPolyline minPolyl = new MinPolyline();
        // random initial state: x is bounded by the width, y by the height
        for (int j = 0; j < N; ++j) {
            board.setPoint(j, new Point(r.nextInt(width), r.nextInt(height)));
        }
        Board bestBoard = new Board(board); // deep copy, not shared references
        int i = 1;
        while (T > Tend) {
            // generate a new state: move each point along both coordinates
            for (int moveNum = 0; moveNum < board.getPointsCount(); ++moveNum) {
                NewX(board, moveNum, T, width, true);
                NewX(board, moveNum, T, height, false);
            }
            if (minPolyl.F(board).dist() > maxDist) {
                bestBoard = new Board(board);
            } else {
                // accept a worse state with probability h, otherwise roll back
                h = 1 / (1 + Math.exp(-(minPolyl.F(board).dist() - maxDist) / T));
                if (r.nextDouble() > h) {
                    board = new Board(bestBoard);
                } else {
                    bestBoard = new Board(board);
                }
            }
            maxDist = minPolyl.F(board).dist();
            T = To * Math.exp(-decrement * Math.pow(i, 1.0 / (2.0 * N)));
            ++i;
        }
        return bestBoard;
    }

    // draw a new value for one coordinate of one point; redraw if it leaves the board
    private void NewX(Board board, int moveNum, double T, int border, boolean xCoord) {
        double alpha = r.nextDouble();
        double z = (Math.pow((1 + 1 / T), (2 * alpha - 1)) - 1) * T;
        int x;
        if (xCoord) {
            x = board.getPoint(moveNum).getX() + (int) (z * border);
            if ((x < border) && (x >= 0)) {
                board.getPoint(moveNum).move((int) (z * border), 0, width, height);
            } else {
                NewX(board, moveNum, T, border, xCoord);
            }
        } else {
            x = board.getPoint(moveNum).getY() + (int) (z * border);
            if ((x < border) && (x >= 0)) {
                board.getPoint(moveNum).move(0, (int) (z * border), width, height);
            } else {
                NewX(board, moveNum, T, border, xCoord);
            }
        }
    }
}
It should be noted that some modifications of simulated annealing provide a statistical guarantee of finding the optimal solution: with the right choice of parameters, the algorithm finds the optimum in reasonable time without enumerating all inputs. This implementation finds the answer for a 12x12 board and 5 points in approximately 15,000 iterations; recall that the total number of placements is 12^10 (one of 144 cells for each of 5 points, i.e. 144^5 = 12^10). Clearly, the trouble of tuning the parameters is worth such a win. Incidentally, in speed and accuracy simulated annealing is at least on par with the genetic algorithm, and more often ahead of it.
Finally, a visual result of the algorithm.

References:
A. Lopatin "Annealing Method"