
Optimization by example. Ant algorithm (ACS) vs Annealing Method. Part 2

I continue the "Optimization by Example" series of articles. This article compares two heuristic algorithms on the well-worn symmetric traveling salesman problem. Today we dig deeper into this topic and analyze a specific modification of the ant algorithm.



This article does not prove which method is better in general, since both algorithms have many existing and many more potential modifications. Here a winner is determined only for a specific modification with specific parameters. The purpose of this article is to broaden the reader's horizons in the field of discrete optimization and, I hope, to invite suggestions for improving both algorithms.

The chosen modification of the ant algorithm is ACS (Ant Colony System), proposed by Marco Dorigo and Luca Gambardella in 1997. The main differences from AS (Ant System) are:
1) a balance is struck between greedily choosing the most attractive city and the probabilistic choice used in ordinary AS:

s = arg max {[τ(r,u)]^α · [ω(r,u)]^β} over u ∈ Jk(r), if q <= q0 (Formula 1)

Otherwise, the transition is chosen as in AS.

where τ(r,u) is the pheromone level on edge (r,u); ω(r,u) is the inverse of the distance on edge (r,u); β is a tunable parameter, and the higher it is, the more the algorithm favors cities at a smaller distance; α equals 1; q is a randomly chosen number; q0 is the probability that the transition from one vertex to another follows Formula 1; Jk(r) is the set of cities ant k has not yet visited.
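As an illustration, this pseudo-random proportional rule can be sketched in Python (the article's sources are in MATLAB; all names here are illustrative, and pheromone and inverse-distance values are assumed to be given as nested mappings):

```python
import random

def choose_next_city(r, unvisited, tau, eta, beta=2.0, q0=0.9):
    """ACS transition rule, a minimal sketch.

    tau[r][u] - pheromone level on edge (r, u)
    eta[r][u] - inverse distance on edge (r, u)
    """
    # attractiveness of each candidate city (alpha is fixed at 1 in ACS)
    attract = {u: tau[r][u] * eta[r][u] ** beta for u in unvisited}
    if random.random() <= q0:
        # exploitation: greedily take the most attractive city (Formula 1)
        return max(attract, key=attract.get)
    # exploration: roulette-wheel selection, as in plain AS
    total = sum(attract.values())
    threshold = random.random() * total
    acc = 0.0
    for u, w in attract.items():
        acc += w
        if acc >= threshold:
            return u
    return u  # numerical safety net for floating-point round-off
```

With q0 close to 1 the search is mostly greedy; lowering q0 shifts it toward the exploratory AS behavior.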

2) besides the global pheromone update, a local one also takes place: the pheromone level changes as each ant passes an edge during an iteration (closer to how ants behave in their natural habitat):

τ(r,s) = (1 - p) · τ(r,s) + p · τ0 (Formula 2)

where p is the local update rate and τ0 is the initial pheromone value, computed as τ0 = (n · Lnn)^(-1), where Lnn is an approximate tour length that can be obtained by another optimization method (in the original paper [1], the length of a nearest-neighbor tour).

3) in the global pheromone update, pheromone is added only to the edges of either the best route found since the start of the algorithm (global best) or the best route of the current iteration (iteration best). I apply it to the global-best edges:

τ(r,s) = (1 - e) · τ(r,s) + e · (Lbest)^(-1) (Formula 3)

where e is the global update rate and Lbest is the best (shortest) route length, either at the k-th iteration or the global best.
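Both updates are simple one-liners per edge. A minimal Python sketch, assuming the pheromone matrix is stored as a list of lists and the tour is a list of city indices (names are illustrative, not from the article's sources):

```python
def local_update(tau, r, s, p=0.1, tau0=1e-4):
    # Formula 2: applied to edge (r, s) as each ant crosses it
    tau[r][s] = (1 - p) * tau[r][s] + p * tau0
    tau[s][r] = tau[r][s]  # the problem is symmetric

def global_update(tau, best_route, best_len, e=0.1):
    # Formula 3: reinforce only the edges of the global-best tour
    for r, s in zip(best_route, best_route[1:] + best_route[:1]):
        tau[r][s] = (1 - e) * tau[r][s] + e * (1.0 / best_len)
        tau[s][r] = tau[r][s]
```

The local update gently pulls traversed edges back toward τ0, discouraging all ants from piling onto the same tour within one iteration, while the global update concentrates pheromone on the best tour found so far.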

These modifications gave the algorithm a significant performance boost. The annealing method is the same as in the previous article, except for a speed-up. Thanks to reader zartdinov for a simple and very effective suggestion for increasing its speed.

Now we will compare the algorithms on instances with known coordinates, such as Oliver30, Eil51, and Eil101, for which the best solutions are known. We will also derive approximate formulas for how each algorithm's runtime depends on the number of cities and, finally, try to determine today's winner with all these factors in mind.

Let's start with the Oliver30 problem; the best known solution is 423.7406.

ACS Parameters:


* - parameters that depend on the problem size

A few words about the number of ants. It would be a mistake to believe that the more ants, the better. With a large colony, performance drops significantly. First, the running time of the algorithm increases greatly. Second, a pheromone surplus arises, leading to the phenomenon known in nature as the ant mill (the "circle of death"): the algorithm gets stuck at a local optimum.

There is no exact way to determine the number of ants in a colony; there is an approximate estimate, as in [1]. I simply looked for the point at which adding ants no longer improved the solution while removing them worsened the result. Put simply, I experimented with a single colony over a fixed number of iterations: as soon as the average pheromone level began to rise, I took that as the basis for the optimal number of ants.

Now, the results for the Oliver30 problem:


A couple of charts:

The fourth graph shows that the algorithm keeps searching for alternative routes instead of settling on what it has already found. If the number of ants is increased to 50-100, the spread of the average tour length per generation shrinks to within 20-30, which leads to getting stuck.

An exhaustive search for the symmetric traveling salesman problem requires (n-1)!/2 route evaluations, or
4 420 880 996 869 850 977 271 808 000 000 for this problem.
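This number is easy to verify:

```python
import math

n = 30  # Oliver30
# number of distinct tours in the symmetric TSP: (n - 1)! / 2
tours = math.factorial(n - 1) // 2
print(tours)  # 4420880996869850977271808000000
```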

A 100% result; excellent work, ACS.

Let's look at the annealing method.

Parameters:


* - parameters that depend on the problem size

Results:


On 30 cities, ACS wins by a large margin in both time and quality. I tested the two algorithms not only on this instance but also on 30 others, and ACS consistently beats the simple annealing method.

Now the Eil51 problem: 51 cities, best known solution 426; 7000 iterations, 9 ants.


A couple of charts:

A regression line has been added to the 4th graph. In many foreign studies, the Eil51 problem was tested with a much larger number of iterations; maybe that is why the global optimum was not found here. Frankly, I also ran it with many iterations, and the best result found was 427.

Taking the opportunity, let's watch "in real time" how the pheromone changes with the number of iterations; I rather liked this picture.

This is without the global update and with overall pheromone evaporation added, as in AS, for a more visible effect.


Let's look at the annealing method at 2,500,000 iterations:


Quite a good result, but ACS is still ahead.

Now the Eil101 problem: 101 cities, best known distance 629; 9000 iterations, 11 ants.


A couple of charts:

Here we have 9000 iterations, a rather modest number for a 100-city problem; still, we compare this result with annealing over the same time span, at 7,000,000 iterations.


A fairly stable result, better than what ACS showed. But ACS makes the search more controllable than plain annealing (although even the latter could at least be given a stopping criterion; more on that in upcoming articles). So as of today, up to 100 vertices ACS wins on all counts. Moreover, the annealing method most likely cannot be sped up much further, whereas the ACS code can still be optimized (as it stands, the code is poorly optimized).

I repeat that the main advantage of ACS (and other modifications of the ant algorithm) in searching for the global optimum is that, as the number of iterations goes to infinity, the probability of finding the global best tends to 1. The question, of course, is time: fast but approximate, or slow but sure. So I propose plotting the runtime (for fixed iteration counts on each algorithm) against the number of cities.

The number of ACS iterations is 10,000, with 10 ants;
The number of SA iterations is 7,000,000.


Wow! We see that the annealing method takes constant time regardless of the number of vertices. By fitting a regression, we estimate the time complexity of ACS:

t = 0.0044939 · x^2 + 0.72097 · x + 3.8225 (Formula 4)

where x is the number of cities, t is the ACS runtime
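As a quick sanity check, the fitted formula can be evaluated at the instance sizes measured below; for example, at x = 200 it predicts a runtime in the same ballpark as the measured 311.54 s (the function name here is illustrative):

```python
def acs_time(x):
    # Formula 4: fitted ACS runtime in seconds for x cities
    return 0.0044939 * x**2 + 0.72097 * x + 3.8225

print(acs_time(200))  # about 327.8 s, close to the measured 311.54 s
```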

If up to 100 vertices the two algorithms were approximately equal in both time and quality (with ACS slightly ahead), then we can very roughly assume that on 1,000 cities the results of 10,000 ACS iterations and 7,000,000 SA iterations should also be similar.

Let's check.

ACS 200 cities (random cities), time - 311.54 s.


SA 200 cities (the same as above), time 103.19 s.


They were launched one after another. What is the probability that both show the same result? A curious moment: did they perhaps even match to the hundredths? But that is something neither you nor I will ever know :)

Over a series of tests, up to about 300 vertices the results come out roughly equal; beyond that, the advantage shifts to the annealing method.

With a time limit and more than a hundred vertices, the simple annealing method beats ACS. I repeat that this is plain ACS, not MMAS, not ACS with local search, nor another modification.

But when you need to find the global optimum, few methods can compete with ant algorithms.

Now a few words about speeding up the simulated annealing method.
As reader daiver19 suggested, there is, of course, no need to recalculate the whole route at every iteration.

Take a conditional route:

1 2 3 4 5 6 7 8 9

Two positions are chosen at random, say (2, 5).

Now it is enough to compute the removed edges (1,2) and (5,6) and compare them with the added edges (1,5) and (2,6).

However, this will not work for the asymmetric problem.
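The same bookkeeping can be sketched in Python (a sketch under the assumption of a symmetric distance matrix; indices here are 0-based, unlike the 1-based example above, and the names are illustrative):

```python
def reversal_delta(route, dist, i, j):
    """Change in tour length from reversing route[i..j], with i < j.

    Only the two boundary edges change on a symmetric instance,
    so the move is evaluated in O(1) instead of recomputing the tour.
    """
    n = len(route)
    a, b = route[i - 1], route[i]        # edge entering the segment
    c, d = route[j], route[(j + 1) % n]  # edge leaving the segment
    removed = dist[a][b] + dist[c][d]
    added = dist[a][c] + dist[b][d]
    return added - removed
```

If the returned delta is negative, the reversal shortens the tour; otherwise it can still be accepted with the annealing probability exp(-delta / T).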

Thanks to this, the number of cities no longer affects the algorithm's runtime. In addition, I removed the fliplr calls, which ate a decent amount of time, and generated the arrays of random numbers in advance. All this sped the algorithm up severalfold compared to the previous article.

One of the readers was also interested in the result of swapping two vertices instead of inverting the path between them. Let's look at the result on the 101 cities of Eil101 over 1,000,000 iterations.


Path inversion is much better.

There is much more I would like to show and tell, but the article is already long enough. In upcoming publications we will "tinker" with the annealing method and try to make it more controllable. We may also look at other modifications of the ant algorithm, dive a bit deeper, and then move on to genetic algorithms.

Now I propose to watch how the undisputed leader (so far, up to 100 vertices) converges to the globally best path.


Also, let's look at the leader in time versus the number of vertices, SA, which gives an approximate solution for 1000 vertices in 4,000,000 iterations in 34 seconds. If in the previous article 2,000,000 iterations for 500 vertices took 90 seconds, now it takes only 14.6!


That's about it. The sources with comments are attached. I tried to keep a balance between code readability and speed. I advise looking through them even if you are not familiar with MATLAB at all, as this will greatly help in grasping the essence of the algorithms.

Simulated annealing. Full code with comments
% Simulated annealing for the symmetric traveling salesman problem
%--------------------------------------------------------------------------
tic
% clearvars -except cities
clearvars
%--------------------------------------------------------------------------
% number of iterations
m = 1000000;
%--------------------------------------------------------------------------
% initial temperature
Tstart = 100000;
% final temperature
Tend = 0.1;
% current temperature
T = Tstart;
% current route length
S = inf;
% number of cities
n = 500;
g = 1;
%--------------------------------------------------------------------------
% distance matrix
dist = zeros(n,n);
%--------------------------------------------------------------------------
% random city coordinates (x,y)
cities = rand(n,2)*100;
% random numbers for the acceptance test, generated in advance
RANDONE = rand(m,1);
% random position pairs for the reversal moves, generated in advance
D = randi(n,m,2);
% initial random route
ROUTE = randperm(n);
% fill the distance matrix
for i = 1:n
    for j = 1:n
        % dist (Euclidean distances)
        dist(i,j) = sqrt((cities(i,1) - cities(j,1))^2 + ...
            (cities(i,2) - cities(j,2))^2);
    end
end
% main loop over the iterations
for k = 1:m
    % boundary-edge length of the candidate route
    Sp = 0;
    % ROUTEp - candidate route
    ROUTEp = ROUTE;
    % positions between which the segment is reversed
    transp = D(k,[1,2]);
    % length of the two edges removed by the move (with wrap-around)
    if transp(1) < transp(2)
        if transp(1) ~= 1 && transp(2) ~= n
            S = dist(ROUTE(transp(1)-1),ROUTE(transp(1))) + ...
                dist(ROUTE(transp(2)),ROUTE(transp(2)+1));
        elseif transp(1) ~= 1 && transp(2) == n
            S = dist(ROUTE(transp(1)-1),ROUTE(transp(1))) + ...
                dist(ROUTE(transp(2)),ROUTE(1));
        elseif transp(1) == 1 && transp(2) ~= n
            S = dist(ROUTE(end),ROUTE(transp(1))) + ...
                dist(ROUTE(transp(2)),ROUTE(transp(2)+1));
        end
    else
        if transp(2) ~= 1 && transp(1) ~= n
            S = dist(ROUTE(transp(2)-1),ROUTE(transp(2))) + ...
                dist(ROUTE(transp(1)),ROUTE(transp(1)+1));
        elseif transp(2) ~= 1 && transp(1) == n
            S = dist(ROUTE(transp(2)-1),ROUTE(transp(2))) + ...
                dist(ROUTE(transp(1)),ROUTE(1));
        elseif transp(2) == 1 && transp(1) ~= n
            S = dist(ROUTE(end),ROUTE(transp(2))) + ...
                dist(ROUTE(transp(1)),ROUTE(transp(1)+1));
        end
    end
    %----------------------------------------------------------------------
    % reverse the segment and compute the two new boundary edges
    if transp(1) < transp(2)
        ROUTEp(transp(1):transp(2)) = ROUTEp(transp(2):-1:transp(1));
        if transp(1) ~= 1 && transp(2) ~= n
            Sp = dist(ROUTEp(transp(1)-1),ROUTEp(transp(1))) + ...
                dist(ROUTEp(transp(2)),ROUTEp(transp(2)+1));
        elseif transp(1) ~= 1 && transp(2) == n
            Sp = dist(ROUTEp(transp(1)-1),ROUTEp(transp(1))) + ...
                dist(ROUTEp(transp(2)),ROUTEp(1));
        elseif transp(1) == 1 && transp(2) ~= n
            Sp = dist(ROUTEp(end),ROUTEp(transp(1))) + ...
                dist(ROUTEp(transp(2)),ROUTEp(transp(2)+1));
        end
    else
        ROUTEp(transp(2):transp(1)) = ROUTEp(transp(1):-1:transp(2));
        if transp(2) ~= 1 && transp(1) ~= n
            Sp = dist(ROUTEp(transp(2)-1),ROUTEp(transp(2))) + ...
                dist(ROUTEp(transp(1)),ROUTEp(transp(1)+1));
        elseif transp(2) ~= 1 && transp(1) == n
            Sp = dist(ROUTEp(transp(2)-1),ROUTEp(transp(2))) + ...
                dist(ROUTEp(transp(1)),ROUTEp(1));
        elseif transp(2) == 1 && transp(1) ~= n
            Sp = dist(ROUTEp(end),ROUTEp(transp(2))) + ...
                dist(ROUTEp(transp(1)),ROUTEp(transp(1)+1));
        end
    end
    %----------------------------------------------------------------------
    % accept the candidate if it is shorter
    if Sp < S
        ROUTE = ROUTEp;
        iter = k;
    else
        % otherwise accept with the annealing probability
        P = exp((-(Sp - S)) / T);
        if RANDONE(k) <= P
            ROUTE = ROUTEp;
        end
    end
    % cooling schedule
    T = Tstart / k;
    % stop when the final temperature is reached
    if T < Tend
        break;
    end;
end
% plot the resulting route
citiesOP(:,[1,2]) = cities(ROUTE(:),[1,2]);
plot([citiesOP(:,1);citiesOP(1,1)],[citiesOP(:,2);citiesOP(1,2)],'-r.')
msgbox ('Done!')
% keep only the results
clearvars -except cities ROUTE S iter
% elapsed time
toc



Ant algorithm. Full code with comments
% Ant Colony System (ACS) for the symmetric traveling salesman problem
%--------------------------------------------------------------------------
tic
% clearvars -except cities
clearvars
%--------------------------------------------------------------------------
% number of iterations (generations)
age = 2000;
% number of ants in the colony
countage = 10;
% number of cities
n = 50;
%--------------------------------------------------------------------------
% a - pheromone weight; at 0 the choice ignores the pheromone
a = 1;
% b - distance weight; at 0 the choice ignores the distance
b = 2;
% global pheromone update rate
e = 0.1;
% local pheromone update rate
p = 0.1;
% pheromone constant
Q = 1;
% probability of the greedy choice instead of the AS rule
q = 0.9;
% initial pheromone level
ph = Q/(n*2000);
%--------------------------------------------------------------------------
% distance matrix
dist = zeros(n,n);
% inverse distance matrix
returndist = zeros(n,n);
% routes of the ants at the current iteration
ROUTEant = zeros(countage,n);
% route lengths of the ants at the current iteration
DISTant = zeros(countage,1);
% best distance at each iteration
bestDistVec = zeros(age,1);
% best distance found so far
bestDIST = inf;
% closed route of the current ant
ROUTE = zeros(1,n+1);
% random permutation (alternative choice of start cities)
RANDperm = randperm(n);
% attractiveness of the candidate cities
P = zeros(1,n);
% preallocations
val = zeros(1);
getcity = zeros(1);
indexP = zeros(1);
minDISTiterration = zeros(1);
%--------------------------------------------------------------------------
% random city coordinates (x,y)
cities = rand(n,2)*100;
% initial pheromone matrix
tao = ph*(ones(n,n));
tao(logical(eye(size(tao)))) = 0;
% fill the distance and inverse distance matrices
for i = 1:n
    for j = 1:n
        % dist (Euclidean distances)
        dist(i,j) = sqrt((cities(i,1) - cities(j,1))^2 + ...
            (cities(i,2) - cities(j,2))^2);
        % nn (inverse distances)
        if i ~= j
            returndist(i,j) = 1/sqrt((cities(i,1) - cities(j,1))^2 + ...
                (cities(i,2) - cities(j,2))^2);
        end
    end
end
% main loop
for iterration = 1:age
    % loop over the ants
    for k = 1:countage
        % ****************** choice of the start city ******************
        % each ant starts from a random city
        ROUTEant(k,1) = randi([1 n]);
        % alternative: each ant starts from its own city
        % ROUTEant(k,1) = RANDperm(k);
        % alternative: all ants start from city 1
        % ROUTEant(k,1) = 1;
        % alternative: after the first iteration each ant starts from
        % the city where it finished its previous tour
        % if iterration == 1
        %     ROUTEant(k,1) = randi([1 n]);
        %     % ROUTEant(k,1) = RANDperm(k);
        %     % ROUTEant(k,1) = 1;
        % else
        %     ROUTEant(k,1) = lastROUTEant(k);
        % end
        % *************************************************************
        % build the rest of the route city by city
        for s = 2:n
            % current city
            ir = ROUTEant(k,s-1);
            % attractiveness of the candidates: tao^a*(1/S)^b,
            % where 1/S is returndist; the vectorized form
            % replaces the loop:
            % for c = 1:n
            %     P(1,c) = tao(ir,c).^a * returndist(ir,c).^b;
            % end
            P = tao(ir,:).^a .* returndist(ir,:).^b;
            % zero out the cities already visited by ant k
            % so they cannot be chosen again
            P(ROUTEant(k,1:s-1)) = 0;
            % greedy choice or the AS roulette wheel
            RANDONE = rand;
            if RANDONE <= q
                [val, getcity] = max(P);
            else
                % roulette-wheel selection (probabilities sum to 1)
                P = P ./ sum(P);
                getcity = find(cumsum(P) >= RANDONE, 1, 'first');
            end
            % write the s-th city of the k-th ant
            ROUTEant(k,s) = getcity;
        end
        % close the route of the k-th ant
        ROUTE = [ROUTEant(k,1:end),ROUTEant(k,1)];
        % route length
        S = 0;
        % length of the k-th ant's route
        for i = 1:n
            S = S + dist(ROUTE(i),ROUTE(i+1));
        end
        % store the length for ant k
        DISTant(k) = S;
        % update the global best
        if DISTant(k) < bestDIST
            bestDIST = DISTant(k);
            bestROUTE = ROUTEant(k,[1:end,1]);
            iter = iterration;
        end
        % remember the last city of ant k (used by the commented-out
        % start-city variant above)
        % lastROUTEant = ROUTEant(1:end,end);
        % local pheromone update along the ant's route
        for tL = 1:n
            xL = ROUTE(tL);
            yL = ROUTE(tL+1);
            % Formula 2
            tao(xL,yL) = (1-p)*tao(xL,yL) + p*ph;
            tao(yL,xL) = (1-p)*tao(yL,xL) + p*ph;
        end
    end
    % -------------------------- global update --------------------------
    % keep the pheromone from vanishing to exactly zero
    tao(tao < 2.500000000000000e-150) = 2.500000000000000e-150;
    % reinforce the edges of the global-best route
    for t = 1:n
        xG = bestROUTE(t);
        yG = bestROUTE(t+1);
        % Formula 3
        tao(xG,yG) = tao(xG,yG) + e*(Q/bestDIST);
        tao(yG,xG) = tao(yG,xG) + e*(Q/bestDIST);
    end
end
% plot the best route
citiesOP(:,[1,2]) = cities(bestROUTE(:),[1,2]);
plot([citiesOP(:,1);citiesOP(1,1)],[citiesOP(:,2);citiesOP(1,2)],'.r-')
disp (num2str(bestDIST))
msgbox ('Done!')
clearvars -except cities bestDIST bestROUTE iter
toc



Thank you for your attention. Until next time.

Articles by foreign authors:

[1] M. Dorigo, L. M. Gambardella, "Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem" // IEEE Transactions on Evolutionary Computation, Vol. 1, No. 1, pp. 53-66, 1997.

[2] T. Stützle, H. Hoos, "MAX-MIN Ant System and Local Search for the Traveling Salesman Problem" // Proc. IEEE International Conference on Evolutionary Computation, pp. 309-314, 1997.

[3] T. Stützle, M. López-Ibáñez, P. Pellegrini, M. Maur, M. de Oca, M. Birattari, M. Dorigo, "Parameter Adaptation in Ant Colony Optimization" // Technical Report, IRIDIA, Université Libre de Bruxelles, 2010.

Source: https://habr.com/ru/post/314056/
