πŸ“œ ⬆️ ⬇️

Optimization of combinatorial algorithms

I do not know whether it was worth making a separate note on the optimization of already published algorithms or just needed to add the revised code to the old article . I decided that the new one would be more interesting. I must immediately say that this note is not intended for professional programmers. This, on the one hand, will go about accepting the optimization of the C code for generating all the permutations, on the other hand, about the visible speed improvements that have been obtained compared to the C code from my dusty article . The main task: to explain some of the techniques for reducing the code to non-specialists who have to deal with algorithms.

About the first

In the algorithm for generating all permutations after exchanging values ​​in an array, it is necessary to wrap a part of this array β€” the tail. In the first implementation, 4 fairly costly techniques are used for this, which boil down to: a) splitting an array into two - 2 operations, b) turning over one of the resulting arrays. c) gluing arrays into one.
')
If you express it, for example, in PHP, you get the following construction:

$a=array_merge(array_reverse(array_slice($a, 0, $i)),array_slice($a, $i)); 

If the reader got acquainted with the article by reference, then he probably noticed that this line of code is in fact a complete analog of the operations that are used in the C code.
However, there the operations are separated in the function, which is very confusing.

This PHP expression is also quite difficult to read (this does not apply to haskell programmers), but it has one important plus - this is unambiguity in understanding the necessary actions for optimization. After a short meditative contemplation of this line, it begins to be thought of as a single operation, for which you can try to find a simpler analogue, and most importantly a faster one. For PHP, I got the following:

 $c=$a; $z=0; for ($i-=1; $i > -1; $i--) $a[$z++]=$c[$i]; 

I had to enter into the algorithm another copy of the array and a relatively simple loop to redefine the part of the array, and one variable z was also added.

I will comment on this section: 1) after the exchange of elements, the array C is equal to A; 2) the for loop starts from the index on which the exchange is performed (i) to the zero index, i decreases.

The variable z, on the contrary, increases and parts of array A are assigned elements from array C, but in the reverse order. Thus we get the desired result - array A with an inverted part. In the implementation of the variable i is subtracted 1, so as not to go beyond.

We obtained an optimization method of three steps, which consists of: 1) encoding the full algorithm, i.e. how it is conceived and displayed on paper, with all redundant operations; 2) in the search and reduction of some disparate operations in one line 3) to search for more simple analogues for operations in this line, if possible.

The C code is quite compact:

Watch
 #include <stdio.h> #include <string.h> int main() { char a[] = "4321"; char d[5]; int fact = 24; int i, j, z; int y=0; char c; while (y != fact) { printf("%s\n", a); i=1; while(a[i] > a[i-1]) { i++; } j=0; while(a[j] < a[i]) { j++; } c=a[j]; a[j]=a[i]; a[i]=c; strcpy(d, a); z=0; for (i-=1; i > -1; i--) a[z++]=d[i]; y++; } } 


It turned out 4 cycles and the exit condition - when the variable reaches the factorial of the number of all permutations for n. I deliberately threw away the comparison of arrays to speed up the work a bit.

Update
As correctly noted in the wataru comments, the construction of the turn of a part of the array can also be replaced by a simpler one, as a result, we have code in 4 cycles on C only using the main library:

Edited code
 #include <stdio.h> int main() { char a[] = "4321"; int fact = 24; int i, j; int y=0; char c; while (y != fact) { printf("%s\n", a); i=1; while(a[i] > a[i-1]) i++; j=0; while(a[j] < a[i])j++; c=a[j]; a[j]=a[i]; a[i]=c; i--; for (j = 0; j < i; i--, j++) { c = a[i]; a[i] = a[j]; a[j] = c; } y++; } } 



About speed

Earlier, I compared my first implementation with a recursive algorithm using this link and the result was:

The recursive algorithm gave the running time for n = 11:

real 2m9.213s
user 0m2.920s
sys 0m26.290s

The algorithm from the first article issued for n = 11:

real 2m15.510s
user 0m19.750s
sys 0m34.300s

Current version for n = 11

real 1m46.459s
user 0m3.130s
sys 0m24.000s

Source: https://habr.com/ru/post/330690/


All Articles