
Merge sort without using additional memory

For a long time I thought it was impossible to write a merge sort for an array that uses no additional memory while keeping the running time at O(N*log(N)). So when karlicos shared a link to a description of such an algorithm, it caught my interest. A web search showed that people know about the algorithm, but hardly anyone is interested in it: it is considered complicated and inefficient. Though perhaps they mean some "stable" version of the algorithm; the unstable one, it seems, nobody needs anyway.

But I still decided to try.



Merge in linear time



The idea of the algorithm is quite simple. All we need is to merge two ordered parts of the same array in O(N) comparisons and element exchanges. It is done like this:
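The core primitive behind such merges is combining two sorted runs through a scratch region of the same array using only swaps, so the scratch region's contents are permuted but never lost. A minimal sketch of that primitive (hypothetical code for illustration, not the author's; the function name `merge_with_buffer` is my own):

```c
/* Merge two sorted runs a[0..la) and a[la..la+lb), using a scratch
 * region buf of length >= la (which may live elsewhere in the same
 * array). Only swaps are used, so buf keeps its elements (permuted). */
static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

void merge_with_buffer(int *a, int la, int lb, int *buf)
{
    /* Move the left run into the buffer by swapping. */
    for (int i = 0; i < la; i++) swap_int(&a[i], &buf[i]);

    int i = 0, j = la, k = 0;  /* i: buffer, j: right run, k: output */
    while (i < la && j < la + lb) {
        if (buf[i] <= a[j]) swap_int(&a[k++], &buf[i++]);
        else                swap_int(&a[k++], &a[j++]);
    }
    while (i < la) swap_int(&a[k++], &buf[i++]);  /* drain the buffer */
    /* Any remaining right-run elements are already in place. */
}
```

The invariant that makes this work: positions between k and j always hold former buffer contents, so every swap outputs one merged element and parks one buffer element, and nothing is overwritten.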

The description is rather long, but it can be followed. The number of exchanges per merge is about 5*N, and the number of comparisons is about 6*N (it depends on the length of the remainder). For a complete sort these numbers are multiplied by log(N), which adds up to a lot.

Adapting the algorithm for sorting



To make things easier for ourselves, and to make the algorithm work more efficiently, note the following.


Armed with these observations, we write a program (for now only for an array of type int[]). First we sort and merge fragments whose lengths are powers of two, going strictly from left to right so that free space remains on the right. When we reach the buffer, we merge the not-yet-merged, but already sorted, fragments. We sort the buffer plus the remainder, merge the result with the rest of the array, sort the newly formed buffer, and the array is sorted. The algorithm comes out to about a hundred lines. Admittedly, it is bulky. What about efficiency?
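The overall "merge power-of-two fragments, left to right" structure is that of a bottom-up merge sort. A sketch of that skeleton, using an ordinary auxiliary array for the merge step (in the author's version the auxiliary array is replaced by the in-array buffer; the names `merge_runs` and `bottom_up_sort` are mine):

```c
/* Merge sorted runs a[lo..mid) and a[mid..hi) through tmp. */
void merge_runs(int *a, int lo, int mid, int hi, int *tmp)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    for (k = lo; k < hi; k++) a[k] = tmp[k];
}

/* Bottom-up driver: merge runs of length 1, 2, 4, ... left to right. */
void bottom_up_sort(int *a, int n, int *tmp)
{
    for (int w = 1; w < n; w *= 2)
        for (int lo = 0; lo + w < n; lo += 2 * w) {
            int hi = lo + 2 * w;
            if (hi > n) hi = n;
            merge_runs(a, lo, lo + w, hi, tmp);
        }
}
```

The in-place algorithm keeps exactly this outer loop structure; only the merge step changes, trading the O(N) auxiliary array for the swap-based buffer trick at the cost of more exchanges.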

Honestly, I hoped it would lose to the standard qsort by no more than a factor of 3. But a comparison under real conditions (on arrays up to 10^8 elements long) showed that the algorithm beats qsort by about 10% in the number of comparisons, and by a factor of 1.2 to 1.3 in total running time! Perhaps this is because the icmp function (which compares two integers at given addresses) gets inlined, though the inlined code turns out pretty awful (I checked).

In general, things are not as bad as people said.

And what was the error in the algorithm's description? The problem is that if an element that must end up in one of the first R positions after sorting lands in the buffer or in the remainder, it will not reach its place. To fix this, you have to track where the element that was in the cell with index R ended up during the last merge, and extend the initial sort of the leading fragment up to that point (its length can be somewhat larger than R).

UPD: The algorithm as I described it here turned out to have another error, related to sorting the pieces of the array. If the source array contains many equal elements, different pieces of the same array fragment can have equal first elements. And if the relative order of these pieces is disturbed during sorting, the result of sorting the whole array can come out incorrect.

To combat this effect, when sorting the pieces you have to compare their first elements, and if those are equal, compare their last elements: that is, sort the pieces lexicographically by these pairs of elements.
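The fixed piece ordering can be sketched as a small comparator over two already-sorted pieces, using the (first element, last element) pair lexicographically (a hypothetical helper, `piece_less`, not the author's code):

```c
/* Return 1 if sorted piece p (length plen) should come before sorted
 * piece q (length qlen): compare first elements, break ties by last. */
int piece_less(const int *p, int plen, const int *q, int qlen)
{
    if (p[0] != q[0]) return p[0] < q[0];
    return p[plen - 1] < q[qlen - 1];
}
```

With equal first elements, the piece whose last element is smaller must come first, which restores a correct overall order even when the array is full of duplicates.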

Source: https://habr.com/ru/post/138146/
