
When it comes to efficient sorting algorithms, the erudite habrauser will immediately recall the unfading “
quick sort ”, the new-fashioned “
Tim sort ”, the legendary “
merge sort ” and even the tricky “
introspective sort ”.
Without questioning the effectiveness of the above methods, I suggest you sorting, which, under certain input conditions, easily does any other algorithm in terms of speed.
We are talking about
FlashSort , which handles very large arrays with evenly distributed data.
The method in 1998 was presented by the German scientist
Karl-Dietrich Neubert . His scientific activity is devoted not so much to the theory of algorithms as to solid state physics, biophysical systems, and elementary particle physics. Analyzing the results of experiments, Neubert came to the conclusion that it is logical to sort large arrays of statistical data by resorting to the theory of probability.
')
The essence
The principle of operation is easy to explain with a specific example.
Suppose there is a large array of
n elements whose values range from
1 to
100 . If we meet an element with a value of
50 , then it is reasonable to assume that its rightful place is in the middle of the array. The situation is similar with other elements:
5 , probably, should be closer to the beginning of the structure,
95 it is appropriate to push almost to the end. As a result of such careless manipulations, we quickly obtain an
almost sorted array .
Having scrambled up the elements, it remains to apply some method that quickly complements the unordered (
sorting inserts fit perfectly).
Matan
The main task is reduced to the fact that all the elements of the array, depending on their values, need to be distributed into several classes. The smallest numbers are in the first class, the largest - in the last, the remaining numbers - in intermediate groups.
If we take
m for the number of classes and also know the minimum
A min and maximum
A max elements in the array, then the class number for the element
A i is calculated using the following formula:

From the point of view of probability theory, class
K is nothing more than a
quantile indicating the place of the element
A i in our distribution model.
Square brackets in the expression mean the integer part. Classes will thus be renumbered from
1 to
m . In the additional array
Q [ 1..m
], the position
Q [ K
] contains the number of elements from the main array belonging to the corresponding class.
Then, using an additional array, we redistribute the elements of the main array, inserting them into
almost our places. Algorithmic nuances - in the code of the java-program below.
Visualization
Duration more than 2 minutes. It will be clear why the sorting is dubbed "flickering."

Time effectiveness
As already mentioned, the algorithm is very effective on large arrays with uniformly distributed data. In this case,
FlashSort with an average time complexity
O (n) is significantly ahead of both
QuickSort and other efficient sorting methods.
In other situations, things are not so brilliant. On short arrays, on long arrays with data that are not evenly distributed in size, the “flashing sorting” shows at least acceptable, but much more modest results.
It also does not make sense to apply the main algorithm to non-fully ordered arrays, it is easier to go straight to the
sorting by inserts . An almost sorted array for
FlashSort is a degenerate case, and in particularly unsuccessful situations, the worst time complexity is
O (n 2 ) .
Memory efficiency
The method is not very memory demanding, although it requires resources for a new array, the number of elements in which is the number of classes. The question is rather actual: what
m to take? The more classes - the better the distribution will be, but the higher the price in the form of additional memory. In most cases, the acceptable ratio "price / quality" gives the formula
m ≈ 0.42 n ,
derived by Neubert empirically.
If the range of possible values of the elements of the main array is small enough, then as
m it is possible (and even desirable) to take the whole distance between the minimum and maximum elements. In such cases, if the condition
“1 value = 1 class” is observed, the speed of sorting over time achieves the best result
O (n) at insignificant expenses for additional memory.
Implementation
FlashSort in Java.import java.util.Arrays; public class FlashSortAlgorithm extends SortAlgorithm { void sort(int[] a) throws Exception { FlashSort(a);
The code is taken
from here , my comments.
A collection of implementations on various YPs (Ada, C, Basic, Fort, Fortran, Java, Pascal) can be found on
the Professor’s website .
Algorithm Characteristics
Title | FlashSort (Flashing sort; Flashing sort; Flashing sort) |
---|
Author | Karl-Dietrich Neubert |
---|
Year of publication | 1998 |
---|
Class | Sort by distribution |
---|
Resilience | Unstable |
---|
Comparisons | No comparisons |
---|
Time complexity | the worst | O (n 2 ) |
---|
average | O (n + m) |
---|
the best | O (n) |
---|
Memory difficulty | Total | O (n + m) |
---|
additional data | O (m) |
---|
Links
FlashSort in English WikipediaKarl-Dietrich Neubert, personal site.FlashSort on Dr. Dobb's journalJava visualization