📜 ⬆️ ⬇️

Young's Binary Tables

So, as promised , the continuation of the topic of Jung tables. Let me remind you that the Young table refers to a numerical matrix that has some special properties. A matrix is ​​a two-dimensional array. And here a natural question should arise - why should the array be two-dimensional? And what if we try to implement a three or four dimension table on the same principles , and best of all, of course, five stars ! About where such a generalization will lead us, you can read under the cut ...

One-dimensional young table


Before we take the dimension of Jung's table to infinity, let's see what remains behind us "behind our backs". Only one is less than two (not counting zero, but the Young dimension of the zero dimension is something like a null space - no one knows what it is, but it sounds nice).
So, Jung's one-dimensional table - what is it? Following the definition given in the previous topic, this is just a descending, one-dimensional array, all of whose non-empty cells are located at its beginning.

The rise and descent of elements (basic operations on Young table) in this case are reduced to a linear view of the table from right to left (ascent) or from left to right (descent):

def MoveUp(X, i): while i>0 and X[i]>X[i-1]: X[i], X[i-1] = X[i-1], X[i] i -= 1 def MoveDown(X, n): i = 0 while i+1<n and X[i]a>X[i+1]: X[i], X[i-1] = X[i-1], X[i] i += 1 


The sorting algorithm itself is almost the same as the one described here .
 def YoungSort(X, n): for i in range(1,n): MoveUp(X,i) for i in range(1,n): X[0], X[ni] = X[ni], X[0] MoveDown(X,ni) 

The attentive reader will notice that the first loop in this sorting is nothing more than a standard insert sorting. True, the result is a sequence in descending order. Therefore, the second cycle performs a tricky reverse reversal of the sequence. It is clear that this treatment could be made more efficiently, but now our goal is to try to generalize the concept of Young table to the case of arbitrary dimension, so the optimization of one particular case (and even such a simple one) does not make sense.

Generalized Young Tables


For the definition of a generalized Young table, we take the following: a Young table is a partially ordered almost filled numerical array of dimension d.
To determine the properties of partial ordering and almost fullness, we introduce the concept of the upper neighbor of the cell (element) of the array. Let some cell of the array be characterized by the indices i 1 , i 2 , ..., i d . Then, the top neighbor of a given cell is any cell whose all indices, with the exception of one, are equal to the indices of that cell, and the remaining index is exactly one less than the corresponding index of the given cell. For example, a cell with indices [1,0,5,3] has three upper neighbors: [0,0,5,3], [1,0,4,3] and [1,0,5,2]. Thus, if all indices of a cell are non-zero, then it has exactly d upper neighbors, and if all indices are zero, then there are no neighbors above — such an element is called the vertex of the Young table.
Then, partial ordering means that the value of any element of the Young table is not greater than the values ​​of all its upper neighbors in this table.
Almost full means that if some cell of the Young table is filled (that is, it contains some value), then all its upper neighbors are also filled.
It is easy to show that along each dimension (when all indices, except one, are fixed), the values ​​of the cells of the Young table form a non-increasing sequence. As a result, the largest element of the table is always located at its top.
Unfortunately, the visualization of the already three-dimensional Jung tables is fraught with considerable difficulties, so the reader is invited to turn on his multidimensional imagination, helping him (the imagination), if necessary, with pictures of the usual two-dimensional Jung tables.
The lifting and lowering operations are summarized in an obvious way. When lifting some element, all its neighbors are moved from above, if neighbors with smaller values ​​are found, then the smallest one is selected, it changes places with the current one and the process continues for the new current element. If the values ​​of all the upper neighbors are greater than the current, then the rise is complete. An example of lifting an element in a three-dimensional table of Young is shown in the figure (only a fragment of the table is shown).

The descent is performed in the same way, but all the neighbors from the bottom (the cells for which the current is the upper neighbor) are selected and the largest one is selected from them.
Using these two operations, you can implement the insertion of a new element into the table and the removal of the largest element from the table. And already by means of an insertion and removal sorting is simply implemented. The implementation of the ascent and descent, in principle, should not cause any particular difficulties even for d-dimensional tables, but, nevertheless, we first try to assess the complexity of these two operations.
We assume that each index of the d-dimensional Young table varies from 0 to m-1. Take the worst case - the table is filled to the end (the number of elements n = m d ), you want to raise the last of its elements, which (as it happens) is the greatest. Those. at the end of the climb, this element should be on top of the table. Because in the process of lifting, at each iteration exactly one index of the element decreases by one, then we have to perform about dm iterations. At each iteration, one exchange is performed, and no more than d comparisons (when viewing neighbors). So, one lifting operation (in the worst case) will cost us O (d 2 m) actions. This means that n such operations will cost O (d 2 mn) actions. Taking into account the fact that n = m d , we obtain the following estimate of the complexity of sorting using the d-dimensional Young table: O (d 2 n 1 + 1 / d ).
The good news is that as d increases, the exponent 1 + 1 / d: 2 (one-dimensional Young table), 1.5 (standard Young table), 1.33, 1.25, 1.2, etc., decreases in the limit of this all - 1 - will we really beat quick sorting ?! Now the bad news is that as d increases, so does the coefficient over a given degree. It is easy to show (using the derivative) that the optimal choice of the parameter d is approximately equal to log 2 n. We substitute this value into the formula O (d 2 n 1 + 1 / d ) and after simple transformations we get the final estimate O (nlog 2 n) - everything, the revolution is canceled, because the obtained estimate is asymptotically slightly, but worse than the optimal linear-logarithmic ...

Young's Binary Tables


Although we have shown that even at best, sorting with the help of Young tables does not reach asymptotic optimality, we still try to implement the best of these tables, i.e. a table of dimension d = log 2 n, which we will call the binary Young table. Even if such a table cannot be used to organize efficient sorting, nevertheless, it is possible that someone will be able to find useful use for it. And among other things, the implementation of Jung's binary table is simply a good programming exercise.
First let's formalize the concept of a binary Young table. We will call the binary Young table of dimension d, in which each index can take only two values ​​0 and 1. If we denote by n the capacity of such a table, we get n = 2 d or d = log 2 n. Examples of such tables for small values ​​of d are shown in the following figure. Yeah, an astute reader will say, but this is a hypercube of dimension d! And it will be absolutely right.

Now let's deploy such a hypercube into a linear array using the standard layout of multidimensional arrays in memory — the lowest index is the fastest changing. For example, consider the dimension table d = 3 from the previous figure.

What we see is that the indices, treated as numbers in the binary number system, form the sequence 0, 1, 2, 3, etc. That is, we obtained a simple algorithm for transforming the coordinates of a cell in a binary Young table to the coordinates of the same cell in a linearized table. Well, we can work with bats! It seems we can ...
The key concepts of the Young table are the upper and lower neighbors of the cells. In terms of the binary Young table, these concepts are defined as follows: a cell with index i (in a linearized table) is called the upper neighbor of a cell with index j if the bit representation of the number i matches the bit representation of j, except for exactly one bit, which in j is equal unit, and in i - zero. Accordingly, cell j in this definition will be the lower neighbor of cell i. For example, the cell with the number 6 = 110 2 is the upper neighbor of the cell with the number 7 = 111 2 , and the lower neighbor of the cell with the number 2 = 010 2 .

Lifting and lowering elements


Lifting and lowering operations are defined in the same way as they were defined above for generalized tables. To implement these operations for binary tables, by virtue of the definition of upper and lower neighbors just made, we need an efficient algorithm for iterating over all single (ascent) or zero (descent) bits of a given integer number. Especially for those who are too lazy to think in this direction, there is a wonderful book by Henry Warren “Algorithmic Tricks for Programmers”, where there are formulas of interest to us. So, to reset the rightmost bit of the number x, the operation x & (x-1) is used. To replace the extreme zero from the right by one, the analogous operation x | (x + 1) is used. These operations (plus a few dances with a tambourine) are enough to organize the enumeration of all units (zeros) in a given integer.

Implementation


Below is the implementation of lifting and lowering the elements for a binary Young table represented by the vector X.
 def MoveUp(X, i): while True: t = i j = i while j: #       i k = j-1 l = k&i if X[t]>X[l]: t = l #     j = k&j if i==t: return #   X[i], X[t] = X[t], X[i] #   i = t 

When descending, the neighbors are sorted in ascending order of their numbers, so when a first empty neighbor is detected (with a number larger than the table size), the search ends.
 def MoveDown(X, n): i = 0 while True: t = i j = i while True: #      k = j+1 l = k|i if l>=n: break #     -  if X[t]<X[l]: t = l #     j = k|j if i==t: return #  X[i], X[t] = X[t], X[i] i = t 

The sorting itself is done according to the standard scheme
 def YoungTableauSort(X, n): for i in range(1,n): #    MoveUp(X,i) for i in range(1,n): #   X[0], X[ni] = X[ni], X[0] MoveDown(X,ni) 

Instead of conclusion


So, what have we achieved? Someone will say that nothing special, and will be right in their own way. Sorting clearly does not reach the optimum, losing, for example, pyramidal sorting, based on similar principles. For the same reasons, the implementation of the ADT Queue with priority based on binary Jung tables is inferior to the same implementation using pyramids. On top of that, the transition from standard tables to binary ones was probably the only advantage of Young tables over the pyramids - the search for an arbitrary element. I recall that in two-dimensional tables, such a search is performed in O (n 1/2 ) time, but in binary tables, the search for an arbitrary value will require an almost complete enumeration of the entire table, i.e. linear in n time. On the other hand, as they say, who does not risk, he does not drink champagne. We tried to come up with a new sorting algorithm and we succeeded. Yes, and with such a rare performance O (nlog 2 n)! And yet, the author still has hope that such a beautiful data structure is simply obliged to have some interesting and useful properties that may be discovered by some of the readers. Thanks for attention!
')
Literature:
  1. Kormen T. et al. - Algorithms. Construction and Analysis, 2009
  2. Fulton W. - Young Tables, 2006
  3. Warren G. - Algorithmic Tricks for Programmers, 2004

PS: By the way, I fully admit the idea that everything described above has already been discovered by someone once. I would be very grateful if anyone has this kind of information and he is ready to share it!

Source: https://habr.com/ru/post/121836/


All Articles