Gray codes and enumeration tasks

This article will show a mathematical approach to the compilation of algorithms on the example of the following questions and tasks:

Gray binary codes. Their existence. Enumerate subsets of a given set in order of minimal change.
The existence and implementation of enumeration of subsets of k elements in the order of minimal change.

So let's get started.

Gray's binary code

Gray's binary code of order n is the sequence of all

n -bit codes, in which any two adjacent codes differ in exactly one bit.

An example of gray codes of order 2:

00
01
eleven
ten

It is easy to see that such a sequence is not the only one (it can at least be reversed). Therefore, let's understand the existence of binary Gray codes of other orders and at the same time we will single out some one type of such sequences for further work.
')

Existence

The existence of gray codes is easily proved by induction.
Base -

- empty sequence.
Transition. Let be

- sequence of gray codes of order

- sequence

written in reverse order. Then the sequence

is obtained by assigning a zero to the left of each code in the sequence

and the units on the left to each code

and their subsequent unification, i.e.

. It is easy to verify that the property of the Gray codes is in the sequence

constructed in this way is not disturbed.

Gray codes obtained in this way are called reflexive or symmetrically reflected . In the future, we will consider exactly the reflexive binary gray codes.

As an exercise, write out symmetrically reflected Gray codes of order 3 using the formula given in the proof:

.

We now turn to the application of Gray codes for solving some combinatorial problems.

Enumerate the subsets of a given set in order of minimal change.

Consider the following problem: “Given a set of n elements. It is required to display all its subsets in such an order that each subsequent subset is obtained from the previous one by deleting or adding exactly one element (that is, in the order of minimal change). ”

We assume that the elements of our set are numbered from 1 to n . Then any set of elements of the set can be associated with an n- bit binary code: if the i -th element of the set is included in the set, then the i- th bit of the code is one, otherwise zero. Thus, enumeration of all subsets is reduced to enumeration of binary codes of order n , and enumeration of subsets in order of minimal change - to enumeration of binary Gray codes.

The idea of the recursive iteration algorithm for Gray codes necessarily follows from the proof of their existence, namely, the recurrence relation given in it

Actually, as mentioned in the proof, the Gray code of order n is obtained from the Gray code of order n-1 by assigning zero to the left (this corresponds to the function call

and assignment before it) and the reverse Gray code of the order n-1 by assigning one to the left (this corresponds to the function call

and assignment before him).

Let us turn to assess the complexity of this algorithm. Obviously, the function call tree

is a complete binary tree of height n . Consequently, the total number of function calls when generating Gray codes of order n is nothing more than

, which is equal to the number of all possible n- bit codes. Therefore, the algorithm is optimal.

As an exercise, the reader is invited to analyze the following, shorter, recursive search algorithm.

Example

I would also like to say about the existence of an iterative search algorithm and other useful properties of Gray codes, which are not considered in this article, and which can be learned from other sources, such as Wikipedia.

And now let's move on to the last and most interesting task.

Enumerate the subsets of k elements in the order of minimal change.

Let's start with the statement of the problem: “A set of n elements is given. It is required to display all its subsets consisting of exactly k elements, in such an order that each next subset is obtained from the previous one by replacing exactly one element (ie, in the order of minimal change). ”

Here, the idea of the algorithm from the previous problem immediately comes to mind, but we will only output Gray codes with exactly k ones. This raises the question of the correctness of this algorithm: no one guarantees that the neighboring codes in the sequence

in which only binary codes are left with the number of units equal to k , are arranged in the order of minimal change — with a difference of exactly two digits — are obtained from each other by replacing one element. Therefore, we proceed to the further study of the properties of symmetrically reflected gray codes.

We introduce some notation:

- the string obtained by concatenating the symbol x ( m times) and concatenating the symbol y ( k times). Example: .
- a sequence of all n- bit codes with k units, in which the codes are in the order in which they are located in the sequence .
- a sequence of all n- bit codes with k units, in which the codes are in the order in which they are located in the sequence .

Lemma

First binary code with

units in sequence

has the appearance

and the last one is

, if a

Evidence

We will prove by induction on

and

.
Base -

- true (it is easy to check the formulas for the two codes 0 and 1).
Transition. Let the formulas be true for

and any

. Recall the formula for proving the existence of Gray codes:

. From here, we get that the first code with

units in sequence

with a zero assigned to the left is the first code with

units in sequence

. With

we will get a single code with

unit for which the formula is also true. With

latest code with

units in sequence

there is a first code with

unit and unit assigned to the left in the sequence

. With

the formula is also true.

Theorem

Neighboring binary codes in sequence

differ in exactly two digits.

Evidence

We will also prove by induction on

and

.
Base -

- obvious (only one possible code in the sequence).
Transition. Let the theorem be true for

and

. With

we have only one code in the sequence

. With

neighboring words are completely in

or in

by induction hypothesis, they differ exactly in two digits. It remains to check one pair of adjacent codes in

latest code with

units in

and the first code with

units in

(last with

unit in

with the unit assigned to the left). By the lemma, with

they have a look

and

respectively, and at

and

. It is not difficult to see that they differ exactly in two digits.

Now we can formulate remarkable corollaries of the theorem.

The naive algorithm proposed earlier is correct, that is, if in sequence leave only those binary codes in which there are exactly k ones, then they will be arranged in the order of minimal change (each following one is obtained from the previous one by replacing one element).
Sequence is given by the following recurrence relation: . This can be verified by induction.
Sequence is given by the following recurrence relation: . Also verified by induction.

These consequences can and should be used to write a program for recursive enumeration of subsets of k elements in the order of minimal change.

I would like to draw attention to the fact that the output of a binary code also implies the filling of the remaining digits with ones or zeros, depending on the value of u and v .

Let us turn to an assessment of the complexity of the above algorithm. Note that when deepening into recursion, the value of u always decreases. That is, we will make no more than n holes to achieve some binary code. All such codes -

. We get the final asymptotics

. This is a rather rough estimate, but it is any better.

, especially when the value of k tends to n : the algorithm is almost linear.

It is easy to see that the call tree

will be binary, but not complete. Let us make a parallel assessment of complexity in terms of trees: each sheet corresponds to the receipt of some binary code, and the distance to the sheet in the tree does not exceed n . Evaluation

tells us that we simply sum up the upper bound for the lengths - n - of all the paths to the sheets. But many of these paths have a length less than n , and even more paths intersect with each other. This should make us think that the asymptotics of the above algorithm is actually much better.

An exact estimate of the asymptotics is far from a trivial fact, which can be found in the book of E. Reingold, J. Nivergelt and N. Deo “Combinatorial algorithms. Theory and practice ", or try to prove yourself.

On this I would like to finish the article. Thank you for your attention and see you soon.

Source: https://habr.com/ru/post/200806/

All Articles

Gray codes and enumeration tasks

Gray's binary code

Existence

Enumerate the subsets of a given set in order of minimal change.

Enumerate the subsets of k elements in the order of minimal change.

Lemma

Evidence

Theorem

Evidence

More articles: