
How can using randomness help make your code faster? A lecture by Mikhail Vyalyi at Yandex

Both the strength and the weakness of modern computers lie in how exact they are. Today's installment in our series of lectures from Yandex is about how the use of randomness can help make computations more efficient.

Probabilistic algorithms solve some problems of theoretical computer science for which no deterministic algorithms are known to work. The most interesting question is to what extent the use of randomness can reduce an algorithm's running time. This question can already be partially answered: under certain assumptions, true randomness can be replaced by pseudorandomness, and any probabilistic algorithm can be simulated deterministically with only a small loss in running time. Verifying these assumptions is likely to be one of the central themes of theoretical computer science in the 21st century.


The lecture is given by Mikhail Vyalyi, Ph.D. in Physics and Mathematics, senior researcher at the Dorodnicyn Computing Centre of the Russian Academy of Sciences and associate professor at the Department of Mathematical Foundations of Control at MIPT.

Let's start with the simplest case. Imagine that we have two calculators. One is ordinary, while the second has an extra button that, when pressed, produces a random bit. Let's try to answer the question: is such a button of any use?



Such a statement is, of course, too general, so let us make it precise in the terms of theoretical computer science. To do this, we first introduce the notion of an algorithm. An algorithm is an instruction so precise that it can be executed mechanically. The defining property of deterministic algorithms is that each next state is uniquely determined by the current one. Probabilistic algorithms differ in that at any moment they may obtain the value of a random bit (flip a coin), which equals 0 or 1 with equal probability. This can happen repeatedly during the execution of a probabilistic algorithm, and different flips are independent.



As the picture above shows, a deterministic step proceeds as usual, but at a coin flip the computation branches. Instead of a linear sequence we obtain a computation tree. Each branch of this tree is called a computation path, and a path is characterized by the values of its random bits: each coin flip chooses one of two directions of movement. We thus allow ourselves some slack: the algorithm need not be correct on all paths, only on some of them. Before estimating the error probability of probabilistic algorithms, let us discuss a few technical details.

First, a coin may have more than two sides (outcomes). Second, the probabilities of the outcomes need not be equal. However, the probabilities of all possible outcomes must always sum to 1.

Computing the error probability


When computing the error probability of a probabilistic algorithm, two rules must be kept in mind: the probability of a single computation path equals the product of the probabilities of the coin-flip outcomes along it, and the probability of an event (for instance, of the algorithm answering incorrectly) equals the sum of the probabilities of the computation paths on which that event occurs.
How should we treat algorithms that can err? After all, for any question requiring a "yes"/"no" answer, one can simply flip a two-sided coin and get the correct answer with probability 1/2. So which error probabilities are acceptable to us? For example, is an error probability of 9/10 too high? If errors can be detected and the algorithm can be rerun, not necessarily.



If the correct answer is unknown to us, the task is harder. The algorithm answers "yes" or "no". If the error probability ε < 1/2, then repeating the algorithm lets us reduce the error probability quickly: run the algorithm k times and output the answer that occurred more often. Suppose the error probability is 1/3. The probability that the vote over k independent runs errs is estimated as follows:

Pr[error] ≤ Σ_{k/2 ≤ i ≤ k} C(k, i) · (1/3)^i · (2/3)^(k−i)

For example, if the error probability is again 1/3 and 100 independent runs vote, the error probability is bounded as follows:

Pr[error] ≤ Σ_{50 ≤ i ≤ 100} C(100, i) · (1/3)^i · (2/3)^(100−i) < 10^(−3)

As k increases, the error probability decreases very quickly: each term with i ≥ k/2 is at most (2/9)^(k/2), and there are at most 2^k terms, so

Pr[error] ≤ 2^k · (2/9)^(k/2) = (8/9)^(k/2),

which tends to 0 exponentially fast.
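As a sanity check, the binomial sum above can be evaluated exactly. A minimal Python sketch (the function name is mine, not from the lecture):

```python
from math import comb

def majority_error(eps, k):
    """Probability that the majority vote over k independent runs of an
    algorithm with per-run error probability eps gives the wrong answer.
    Ties (possible for even k) are pessimistically counted as errors."""
    return sum(comb(k, i) * eps**i * (1 - eps)**(k - i)
               for i in range((k + 1) // 2, k + 1))

# With per-run error 1/3, the voted error drops rapidly as k grows:
for k in (1, 10, 100):
    print(k, majority_error(1 / 3, k))
```

Already at k = 100 the exact value is below 10^(−3), in line with the bound above.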

The task of establishing contact


Suppose we have two players. They know nothing about each other and have no way to coordinate in advance. There are two places through which they can make contact. Time is discrete: at each moment each participant chooses a place (upper or lower). Contact is established if at some moment both participants choose the same place.



There is no deterministic way to establish contact. At the same time, the probabilistic algorithm for establishing contact is very simple: at each step, choose a place uniformly at random and independently of the previous steps. Let's analyze this algorithm. The probability of failure at each step is 1/2, so after t steps the probability of still not having made contact is 2^(−t), which is very small already for moderate t.
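This analysis is easy to check empirically; here is a small simulation (the 2^(−t) figure is the probability of still having no contact after t steps):

```python
import random

def steps_until_contact(rng):
    """Each step, both players independently pick one of two places
    uniformly at random; return the step at which they first coincide."""
    t = 0
    while True:
        t += 1
        if rng.randint(0, 1) == rng.randint(0, 1):
            return t

rng = random.Random(0)
trials = 100_000
# Fraction of trials with no contact in the first 5 steps; theory: 2**-5.
miss = sum(steps_until_contact(rng) > 5 for _ in range(trials)) / trials
print(miss)
```

Over 100,000 trials the empirical fraction lands close to 1/32, as predicted.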

The task of verifying equality by an arbitrator


Consider a problem with a more involved setup. Alice knows a binary word x of length n; Bob knows a word y of the same length. Each can send Charlie some information (the words u and v, respectively), from which Charlie must decide whether the words x and y are equal. The goal is to send as few bits as possible while Charlie answers correctly for every pair x, y. Alice and Bob cannot communicate with each other.



If u and v are determined by x and y deterministically, i.e. u = f(x) and v = g(y), then at least 2n bits must be transmitted to solve the equality problem. Indeed, suppose the length of u were less than n. Then there would be fewer than 2^n possible values of u, i.e. fewer than the number of possible values of x, so two distinct words x₁ and x₂ would produce the same message u.



For Charlie, such x₁ and x₂ are indistinguishable.

Randomness as a "communication channel"


Let's try to solve the same problem, but now suppose Alice and Bob have access to a shared source of randomness — say, the same edition of a book of random number tables. The goal: transmit as few bits as possible so that the probability of Charlie's error is small for every pair x, y.



Let us try to solve this problem and see whether fewer bits can be transmitted this way than with any deterministic protocol.

Alice and Bob choose a random prime number p (the same one, shared by both of them) in the range from n to 2n.

Alice computes U = X mod p, where X is the number whose binary representation is the word x. Bob likewise computes V = Y mod p. They then both send Charlie the binary representations of the numbers U and V. Charlie declares the words x and y equal if U = V.
Since 0 ≤ U, V < p ≤ 2n, each message has length at most 2 + log n.
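The protocol is straightforward to implement. A hedged Python sketch (the sieve helper and the word length of 64 are my choices for illustration, not from the lecture):

```python
import random

def primes_between(lo, hi):
    """Sieve of Eratosthenes; returns the primes p with lo <= p <= hi."""
    sieve = [True] * (hi + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(hi ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [p for p in range(lo, hi + 1) if sieve[p]]

def charlie_says_equal(x, y, rng):
    """One round: Alice and Bob draw the same random prime p in [n, 2n]
    (shared randomness) and send X mod p and Y mod p; Charlie compares."""
    n = len(x)
    p = rng.choice(primes_between(n, 2 * n))
    return int(x, 2) % p == int(y, 2) % p

rng = random.Random(1)
x = format(2 ** 63, '064b')
y = format(2 ** 63 - 127, '064b')
# X - Y = 127 is itself a prime in [64, 128], so Charlie errs exactly
# when that particular prime is drawn (1 of the 13 primes in the range).
print(charlie_says_equal(x, x, rng))   # equal words: always True
errors = sum(charlie_says_equal(x, y, rng) for _ in range(1000))
print(errors / 1000)                   # far below the 3/4 bound
```

Note the one-sidedness: equal words are never rejected, and unequal words fool Charlie only when p happens to divide X − Y.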

In the case x = y, Charlie always gives the correct answer. If x ≠ y, the probability of Charlie's error is at most 3/4. Let us prove this with the help of a theorem from number theory.

There are many primes. By the prime number theorem, for every ε > 0 and all sufficiently large n, the number π(n) of primes from 1 to n satisfies the inequalities:

(1 − ε) · n/ln n < π(n) < (1 + ε) · n/ln n

Now suppose x ≠ y, so that X − Y ≠ 0. If X − Y is divisible by the primes n ≤ p₁ < … < p_k ≤ 2n, then

p₁ · … · p_k ≤ |X − Y| < 2^n, hence k < n/log₂ n = (ln 2) · n/ln n.

Charlie errs only if the chosen p is one of these k primes, out of the π(2n) − π(n) ≈ n/ln n primes in the range, so the error probability is at most about ln 2 < 3/4 for all sufficiently large n.

Reducing the error probability


The proposed protocol's error is one-sided, which allows the error probability to be reduced by running the protocol repeatedly. If Alice and Bob choose s prime moduli (randomly and independently) and Charlie declares equality only when all s fingerprints match, the error probability becomes less than (3/4)^s. Taking s large enough makes the error probability arbitrarily small.
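A minimal, self-contained sketch of the amplified protocol (trial-division primality is my simplification; any primality test would do):

```python
import random

def is_prime(m):
    """Trial division; adequate for the small moduli used here."""
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def amplified_equal(x, y, rng, s):
    """Fingerprint check with s independently drawn prime moduli from
    [n, 2n]; declare equality only if all s rounds agree, pushing the
    one-sided error below (3/4)**s."""
    n = len(x)
    primes = [p for p in range(n, 2 * n + 1) if is_prime(p)]
    X, Y = int(x, 2), int(y, 2)
    return all(X % p == Y % p
               for p in (rng.choice(primes) for _ in range(s)))

rng = random.Random(7)
x = format(2 ** 63, '064b')
y = format(2 ** 63 - 127, '064b')  # X - Y = 127, a prime in [64, 128]
print(amplified_equal(x, x, rng, s=10))  # equal words always pass
wrong = sum(amplified_equal(x, y, rng, s=10) for _ in range(2000))
print(wrong)  # an error now requires drawing p = 127 ten times in a row
```

With s = 10 the chance of fooling every round is negligible, so the count of wrong verdicts stays at zero in practice.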

Thus, given shared randomness, for every ε > 0 there is a way to choose the messages that solves the equality problem with error probability less than ε while transmitting O(log n) bits. (Writing g(n) = O(f(n)) means there exist numbers C and n₀ such that g(n) < C · f(n) for all n > n₀.)

Protocols without shared randomness


How can one prove that in the equality problem there is a way to choose the messages u, v in which Alice and Bob do not see each other's random bits, O(√n · log n) bits are transmitted, and the error probability is less than 1/3?
Two random subsets of size 2√n in an n-element set intersect with probability greater than 4/5 (approximately 1 − 1/e²). And it turns out that this is nearly the best we can achieve: a special case of the Babai–Kimmel theorem implies that any probabilistic way of choosing the messages in the equality problem that guarantees error probability less than 1/3 must transmit messages of length Ω(√n). (Writing g(n) = Ω(f(n)) means there exist numbers C and n₀ such that g(n) > C · f(n) for all n > n₀.)
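The subset claim is easy to probe numerically. A quick simulation with subsets of size 2√n (the code checks only the 4/5 lower bound stated above, since the exact limiting constant depends on the subset-size convention):

```python
import random
from math import isqrt

def share_an_element(n, rng):
    """Alice and Bob each pick a random subset of size 2*sqrt(n) from an
    n-element ground set; report whether the subsets intersect."""
    k = 2 * isqrt(n)
    a = set(rng.sample(range(n), k))
    return not a.isdisjoint(rng.sample(range(n), k))

rng = random.Random(3)
n, trials = 10_000, 20_000
hits = sum(share_an_element(n, rng) for _ in range(trials)) / trials
print(hits)  # comfortably above 4/5
```

This is the idea behind the O(√n · log n) protocol: the players' published random subsets collide often enough to serve as common randomness.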

By watching the lecture to the end, you will learn how introducing randomness can speed up an algorithm, and you will also encounter the notion of pseudorandomness.

Source: https://habr.com/ru/post/214745/

