The challenge of a hundred boxes and the rescue of prisoners - the final chord

The surest way to go down in history is to answer who wins in chess with the perfect game of both sides (white, black or friendship). Do grandmasters and supercomputers need to know the truth? Or enough pencil, paper and beautiful ideas?

Mathematics gives hope, because you can prove the existence of an object without presenting it, find the answer, without explaining the deep reasons why it is such.

In the problem of prisoners and a hundred boxes a similar situation. A huge number of possible strategies of the game, one of which intuitively seemed to us the best. But is it possible to substantiate its optimality without plunging into a mess of options?
')
In the post itself about the task of such a question is not posed. However, in the first comment to it, the user mayorovp raises the topic, and slightly below avfonarev reports about a wonderful article revealing a secret.

This should be imbued, especially since the reasoning is simple and elegant. On the whole, the main idea of fasting is not in solving a specific problem (which is also interesting in itself), but rather in once again giving reason to wonder at the power or, as Wigner put it, the incomprehensible effectiveness of mathematics.

Briefly about the condition and solution of the problem

Let me remind you that a hundred players must find tokens with their numbers, arranged in a hundred boxes. Everyone can peek no more than half of them in any order he pleases. In the process of testing, all the boxes and tokens remain in place, players do not meet and cannot exchange information in any other way, the boxes, after looking into them, close again. A team wins only if all participants manage to find their number.

If players open the boxes at random, the probability of winning will be (50/100) ¹⁰⁰ , that is, about 10 ^-29 %. However, once they agree in advance and (attention, spoiler!), The chances of winning are almost 31.2%!

To do this, use the following strategy - each participant needs to start the search from the box with its own number, and then open the boxes with the number of the newly detected token.

To remember more, you can read the original post about the task or watch the one and a half minute video “ Solution to The Impossible Bet ” , which was told andreymironov .

We want to understand why it is impossible (or possible) to win more often than in 31.2% of cases.

Workaround - consider a new game

The strategy described above (let's call it baseline ) is one of the order of 100,100 * 99,100 ^{* 99} * ... * 51,100 ^{* 99 * ... * 51} possible options for the players' actions (see the “Counting the number of strategies” section). Therefore, simply compare head-on with the rest of the strategy.

In the article " The Locker Puzzle " it is proposed to start off to explore another game (let's call it new or second ), in which the boxes are not closed back, and the players are looking for their numbers without restrictions on the number of attempts. The exact rules are:

- players pass the test in turn - first the first, then the second, the third, etc .;

- the first player must open the boxes until he finds his number, no matter how many attempts it takes. The opened boxes do not close;

- the second, third and further players, first of all, try to find their number in all open boxes (to look into an open box is not considered an attempt). If (and only if) this did not bring success, they should open the closed boxes until the necessary token is found. Boxes opened by them also do not close;

- the team wins only if none of the participants needed more than 50 attempts to find their number, that is, if no one had to open more than half of the total number of boxes.

Since the boxes do not close, the more they are open, the easier it is for players to win. For example, if the first participant had to open 35 boxes, then another 34 players would not have to look for numbers at all, and the rest will only have 65 options where their token can be located.

In addition, if a player in search of his number opens, say, 20 boxes, the team will not be able to lose at all, as only 100 - 35 - 20 = 45 unopened boxes will remain, and more than 50 attempts will be spent is impossible.

The victory condition is very interesting. It would seem that you can simply interrupt the game, if someone opened more than 50 boxes. However, for further reasoning, it is more profitable to allow players to open any number of boxes, even if it is clear that they have lost.

The original game, by the way, can also be reformulated in the light of this idea.

Each player searches for his number until he finds it, even if it takes more than 50 attempts. However, the team wins only if no one has exceeded this limit, that is, in the process of searching more than half of the total number of boxes has not been opened. The maximum probability of victory from such an adjustment of the rules will not change.

The new game introduced by us has three interesting properties, which will be discussed further and will allow us to give a reasonable answer - whether the basic strategy provides the maximum probability of winning or not.

Property 1 - it's easier to win a new game.

Rightly it seems that since the boxes do not close, it is easier for the team to win in the new game than in the original one.

Indeed, no matter how the participants adhere to the initial testing strategy, they can also adapt it for passing on a new one.

To do this, those players who did not find the numbers in the open boxes, it is necessary to open the boxes as they would have done in the process of the original game. Moreover, if the strategy tells them to open the already opened box, they simply save the attempt.

It is clear that for the same arrangement of numbers on the boxes, if the players win the first game, they will win the second. Thus, the maximum probability of successful completion of the initial test can not be more than the new.

In other words, if the maximum probability of winning the first game were, say, 40%, then a new test could be successfully completed with the same or greater chances - let's call it property 1 .

Property 2 - in the new game, nothing depends on the team’s strategy

If you think about it, then it does not matter to the first player, what strategy to follow - no matter how hard he tries, the probability of finding your number, while meeting the limit on the number of attempts, will be 50/100 = ½. It does not depend on the algorithm chosen by him and on how many boxes he opens - one, two, thirty, or all one hundred.

Players whose tokens have already been found by previous participants, according to the conditions of the game, find their number one hundred percent and for zero attempts - nothing can be improved in their algorithm of actions.

Those team members who are forced to look for their numbers themselves are, in fact, in the same situation as the first player - the chances of success depend on the number of closed boxes, and not on the contents of the already opened or chosen strategy.

There is a feeling that the probability of winning a new game does not depend on the algorithm of the team’s actions. We call this property 2 and justify it more formally.

For convenience, we suppose that we have six boxes and players, and three attempts.

Imagine that the participants began to pass the test under the new rules, and we want to outline their success. It turns out that the game process can be described in sufficient detail with just one line of numbers, indicating the numbers of the tokens found by the team in the order in which they were found.

For example, it is easy to decrypt a record (2, 5, 1, 3, 6, 4). After all, according to the condition of the game, the first player had to open the boxes until he found 1. Therefore, it is possible to unequivocally restore that the first participant, having opened three boxes of some kind (not important), found numbers 2, 5 and 1 there. I didn't have to, his token was in the box already opened by the first player. Then the third achieved success from the first attempt and, finally, the fourth one found 6 and 4. The fifth and sixth participants, as well as the second, did not need to work.

Another example - let us assume that the results of passing the test are described by the line (6, 1, 5, 3, 4, 2). It is easy to see that the first participant found his token already in the second open box, while the second player was not lucky, it took four attempts, so the team counted defeat.

It turns out that the string of six numbers corresponding to the numbers of the tokens found by the players uniquely determines the result of the test - the team won or lost.

By the way, strings of the type (2, 5, 1, 3, 6, 4) and (6, 1, 5, 3, 4, 2), that is, those that contain numbers from 1 to 6 without repetitions, are called permutations of numbers from 1 to 6. It is known that there are 6 such permutations in total! = 6 * 5 * 4 * 3 * 2 * 1.

Note that since the tokens are arranged in boxes randomly, each of the boxes with equal chances can be any of the six numbers. Therefore, no matter what algorithm the first player sticks to (even if he makes a choice at random), in the first box opened by him with equal probability 1/6 any of six numbers may appear.

In the second open command box (no matter who - the first or the second participant - and according to what considerations), one of the five remaining numbers may be with the same probability 1/5.

Continuing the argument, we come to the fact that, writing down the numbers of tokens in the order in which they were found, you can with the same probability (1/6) * (1/5) * ... (1/2) * (1/1) = 1/6! get any of 6! possible permutations of numbers from 1 to 6.

As shown above, some permutations, for example, (2, 5, 1, 3, 6, 4), indicate the gains of the players, others, in particular, (6, 1, 5, 3, 4, 2) - the loss. Denote by A the set of all permutations that signify the victory of the team, and | A |, respectively, the number of such permutations.

Since all permutations can turn out to be equally likely as a result of the game, the chances of players winning are equal | A | / 6! and, apparently, do not depend on the strategy chosen by them. Property 2 is proven.

Property 3 - the probability of winning when using the basic strategy is equal to the chances of winning a new game

Let's go back to the original game. Recall that the probability of winning it using the basic strategy (the one in which the player must begin searching from the box with his own number, and then follow the numbers on the tokens found) is <the number of permutations of numbers from 1 to 6 that do not contain cycles of length more than 3> / 6! (as before, it is assumed that there are six boxes and players, and three attempts).

If we denote B , the set of permutations of numbers from 1 to 6 that do not contain cycles of length more than 3, and | B |, respectively, the number of such permutations, then the formula described above takes the form | B | / 6!

About cycles and how this formula is made ...

Suppose that the location of the tokens in the boxes is described by a permutation (2, 5, 3, 6, 1, 4), that is, in the first box is 2, in the second - 5, in the third - 3, etc. .

If the players, applying the basic strategy, did not stop moving, having found their token, they would start walking in a circle. For example, the first participant, looking into the first box, would move to the second, then the fifth, from there to the first again, and so on. Between the same numbers would go the second and fifth players. The third participant would open the third box all the time, and the fourth and sixth would rush between the fourth and sixth boxes.

This means, in essence, that the permutation (2, 5, 3, 6, 1, 4) is divided into cycles [1, 2, 5], [3], [4, 6].

Moreover, the same cycle can be designated in several ways, say, [1, 2, 5] = [5, 1, 2] = [2, 5, 1], that is, a cyclic shift is allowed in the record.

It is also said that the permutation is represented as a product of disjoint cycles : (2, 5, 3, 6, 1, 4) = [1, 2, 5] [3] [4, 6]. Such a representation is not unique, since the above stated that the cycles can be written differently, in addition, they can be rearranged among themselves, in particular, (2, 5, 3, 6, 1, 4) = [1, 2, 5] [3] [4, 6] = [3] [1, 2, 5] [4, 6] = [3] [5, 1, 2] [6, 4] and so on.

However, if we fix the way of recording cycles (for example, use only the record in which the minimum element is located on the last place: not [1, 2, 5] or [5, 1, 2], namely [2, 5, 1]) and also fix the order of the cycles (for example, in ascending order of minimum elements (marked bold): not [6, 4 ] [ 3 ] [2, 5, 1 ], but [2, 5, 1 ] [ 3 ] [6, 4 ]), then the representation of the permutation in the form of a product of non-intersecting cycles will be unique. For example: (2, 5, 3, 6, 1, 4) = [2, 5, 1 ] [ 3 ] [6, 4 ].

The basic strategy will bring players a win only if no one has to rush more than between three boxes, in other words, only if all cycles are 3 or less long. Hence the probability of victory: <the number of permutations that do not contain cycles of length greater than 3> (that is, | B |) divided by the total number of permutations (6!).

Let us also recall the reasoning just conducted in the previous section, according to which the maximum (and the only possible) probability of winning a new game is calculated using the formula <the number of permutations meaning the players win> / <the total number of permutations> or | A | / 6! in our notation.

Note that the permutations included in the sets A and B are somewhat similar. Both those and others, in particular, are broken "into pieces", containing not more than 3 numbers.

So, the permutation (2, 5, 1, 3, 6, 4) from A can be divided into three parts <2, 5, 1>, <3>, <6, 4>, according to the principle that only those numbers that are open by one player (<2, 5, 1> found by the first participant, <3> by the third, <6, 4> by the fourth).

A distinctive feature of such "pieces" is that the minimum element in them is always in last place, and the "pieces" themselves are arranged in ascending order of minimum elements. We emphasize this by highlighting the minimal elements in bold: <2, 5, 1 >, < 3 >, <6, 4 >.

In turn, the permutation of B , say, (2, 5, 3, 6, 1, 4), can be represented as a product of disjoint cycles: (2, 5, 3, 6, 1, 4) = [1, 2, 5] [3] [4, 6] = [3] [1, 2, 5] [4, 6] = [3] [5, 1, 2] [6, 4] = ...

Of all the possible variants of such a representation, we choose the only one in which the minimal elements of the cycles are in the last place, and the cycles themselves are arranged in ascending order of their minimal elements, i.e. (2, 5, 3, 6, 1, 4) = [2, 5, 1 ] [ 3 ] [6, 4 ].

It turns out that the permutation (2, 5, 3, 6, 1, 4) from B is conditionally divided into “pieces” <2, 5, 1 >, < 3 >, <6, 4 >, exactly the same as the permutation (2, 5, 1, 3, 6, 4) of A. This tells us an algorithm that allows us to turn any permutation from A into a permutation from B and vice versa.

Indeed, take a permutation (2, 5, 1, 3, 6, 4) of A , divide it in the manner specified above into “pieces” <2, 5, 1 >, < 3 >, <6, 4 >, replace in this writing the angle brackets into square brackets, and commas - by the multiplication symbol of the cycles (in our case by the space) and we get [2, 5, 1 ] [ 3 ] [6, 4 ] = (2, 5, 3, 6, 1, 4 ) of B.

In the opposite direction also: (2, 5, 3, 6, 1, 4) from B we represent in the form [2, 5, 1 ] [ 3 ] [6, 4 ], change the brackets, add commas and “glue” the resulting pieces <2, 5, 1 >, < 3 >, <6, 4 > into the permutation (2, 5, 1, 3, 6, 4) of A.

Here is another example of mutual transformation: (6, 1, 4, 2, 3, 5) from A <-> <6, 1 >, <4, 2 >, < 3 >, < 5 > <-> [6, 1 ] [4, 2 ] [ 3 ] [ 5 ] = (6, 4, 3, 2, 5, 1) from B.

Thus, the elements of the sets A and B are divided into pairs that turn into each other according to the above-mentioned permutation algorithm. In the language of mathematics, this means that there is a bijection between sets A and B , therefore, they have the same number of elements, i.e. | A | = | B |.

This means that the chances of successfully passing the initial test using the basic strategy (| B | / 6!) And the probability of winning a new game (| A | / 6!) Are equal. We call this property 3 .

Conclusions - the optimality of the basic strategy

For clarity, properties 1, 2 and 3 are proved by the example of a game with six boxes, six players and three attempts. It is clear that exactly the same reasoning can be repeated in the general case, when there are n boxes and players (for example, 100 each), and k attempts (for example, 50).

From the considered properties two statements follow:

1) The probability of winning a new game does not depend on the team’s strategy (property 2) and is exactly equal to the chances of successfully passing the initial test using the basic strategy (property 3).

2) The basic strategy provides the maximum probability of winning the team in the original game , as if any other strategy would provide a greater probability of victory, then a new test (property 1) could be passed with greater chances of success, which contradicts statement 1 ) above.

Thus, if each player starts the search from the box with his own number, and then opens the boxes with the number of the newly detected token, the team will win with the greatest possible probability. Nothing better for the participants to come up with. Now this is not only an intuitive offer, but also a proven fact!

Just think, the enormous number of possible strategies of the game just stood aside - we answered the question about the optimality of one of them, and did not understand what its fundamental difference from the others!

We didn’t even have to calculate this very probability of victory, the maximum of which we proved! We didn’t have to consider such issues at the core of the problem, like: what is a strategy and how many are there? Is the basic strategy the only optimal one or are there others? What formula is used to calculate the maximum probability of winning in the case of n boxes, m players and k attempts?

That is why I wrote at the very beginning that mathematics instills hope - after all, one can make history simply by proving, for example, that in chess with an ideal game of both sides there will always be a draw or a victory of whites or blacks (it would be amazing to even whites from the very first move in zugzwang!). It is possible that due to the "incomprehensible efficiency" of mathematics, its amazing ability to beautifully and accurately answer the questions posed, this does not need to penetrate deep into chess theory or use inconceivable computing power.

Forward to new heights! For those who are interested in finding out more deeply with the task of simply boxing and rescuing prisoners, bonus chapters are in store.

Bonus Counting the number of strategies

The algorithm of the team's actions (let's call it the collective strategy of the players) is made up of the algorithms of actions of each player individually (let's call them the individual strategies of the players).

We will assume that the participants choose the number of the next box, which they will open, uniquely, based on the prehistory, that is, they do not use a random mechanism (they cannot, for example, choose any of the remaining boxes randomly).

Then the individual strategy of, say, the first player can be described by a line in the table of the following form:

Move 1	Move 2				Move 3
	2	3	...	n	...

The first column (“Move 1”) records the number of the box from which the first player begins the search. In columns 2, 3, ..., n (under the general heading “Move 2”) the number of the next box is recorded, which it will open if the counter with the number 2, 3, ..., n is found in the first box opened by it, respectively . To specify move 2, n - 1 numbers are required (not n , since there is no column with name 1, because if participant found 1 token 1 (that is, his own token), then there is no need to continue the search).

The columns under the general “Turn 3” header contain the numbers of the third box in which the player will search for his token, depending on what numbers he found in the first two opened boxes. To set move 3, you will need to specify already ( n - 1) * ( n - 2) numbers (at move 1, one of n - 1 numbers can be detected, and at move 2: n - 2, since the number detected at move 1 no longer fall).

The cap for "Move 3" looks like this:

Move 3
2				3	...	n
3	four	...	n	...	...	...

If you need to set the move k, you will need ( n - 1) * ( n - 2) * ... * ( n - k + 1) columns.

Let us calculate how many ways you can fill a row of such a table.

In the first column, you can write any of the n numbers. The columns for the second attempt can be filled in ( n - 1) ^{n - 1} ways, since in each of ( n - 1) cells you can enter any of the ( n - 1) numbers of the boxes that are not yet open.

The options for filling the columns for the third attempt are already ( n - 2) ^{( n - 1) * ( n - 2)} , since in each of ( n - 1) * ( n - 2) cells you can substitute any of ( n - 2) non-open box numbers, etc.

Columns corresponding to the kth attempt can be filled in ( n - k + 1) ^{( n - 1) * ( n - 2) * ... * ( n - k + 1)} ways.

As a result, there is n * ( n - 1) ^{n - 1} * ( n - 2) ^{( n - 1) * ( n - 2)} * ... * ( n - k + 1) ^{( n - 1) * ( n - 2) * ... * ( n - k + 1)} individual strategies of the first player. It is clear that any other player has the same number of strategies.

Players act independently of each other. A collective strategy is a combination of individual strategies, so their total number is obtained by multiplying the number of individual strategies of each player:
n * ( n - 1) ^{n - 1} * ( n - 2) ^{( n - 1) * ( n - 2)} * ... * ( n - k + 1) ^{( n - 1) * ( n - 2) * ... * ( n - k + 1)} to the power of n is equal to n ⁿ * ( n - 1) ^{n * ( n - 1)} * ( n - 2) ^{n * ( n - 1) * ( n - 2)} * ... * (( n - k + 1) ^{n * ( n - 1) * ( n - 2) * ... * ( n - k + 1)} .

If there are not as many players as boxes, then it is necessary to build not to the power n , but to the power m , where m is the number of players.

Bonus Uniqueness of optimal strategy

An interesting question is whether the basic strategy is the only optimal one. Can, for example, players begin searches not from the box with their own number, but from some other?

Imagine that before the very beginning of the test the box was renumbered. It is clear that this will not affect the players' strategies or the probability of winning. After all, to renumber the boxes is just the same as placing the tokens, but we already believe that they are laid out randomly.

Players can themselves mentally renumber the boxes. For example, count the fifth box as the seventh, seventh - the tenth, and the tenth - the fifth. In this case, using the principle of the same basic strategy, say, the fifth player will start the search from the seventh box and if he finds a 7-k in the process of testing, he will go to look in the tenth, and having found 10-ku - in the fifth.

If all players equally mentally re-number the boxes, then the probability of winning the team will remain exactly the same. Along with the reasoning above, one can cite a formal, purely technical proof of this fact (I think it is not so interesting to rewrite it in this post).

As a result, since n boxes can be numbered n ! in ways, then optimal strategies are at least n !, however, in essence, all these are variations of the same algorithm of actions. Most likely, if there are as many players as there are boxes, there are no other optimal strategies that truly differ from the basic ones. We have not yet found evidence of this fact with a friend.

If there are fewer players than boxes, then other optimal strategies are possible, which really differ from the basic ones. For example, if we have 4 boxes, 2 players and 2 attempts each, participants can act as follows.

The first player strategy is described by the table row:

Move 1	Move 2
	2	3	four
one	2	3	3

The second player strategy is described by the table row:

Move 1	Move 2
	one	3	four
2	one	four	four

In this case, the team will win with a probability of 10/24 (10 out of 4! = 24 possible locations of the tokens in the boxes), just as in the case of using the basic strategy.

About the decision in general

Taking into account the current understanding of the game, we can assume that the maximum probability of winning in the case of n boxes, m players and k attempts is calculated by the formula:
<the number of permutations in which the elements 1, 2, ..., m are not included in cycles of length greater than k > / n !

In the simplest cases, for example, for n = m , k ≥ n / 2, it is proved and looks quite nice: 1 - 1 / ( k +1) - 1 / ( k + 2) - ... - 1 / n .

A formal proof of the formula and its derivation in explicit form for various values of the parameters is an interesting problem. , .

Source: https://habr.com/ru/post/256513/

All Articles