Analysis of all tasks of the final round of Yandex. Algorithm 2015

The final of Yandex.Algorithm has finished today - the annual championship on sports programming, which is organized by Yandex. In 2015, the competition was completely online - on the Yandex.Contest platform. Applications for participation were filed by programmers from 73 countries. Most of the participants are from Russia, Ukraine, Belarus, Kazakhstan, India, USA, Japan and China, but in general the geography of the championship is extremely extensive - Brazil, Indonesia, Peru, Dominican Republic, Mozambique, Senegal, Cayman Islands. 8.9% of those registered are girls. Approximately half of all participants are students. In total, we received applications from 3,722 people, of whom 28 reached the final.

And the winner of Yandex.Algorithm-2015 was Gennady Korotkevich. ~~Out of habit, he~~ showed the best result, deciding in the final round five of the six problems, while receiving 80 minutes of penalty time. Gennady took the first place in the Yandex Championship both in 2013 and in 2014.

')
The second place went to Peter Mitrichev, and the third - Yevgeny Kapun. They solved four problems, while Peter scored 31 penalty minutes, and Eugene - 79 minutes. The results of all the finalists can be viewed on the Yandex.Algorithm website.

The tasks for Yandex.Algorithm are made by an international team, which includes both Yandex employees and invited experts - including the winners and finalists of the ACM ICPC and Topcoder Open competitions. And we have traditionally prepared for you an analysis of all the tasks. Solve all of them, no one managed. Most of the participants coped with task B, but tasks A and D decided on just one person.

Task A. Task Manager

The author of the analysis and tasks is Gleb Evroprop. A graduate of Moscow State University, a double gold medalist ACM ICPC.

The intern Bomboslav is working on improving the task manager used by Yandex employees. The current version of the manager associates with each of the n tasks assigned to the employee two parameters c _i and u _i - the importance and urgency of the task. Higher values of the parameters correspond to the greater importance or urgency of this task.

Any employee can choose the order in which he will perform the tasks assigned to him. The only restriction is as follows: if task i is both more important and more urgent than task j , that is, c _i > c _j and u _i > u _j , then it should be performed earlier.

Bobmoslav decided to add to the task manager a recommender system that will prompt in what order it is necessary to perform the existing tasks. Since it is too easy to build such an order, Bomboslav added the ability for employees to specify for each task the value of p _i , meaning the pleasure received from performing this task. The values of p _i for different tasks must be different.

After the value of p _i is specified for each of the n tasks, the system should offer the employee such an order of execution that, firstly, no mandatory restrictions are violated, and secondly, the sequence of the values of p _i corresponding to these tasks is lexicographically maximum . In other words, from all sequences of performing tasks that meet the constraints on importance and urgency, the one in which the pleasure received by the employee from the first completed task is selected is selected. In case there are several such options, the one in which the pleasure from the second completed task is maximized is chosen, and so on.

Input format The first input line contains a single number n - the number of tasks for this employee (1≤ n ≤ 100 000) .
The following n lines describe the tasks in the manager. Each of them contains three non-negative integers c _i , u _i and p _i - importance, urgency and desire to perform this task by an employee, respectively (0 ≤ c _i , u _i , p _i ≤ 10 ⁹ ) . It is guaranteed that all p _i are pairwise distinct.

Output format Output the correct sequence of execution of all tasks by the employee, in which the sequence of the corresponding values of p _{i is} lexicographically maximum. Tasks are numbered from one in the order in which they appear in the input. Each task must occur in the sequence exactly once. It is easy to show that the answer is always the only one.

Problem Solving A

Note that the condition for the difference of all p _i is essential, since it allows us to use the greedy algorithm, which iteratively extracts the minimum source. A more detailed description of the algorithm:

Let some prefix of the optimal answer be constructed, only a certain subgraph remains from the initial graph.
Consider the set of sources of the current graph, that is, the set of vertices into which no edges enter. Obviously, the current topological sorting needs to be continued from one of the vertices belonging to this set. Let's call these vertices active.
Since all p _{i are} different, then due to the structure of the lexicographic comparison, the next vertex is determined uniquely. Add an active vertex with maximum p _i to the current response prefix and remove it from the graph.

The asymptotic complexity of this solution is O (V ∙ E) , if you simply implement a text description of the algorithm. The asymptotics O (V ∙ logV + E) is easily achieved — all that is needed is to memorize the current half-input for all vertices and store the set of active vertices in some structure with the ability to quickly add an element and extract the maximum. Such a structure can be a binary heap or any balanced search tree.

However, in this problem, the graph is given implicitly, and, generally speaking, E may have a size of the order of n ² , which is too much for these constraints. It is required to think of a way to quickly determine the new active vertices, after removing the next source from the graph.

Let us turn to the idea of a compressed two-dimensional tree of segments. A detailed analysis of this structure is not supposed here. We only recall that it has the dimension O (n ∙ logn) from memory and allows you to respond to group requests (such as maximum, minimum, or sum in a rectangle during O (log ² n) ). This complexity is achieved by dividing the query area into O (logn) lanes by one of the dimensions, for each of which the structure of a one-dimensional tree of segments is already constructed. In turn, each of these bands is again divided into O (logn) parts.

We construct the partitioning in the previous paragraph for each of the V angles of the form x> x _i and y> y _i . Then the vertex falls into the set of active at the moment when all the elementary regions that make up this angle become empty. For each elementary region, we will maintain the current number of points inside it, as well as for each point, the number of elementary regions that form its angle and are not yet empty. Please note that not all elementary areas that fall into this angle are meant here, but only those O (log ² n) of them that form a partition of the area in the query to a two-dimensional tree of segments.

The step of the algorithm now looks like this.

Select the active vertex with the maximum pi and remove it from the set of active ones.
Consider all the elementary regions of a two-dimensional tree of segments that contained a point corresponding to a given vertex. According to the properties of this structure, there will be no more than O (log ² n) . Each such area to reduce the counter by one.
For all areas whose counter has become zero, it is necessary to consider a list of all points - such that this area is included in the decomposition of the corresponding angle. These lists are built in advance, even before the launch of the main algorithm. For each point in the list, the counter is also reduced by one.
For all points whose counter has become zero, the corresponding vertices are added to the set of active ones.

Total running time: O (n ∙ log ² n) , as well as occupied memory.

Problem B. At a picnic on the bus

The author of the task and analysis is Mikhail Tikhomirov. Graduate of MSU, Topcoder Open 2014 finalist.

Every summer, Yandex employees go to nature on Yandex.Picnik. For employees who do not have their cars, buses are rented from the office to the place of rest.

You have information about the number of employees who need a bus. In addition, some groups of employees would like to occupy seats in the same row of seats in the bus. The bus, which provides the transport company, in the same row are six seats. Before ordering, it is required to determine how many rows of six chairs are needed to accommodate all groups of employees as desired.

Input format The first line contains a single integer (1 ≤ n ≤ 100) - the number of groups of employees. The next line contains n integers a _i , separated by spaces - sizes of groups (1 ≤ a _i ≤6) .

Output format Print one number - the minimum number of rows of six seats that can accommodate all groups according to the wishes.

Solution of Problem B

The general idea: if the number of places in a row does not exceed six, a simple greedy algorithm works.

Obviously, each group of size 6, 5 or 4 must be placed in a separate row. Groups of size 3 should be as tight as possible: fill

rows in pairs of triples, and, perhaps, one of the remaining three to put in a separate row. Indeed, suppose that there are two rows in the optimal partition, each of which contains one triple. Then we swap one of the triples with those groups that sit in the same row as the second trio. After several such operations, we move all triples so that the described condition is fulfilled.

Now arrange the twos and single groups. If you arrange everything except for units, then after that the arrangement of units is determined unambiguously: we fill in the remaining free places in the used rows, and then completely fill in the new rows one by one. It is easy to see that deuces should be placed in the same way: as long as there are at least two places in one of the used rows, we will put another deuce there.

The described algorithm can be briefly described as follows: we consider groups in descending order of size, each placed in a row, in which there is less than the entire place, but enough for our group.

Problem C. The whole work

The author of the task and analysis is Oleg Khristenko. Coordinator of Yandex.Algorithm, Open Cup, snarknews project.

Given N - 1 fractions u_i / d_i , with 2 ≤ u _i ≤ N and 2 ≤ d _i ≤ N , all u _i are pairwise different and all d _i are pairwise different. Choose at least 1 and at most N / 10 fractions so that their denominators are powers of primes, and the product of all selected fractions is an integer.

Input format The first line contains one integer N (10 ≤ N ≤ 10 ⁶ ) . Each of the following N - 1 lines contains one fraction in the format u_i / d_i (2 ≤ u _i , d _i ≤ N, u _i ≠ u _j for i ≠ j, di dj with i j ). There are no spaces between the numbers and the “/” sign.

Output format Print two lines. The first must contain an integer M - the number of selected fractions. The following M lines must contain the selected fractions themselves, specified in the same format as in the input, one per line. Denominators of all selected fractions must be powers of primes, that is, have the form

where p _i is a prime and k _i is a positive integer. Fractions can be displayed in any order. One fraction can be selected no more than once.

If there are several solutions, output any. If there is no solution, output -1 in the first line.

Parsing task C

Note that for N from 10 ⁵ to 2 ∙ 10 ^{5 the} number of prime numbers not exceeding N is less than N / 10 .
We show that if you want to choose P (N) or more fractions, where P (N) is the number of prime numbers not exceeding N , then the problem always has a constructive solution.

We will choose as follows. First choose the fraction N / 2 . If N is even, the answer is received. If N is odd, choose a fraction with the smallest simple divisor of N as the denominator. If the numerator is even, the first two fractions are the answer. If the numerator is divided by the denominator, the second fraction is the answer. At each next step, we try to choose a fraction with the smallest simple divisor of the previous numerator as the denominator. If it turns out that this number was already in the denominator at the i- th step, then the section from the i- th fraction to the last selected one is the answer. Otherwise, select the appropriate fraction. In the worst case, we thus choose all prime numbers not exceeding N as denominators, after which repetition will inevitably happen.

Task D. Unite and conquer

The author of the analysis and tasks is Gleb Evroprop.

Not so long ago, Yandex employees patented an algorithm for optimal fragmentation of a large set of ordered data for transmission over the network. Now they are faced with the task of searching for an already fragmented pattern in fragmented data. It sounds difficult enough, but they are ready to challenge this task, especially with your help. We will not go into their motivation and immediately proceed to the formulation of the problem.

There is an initial sequence a _i , consisting of n numbers from 1 to 5, describing the sizes of consecutive parts in the fragmentation of the original data. There is also a sequence b _i consisting of m numbers from 1 to 5, similarly describing the sizes of consecutive parts in a sample fragmentation.

For one action, it is allowed to select any two neighboring elements in one of two sequences and combine them, replacing them with one element with a value equal to their sum. Thus, from sequences 1, 2, 2 in one operation, only sequences 3, 2 and 1, 4 can be obtained. After performing this operation in sequence, an element that is more than 5 may well appear. It is required to calculate the minimum number of actions that need to be done with these sequences so that the sequence b _i becomes a substring of the sequence a _i .

Input format The first line of the input contains two integers n and m - the number of fragments in the initial data partitioning and the number of fragments in the partitioning pattern for the search, respectively (1 ≤n, m ≤ 200,000) .

The next line contains n numbers a _i , each from 1 to 5 inclusive, specifying the sizes of the fragments in the initial data partitioning.

The last line contains m numbers b _i , describing the sizes of the fragments in the pattern partitioning in a similar format.
Output format Print a single integer - the minimum number of actions necessary for the second sequence to become a substring of the first. If it is impossible to achieve this, then output -1 .

Parsing task B

We will learn to solve for the beginning, in polynomial time, the problem of checking that there is some positive answer. Enumerate two boundaries in the original sequence. How to check that the sequence b _{i is} in principle possible to display a _{i i} in the selected fragment? Check whether the sum of all elements b _{i is} equal to the sum of the selected elements a _i . It is easy to see that this condition is necessary and sufficient, since without limiting the number of operations you can simply merge all the elements into one.

The algorithm described above has complexity O (n ² + m) . Note that since all elements are positive, the sum on the subsegment will be a strictly monotonic function on both boundaries. Using the method of two pointers, we solve the problem of checking the existence of a response in O (n + m) time .

Suppose we know that the sum of the elements of the sequence b _i is equal to the sum of the elements in some sequence c _i (the sub-segment a _i ). Determine the minimum number of merge operations needed to make these sequences the same. Consider an arbitrary positive number x such that both sequences have a prefix equal to x . One of these numbers x will match the first character in both sequences after all the joins have been completed. It is easy to see that if there are two different x ₁ and x ₂ (x ₁ <x ₂ ) that meet the above requirement, then the answer with the minimum number of unions (and therefore the maximum length) is never beneficial to start with x ₂ .

Indeed, since each element in the final answer corresponds to sub-slices in each of the sequences b _i and c _i , this sub-segment can always be divided into two, which will form the elements x ₁ and x ₂ - x ₁ . From this it follows that the optimal answer can be typed greedily, using all x such that both sequences have a prefix equal to x . The number of such x will determine the length of the response, and hence the number of operations.

We learned how to isolate O (n) interesting sub-splices and find the answer for each of them in O (n + m) time . This is still too slow. We calculate all partial sums in both sequences and mark them in the bitmap. Then the calculation of the number of matching elements is reduced to the calculation of the scalar product of two bit sequences. You can calculate the result of the dot product of one sequence with all the suffixes of the other using the fast Fourier transform. The final complexity of the solution is: O (cn · log (cn)) , where c is the upper bound on the elements a _i and b _i .

Problem E. Compact strings

The author of the task and analysis is Michal Foriszek. Finalist of Yandex. Algorithm 2013, author of ACM ICPC, IOI.

This task deals with strings of lowercase Latin letters (“a” - “z”). A string is called compact if there are only the same letters between any two identical letters in this string. For example, the strings “yandex”, “aacccb”, and “eeeee” are compact, and the strings “aba” and “aazbbzc” are not.

You are given a pattern consisting of lowercase Latin letters and question marks ("?"). Each question mark represents one arbitrary lowercase Latin letter. Find the number of different compact strings that match the specified pattern. Since the answer can be very large, output the remainder of dividing it by 10 ⁹ + 7 .

Input format The first line of the input contains an integer t - the number of test cases (1 ≤ t ≤ 100) . Each of the following t lines defines one test case, which is a pattern of at least 1 and not more than 10 ⁴ characters inclusive. The template can consist only of lowercase Latin letters and question marks.

Output format For each test case, output a single integer - the remainder of dividing the number of corresponding compact lines by 10 ⁹ + 7 .

Parsing task E

First, consider a simpler task: given a pattern, is there at least one compact string that corresponds to it? Such a problem is simply solved in linear time: you need to remove all question marks and check whether the resulting string is compact. Accordingly, the first step of our decision will be such a check; if it fails, output 0.

After that we will make some pre-submission that will help us in the future: whenever we see a string of consecutive question marks, to the right and left of which there is the same letter (for example, z ??? z ), we notice that question marks can be uniquely disclosed (will be equal to this letter itself), so that in the context of this task this line can be replaced by exactly one such letter (that is, the fragment z ??? z changes to z ).

Further we will use the following notation:

N to denote the length of this pattern (in this problem no more than 10 ⁴ );
S to indicate the size of the alphabet (in this case, 26);
U to indicate the number of letters of the alphabet that are not found in this pattern.

If U = 0 , then the problem is solved simply. For each block of consecutive question marks, it remains unknown only where the first letter ends and the second begins.

For example, in block a ???? b , question marks can be replaced with a few (possibly zero) letters a at the beginning, and the remaining question marks will be replaced with letters b . The number of options for each block is thus equal to the number of question marks plus one, and since the blocks are independent, the total number of options is equal to the product of these numbers.

In the general case, when U ≠ 0 , additional options appear: we need to determine which of the letters that are missing in the template are present in the summary line and where. Note that each of these letters must also form a continuous substring of the result string, so that if such a letter meets somewhere, then all its occurrences must belong to the same block of consecutive question marks.

There are several ways to effectively count the number of such lines. One of the possible approaches uses a fairly standard way: we will try to recursively generate all these lines, adding one letter at a time, then we will replace the generation with a count with memoization. In case the corresponding actions are done correctly, as a result we will get a memorable recursive function that will respond to any request in O (NU) time .

The recursive function will take the following arguments:

The number of characters in the template that remains to be processed.
Number of letters not yet used.
Type of last character processed.

The type can take one of four values:

fixed : the last processed character was a letter, which we put in a string, because they must.
past : the last character processed was a question mark, and we decided to replace it with the last letter that was processed.
future : the last processed was a question mark, and we decided to replace it with the next letter, which will occur in the template in the future (since our task is only counting, we don’t need to know what letter it is).
unused : the last character processed was a question mark, and we decided to replace it with a letter that had not been used before.

Knowing the type of the previous letter, we know what opportunities we have for the disclosure of the next question mark. For example, if the type was future , then the future type will be for all letters from the block.

In addition, there is another approach that gives a slightly faster solution. Note that after initialization (checking for the existence of a solution and replacing all trivial blocks with one letter), we can receive no more than S + 1 blocks of consecutive question marks: one can be at the beginning of a line, and before starting any of the other blocks, some is a letter. After initialization, all these letters will be different in pairs. Thus, the number of blocks cannot exceed the number of letters in the alphabet by more than 1.

Now we can calculate all the compact substrings that match the patterns using combinatorial methods. For this, we use a recursive function that will process blocks of consecutive question marks. Thus, the function will have two arguments: the number of blocks that remains to be processed, and the number of letters not currently used.

Processing the new block, we will consider all the options for the number of unused letters that we want to use in this block at least once. Thus, the answer will be equal to the product of the number of ways to select a specific set of unused letters (binomial coefficient) and the number of ways to arrange the already selected letters (factorial) and the number of ways to choose places where the chain of letters of one type ends and the other begins (again, the binomial coefficient) on the number of ways to fill the remaining blocks using the remaining unused letters (the result of a recursive call).

Thus, we get a solution that uses O (NU) time to build a table of binomial coefficients for a given module and responds to a single request in O (SU ² ) time .

Problem F. Random points

The author - Ivan Kazmenko. Graduate of St. Petersburg State University, author of the tasks of the championships of St. Petersburg State University and the Open Cup.

Consider the following linear congruential pseudo-random number generator. The generator has a state - an integer s (0 ≤ s <2 ³² ) . Every time when another pseudo-random integer is required in the interval [0, X) , the generator performs the transformation S _new = (a ∙ S _old + c) mod2 ³² . After this, S _new becomes the new state of the generator, and the result is the number

.

In this task, a = 134 775 813 and c = 1 — such values are used in the built-in pseudo-random number generator in Borland Delphi 7. This generator is used to obtain n random points on a plane whose coordinates are integers from the interval [0, 100 000 000 ) . Consider two ways to generate points.

In the first method (codename RAW), the generator is driven to some initial state s , after which it successively generates 2n pseudo-random numbers a ₁ , a ₂ , ..., a _2n in the desired interval. After that, n random points are given by the following coordinate pairs: (a ₁ , a ₂ ), (a ₃ , a ₄ ), ..., (a _{2n - 1} , a _2n ) .

In the second method (codename SHUFFLED), the beginning of the process is the same: the generator is driven to some initial state s, after which it successively generates 2n pseudo-random numbers a ₁ , a ₂ , ..., a _2n in the desired interval. However, then these _2n numbers are rearranged randomly using the following version of the Fisher-Yeats algorithm (pseudo-code):

for i = 1,2, ..., 2 ∙ n:
k = random [0, i) + 1
swap (a _i , a _k )

Here, random [0, i) is a pseudo-random integer from the interval [0, i) for which generation the same generator is used, and swap (a _i , a _k ) swaps the numbers a _i and a _k ; if i = k , nothing happens. Before starting the execution of the pseudo-code, the generator is in the state in which it was after generating the numbers a ₁ , a ₂ , ..., a _2n

After the numbers a ₁ , a ₂ , ..., a _{2 n} are rearranged, n random points, as in the first method, are defined by the following coordinate pairs: (a ₁ , a ₂ ), (a ₃ , a ₄ ), ..., (a _{2 n - 1} , a _{2 n} ) .

A sequence of n points is specified (10,000 ≤ n ≤ 100,000) . It is known that it is obtained by one of the two methods described above. In this case, it is unknown which method was chosen, as well as the initial state s . Find out which point generation method was used.

Input format The first line contains an integer n - the number of points (10,000 ≤ n ≤ 100,000) . The following n lines contain point descriptions, one per line. Each point description consists of two integers separated by a space — its x and y coordinates (0 ≤ x, y <100 000 000) . It is guaranteed that the points are obtained in one of two ways described in the condition.

Output format
Output "RAW" if the first method of generating points was used, and "SHUFFLED" if the second method was used.

Solution of Problem F

General observations. It is easy to show that the random numbers generated form one cyclic sequence of 2 ³² numbers. The only thing that depends on the initial state is which element of this cyclic sequence will start generating.

The first decision. This solution is probabilistic, that is, it gives the correct answer with a certain probability close to unity. The basic idea is to come up with some property that is most likely inherent in only one of the two ways of generation, and then check that this property is either satisfied, or with high probability is not met.

Here is a simple example of such a property. Fix a small number k (for example, k = 10 ). Let the coordinates of n points be generated sequentially. Then any continuous subsequence of points of length k is also obtained by sequential generation of random numbers, starting from a certain state. And when generating n points with coordinate mixing, the probability that some kind of continuous subsequence of points of length k is obtained by sequential generation of random numbers is very small.

The solution testing this property may look like this. t (, t = 10 ⁶ ). t : k . , k n . , , . , , .

, k t 10 ^{- 10} .

O(k) O(k log n) , , O(1) (-) O(logn) ( ). O(n + tk) - O(n log n+tk log n) .

. , . , (x ₁ ) . ,

. , s

. 43 s , x ₁ _. 2n - 1 : y ₁ ,x ₂ , y ₂ , ..., x _n , y _n . - s , , . s — , .

, : s , . .

Source: https://habr.com/ru/post/264253/

All Articles