Analysis of tasks from the International Olympiad in Informatics IOI 2016

In August of this year, the International Programming Contest for Schoolchildren - IOI 2016 was held in Kazan. The Russian team was second in the overall standings .

One of the silver medalists, Denis Solonkov from Mytishchi, made an analysis of the “Molecular Detection” task, which was offered to the participants of the Olympiad.
')
Denis Solonkov is a multiple winner of the All-Russian Programming Olympiad and the Moscow CTF School, a graduate of the School of Programmers, now a HSE student.

As one of the teachers of Denis, I asked him to do a task analysis with IOI 2016.

The task

Peter works for the company that created the device for detecting molecules. Each molecule has a whole positive weight. The device is characterized by a detection interval [l, u], where l and u are positive integers. A device can detect a multitude of molecules if and only if this set contains such a subset that the total weight of the molecules in it belongs to the detection interval of the device.

Watch full

More formally, we consider n molecules with weights w0, ..., wn − 1. A detection is considered successful if there are many different indices I = {i1, ..., im} such that l ≤ wi1 + ... + wim ≤ u.

Due to the nature of the device, the difference between l and u is guaranteed to be greater than or equal to the difference in weights between the heaviest and lightest molecules. More formally, u - l ≥ wmax - wmin, where wmax = max (w0, ..., wn − 1) and wmin = min (w0, ..., wn − 1).

It is required to write a program that either finds any subset of molecules with a total weight belonging to an instrument detection interval, or determines that such a subset does not exist.

Implementation details

You should implement one function (method):

int [] solve (int l, int u, int [] w)

l and u: detection interval boundaries,
w: molecular weights.

If the required subset exists, then the function must return an array of molecular indices that form any such subset. If there are several correct answers, return any of them.

If the required subset does not exist, then the function should return an empty array.

For the C programming language, the function signature is slightly different:

int solve (int l, int u, int [] w, int n, int [] result)
n: number of elements in w (i.e. number of molecules)
the remaining parameters are the same as described above.

Instead of returning an array consisting of m indices (as indicated above), the function should write the indices into the first m cells of the result array and then return m.
If the required subset does not exist, then the function should return 0, without writing anything to the result array.

Your program can write indices to the returned array (or to the result array for C) in any order.

Please use the provided file templates to clarify the implementation in your chosen programming language.

Examples

Example 1
solve (15, 17, [6, 8, 8, 7])
In this example, there are four molecules with weights of 6, 8, 8, and 7. The device can detect subsets of molecules with a total weight of 15 to 17 inclusive. Please note that 17 - 15 ≥ 8 - 6. The total weight of molecules 1 and 3 is equal to w1 + w3 = 8 + 7 = 15, thus the function can return [1, 3]. Other possible correct answers are [1, 2] (w1 + w3 = 8 + 8 = 16) and [2, 3] (w1 + w3 = 8 + 7 = 15).

Example 2
solve (14, 15, [5, 5, 6, 6])
In this example, there are four molecules with weights of 5, 5, 6, and 6. It is required to find a subset with a total weight of 14 to 15 inclusive. Again, note that 15 - 14 ≥ 6 - 5. For this example, there is no subset of molecules with a total weight from 14 to 15, respectively, the function should return an empty array.

Example 3
solve (10, 20, [15, 17, 16, 18])
In this example, there are four molecules with weights of 15, 17, 16, and 18. It is required to find a subset with a total weight from 10 to 20 inclusive. Again, note that 20 - 10 ≥ 18 - 15. Any subset consisting of one element has a weight from 10 to 20, respectively, the possible correct answers are [0], [1], [2] and [3].

Grading system
(9 points): 1 ≤ n ≤ 100, 1 ≤ wi ≤ 100, 1 ≤ u, l ≤ 1000, all wi are equal.
(10 points): 1 ≤ n ≤ 100, 1 ≤ wi, u, l ≤ 1000, and max (w0, ..., wn − 1) - min (w0, ..., wn − 1) ≤ 1.
(12 points): 1 ≤ n ≤ 100 and 1 ≤ wi, u, l ≤ 1000.
(15 points): 1 ≤ n ≤ 10 000 and 1 ≤ wi, u, l ≤ 10 000.
(23 points): 1 ≤ n ≤ 10 000 and 1 ≤ wi, u, l ≤ 500 000.
(31 points): 1 ≤ n ≤ 200000 and 1 ≤ wi, u, l <231.

Sample Tester Module
The validator receives data in the following format:

Line 1: integers n, l, u.
Line 2: n integers: w0, ..., wn − 1.

Decision from Denis

The task is already well formalized. We need to choose a subsequence of numbers from an array w of length n, so that their sum lies in the interval [l, u]. Also, there is a strange condition: u - l ≥ wmax - wmin (1).

It obviously plays a role in solving the problem, otherwise it simply would not exist. Let's try to understand what it gives us.

Suppose we choose a subsequence of elements with sum = S and S <l. In this case, if we replace one of the elements of the subsequence by some other, not chosen by us, then the new sum S '≤ u. A case that increases the sum by the maximum possible number is the replacement of the minimum by the maximum element, and according to the condition u - l wmax ≥ wmin.

Armed with this fact, we proceed directly to the solution. First, sort the array w so that it is more convenient to work with it. We divide our task into two points.

Determine L - the number of elements in the correct answer, or say that it does not exist.
Find L elements, the sum of which lies in [l, u]

First, let's look at the second paragraph. Suppose we know that the answer exists and contains L elements. Take the first L elements of the sorted w. Their sum is exactly ≤ u, since we agreed that the answer exists. Now we will execute the following algorithm.

If the current amount is greater than or equal to the lower limit, terminate the algorithm
Replace the minimum element from our sample with the maximum element that has not yet been selected. In the event that such a replacement does not increase the amount, then terminate the algorithm without performing it.

Note that due to (1) we will never cross the upper boundary. Why does this algorithm exactly find the answer? Suppose the opposite.

In each iteration, the algorithm changes the minimum element of our sample to the maximum one not selected. If such an element is less than ours, then this means that we have chosen the L last elements of w. If the sum of L maximal elements is less than l, then there is no suitable subsequence of length L, however, we agreed on the opposite. Contradiction.

Such an algorithm can be quickly implemented using the built-in C ++ implementation of a balanced search tree: set. The number of iterations of the algorithm in the worst case: n, the asymptotics of such a solution will be O (n log n).

Let us return to the first stage of the solution of our problem, namely, in the search for a suitable L. We can simply check each of them and get the asymptotics O (n2 log n), but this will work too long.

Let's find a length i such that the sum of the first i elements is greater than u. If such a length does not exist, then we assume that i = n + 1 Of all such, we take the minimum. I contend that if the correct answer exists, it will contain i - 1 element. Of course, there may be other answers, but one of them will contain so many elements. Let us prove this statement.

Let L = i - 1, or L = n, if i does not exist. The sum of the first L elements ≤ u, since L <i. Run the subsequence search algorithm described above. If he finds the answer, then our statement is true. Otherwise, the sum of the last L elements <l. Then the answer also does not exist for any length less than L. Just because the L max was taken, so any less quantity will also be less than l. The answer also does not exist for all lengths greater than L, since the sum of the first i elements is greater than u. Hence, the answer does not exist.

So we can find the required length in O (n), and find the answer in length in O (n log n). This means that the running time of the entire algorithm will be O (n log n), which is quietly gaining 100 points.

Solution Source Code

#include <algorithm>
#include <vector>
#include <set>

#include "molecules.h" // Including grader file

using ll = long long ; // A little alias to save time.

using namespace std ;

vector < int > find_subset ( int l, int u, vector < int > w ) {
int n = w. size ( ) ;
vector < pair < int , int >> weight ( n ) ;
for ( int i = 0 ; i < n ; i ++ ) {
weight [ i ] = { w [ i ] , i } ; // Storing initial index for the output.
}
sort ( weight. begin ( ) , weight. end ( ) ) ;
ll cur_sum = 0 ;
int bad_i ;
for ( bad_i = 0 ; bad_i < n ; bad_i ++ ) { // Locating first bad length
cur_sum + = weight [ bad_i ] . first ;
if ( cur_sum > u )
break ;
}
if ( bad_i == 0 ) // no solution
{
return vector < int > ( ) ;
}
set < pair < int , int >> picked, remain ;
ll curpicked = 0 ;
for ( int i = 0 ; i < bad_i ; i ++ ) {
picked insert ( weight [ i ] ) ;
curpicked + = weight [ i ] . first ;
}
for ( int i = bad_i ; i < n ; i ++ ) {
remain. insert ( weight [ i ] ) ;
}
while ( curpicked < l && ! remain. empty ( ) ) {
if ( picked. begin ( ) - > first > = remain. rbegin ( ) - > first ) // Nothing left to swap
break ;
// Swap the lowest and highest elements.
curpicked + = remain. rbegin ( ) - > first - picked. begin ( ) - > first ;
auto el1 = * picked. begin ( ) ;
auto el2 = * remain. rbegin ( ) ;
picked erase ( el1 ) ;
picked insert ( el2 ) ;
remain. erase ( el2 ) ;
}
if ( curpicked < l ) { // no solution
return vector < int > ( ) ;
}
vector < int > answer ;
for ( auto el : picked )
answer. push_back ( el. second ) ;
return answer ;
}

Solving such problems requires serious training, which can be obtained from the School of Programmers . This year, ShP opens a new branch - in the Physical Park, IT-park near the MIPT. The training program is complicated, the same as in the department at Yandex , which includes the study of high-level and low-level languages, discrete mathematics, algorithms, data structures, networks, operating systems and so on.

The recruitment for the 2016/17 academic year is conducted for students in grades 6–10 based on the results of the exam, which will be held on October 8, 2016 at 14:00 . Register for the exam and take a preparatory online course here .

The School of Programmers also has an online department. The lessons are held in the format of webinars, students from all over Russia are accepted, starting from 13 years old. The nearest entrance exam will be held on October 15 at 17:00 . You can register for the exam at the School of Online Programmers and take a preparatory course here .

Source: https://habr.com/ru/post/312018/

All Articles