I continue the story about how we did the SQL Olympiad . This is a continuation of the previous article, in which everything just did not fit.
Summary of the previous series: two absentee rounds of the Olympiad took place in December 2016 and March 2017, respectively, where the contenders for the victory underwent a rigorous selection both with theory and with the practice of using SQL in Oracle databases. Next about the third round - the face-to-face finals of the Olympiad in Sochi in early June 2017.
Disclaimer Just in case, I will repeat what my role was, and then after the first part of the article it turned out to be somewhat less obvious. I myself am not the organizer of this Olympiad: announcements and advertising of the event, the search for sponsors, work with universities, the organization of events in offline - all this was done by other people. Our company was a sponsor and prepared several contests at the Olympiad, and personally I was preparing the technical (and only technical!) Part of the competition in SQL. That is, I prepared the rules of the stages, prepared and checked the tasks. It would seem not so much, but this is the very thing for which everything else is actually done. So, too, I will not be too shy.
To make it clear what this is all about here, here's a report on the finals of our 2016/17 Olympiad on the official website .
Link to the photo album from the Olympiad site, and then finding it there is somehow not obvious.
At the finals, the organizers convened us and the participants in Sochi. Not a bad place. The end of May was originally planned, but the massive celebration of school graduations led to difficulties in organizing our events. I had to move for a week on the first days of June.
But first things first. First, about the preparation.
I repeat that the initial layout of the tours was as follows: the first round was a hard test filter according to the theory, the second was the practice of magic problem solving, the third was a blitz. Since in-person mode participants will have to solve problems in real and very limited time, then no matter how cool you will be blitz. And blitz is certainly not the best way to determine the best programmer. And "not the best" - this is an understatement. But nothing more fair to think failed.
For some time I tried to come up with some kind of fixed idea to which the whole final would be devoted, so that each solved task would advance the participant forward, opening new tasks, and the number of these tasks would increase with each already solved one, but the idea would not took shape in nothing real. Could not catch the tail.
Time was running out and stopped at such a scheme. All tasks are encrypted in the database. The solution of each problem allows deciphering the next task and so on along the chain. Then everything is simple: the one who moves the farthest away will win. If two advanced equally far in tasks, then the one who was the first won. Decryption of tasks made a stored function, which at the same time logged access to it. This allowed to monitor the situation in real time. Also, the decryption function was specially made computably complex, so that one of its launch worked for a couple of seconds. This is to exclude the possibility of solving tasks brute force. We also made sure that in the prepared data of these very brute force options, if anything, was more than enough.
Such a scheme for the finals has a number of undeniable advantages. First, the results can be monitored in real time. Secondly, all participants are in absolutely equal conditions, there are no grounds for appeals. The main disadvantage follows exactly from these very merits - if suddenly the brain is “stuck” and the solution is stalled, then there is no further to go. For blitz, this is of course not quite healthy, because it is annoying, and all participants will face this in one form or another. I can not even imagine such a person who would never “get stuck” on such tasks.
But I really wanted to try just such a format. The alternative is clear: tasks, points for each, automated verification of results - those who scored more points, he won. I do not like this approach, it is also clear that - balancing the weights of the tasks (points) in the final result. It is very easy to miss and the results can suddenly be unfair. And there is no opportunity to debug these weights, the final is the only one. If once again it happens to hold the Olympiad, I will still think about how best. Now with the experience.
So, with the approach decided, you need to prepare tasks. Again, I wanted to have tasks, how to put it in words, non-olympic. That is, their solution should not require special knowledge of number theory, special algorithms or something else like this. It would be desirable for the solution of each task to be aware of the condition, to imagine a declarative way of expressing the solution and implementation without any special problems. Well, to look like the tasks of the second round, only simpler. In the second round, the target time for solving the problem was a day, and for the third - half an hour. As a result, the task was hard to invent. It is precisely because of this balance that it is not too hard to decide, but at the same time it is not trivial. We also do not forget about such a feature that in the tasks you need to get the correct result, and the solution can be arbitrarily florid, even from the ceiling, write down the answer. We didn’t control the solution anywhere; only the tasks were compiled in such a way that the easiest way to solve them was in SQL.
In order not to be bored, in the tasks for the hike we also introduced the participants to the interesting vocabulary objects of the oracle base. For those who do not know, in modern databases there is a dictionary in which there is all the meta information on the contents of the database: which tables have it, with which fields, which procedures are functions and a lot of other useful information, to the extent that right now and how in the database happens. For the same reason that it was more fun, the texts were designed in the style of Gregory of Oster.
That's what happened. Indication of topics at the end of each task was not part of the condition, these are our working notes, here they are relevant.
The answer to the most important question about life, the universe and all that is the key to decipher the next task.
Answer: 42
Solution: the solution to this problem takes too long (approx. 7.5 million years), you just need to know the answer.
Admin Vasily created a table in the database. He liked it, and he created another one. And so another 400 times. As he later noted, the tables turned out to be all different, but two of them looked alike a little less than completely, differing only in the name of one field. Find and you two tables created by Vasya, which differ in exactly the same field name. Table names, written one after another in lexicographical order, are the key to decipher the next task.
Hint: use the system view DBA_TAB_COLUMNS, the Vasin tables are in the VPOUPKINE schema.
Topics: Dictionary (DBA_TAB_COLUMNS), subqueries, aggregation
After Vasya created the tables, the user Klava proceeded to complete them. And I filled one table from the previous task with some Very Important Data. But, at some point she was distracted by conversations with a colleague and, automatically closing the program window, lost the record in which she entered the data. The last thing she remembered is that in this record one of the fields was filled with a string that matches the name of the field. Admin Vasya managed to find Klava this record and help her to continue her hard work.
It is necessary to find a line in the table from the answer to the previous task, in which the value of one of the fields (no one knows what) matches the field name. The name of this field and the ROWID of this record, written down one after another, are the key to decrypt the next task. Hint: Use the DBA_TAB_COLUMNS view.
Topics: Dictionary (DBA_TAB_COLUMNS), ability to generate queries by other queries
Solving the previous problem, Vasya thought. Two hours later, he realized the idea that he now knows how to use mental cofg-fu and it does not cost him to search for arbitrary values ​​in arbitrary fields of arbitrary tables. And even more. For example, along the way, numeric fields can be added. It so excited him that he even sweated. And Vasya decided to search among his tables for one in which the sum of all the numeric fields is, say, 12345. And such a table was found! ROWID of this line is the key to decrypt the next task.
Topics: Dictionary (DBA_TAB_COLUMNS), ability to generate queries by other queries
While Vasily was rummaging through the tables, his colleague Zhenya Smith secretly decided to make his base with abacus and scheherazad. For a start, he, following Vasya's trail, has meditated on the creation of tables a thousand and one times. Then he thought what to do next. Vasin brought a voice out of his trance:
- And the names of these two tables of yours are obtained from each other by a cyclic shift of letters!
- What?
- Well, look, if the beginning of the name of this table is transferred to the end, then you’ll get exactly the name of that table.
- Where?
Vasya showed. Tables are in the JSMITH schema. The names of these two tables, written one after another in lexicographical order, are the key to the next task. An example of a cyclic shift of two letters: ABCDEF -> CDEFAB. You can use the DBA_TABLES view.
Topics: dictionary and string functions, DBA_TABLES
Finally, Vasya condescended to the needs of the leadership and agreed to make a report, which he had been asked for for two months. The report required to deduce the structure of the organization from table SQL2017.ORG_STRUCTURE as follows. Print the names of all employees in alphabetical order. If one of the employees has subordinates, then indent under the head and indent all his subordinates with indentation again, in alphabetical order. If someone from the subordinates has his subordinates, then again indent and deduce subordinates in alphabetical order, and so on. Here is an example:
... ------------------------ -------- ----
The subordination of employees is set by the field MANAGER_ID, which refers to the manager ID, the names of the employees are in the NAME field.
Having made a report, Vasya noticed that if it was launched for Charlotte Petrovna (ID = 716), the ladies, to whom Vasya himself was obviously not indifferent, and look in the HINT field, then this field opens the key to the next task.
Topics: hierarchical queries, sorting with SIBLINGS
Once Vasya and Zhenya started a dispute about independence. And Vasya even made one independent judgment, but at the checkpoint he was asked for an invoice to take away and had to bring the judgment back and put it in the corner of the room. And the dispute continued with a new force, but now about dependence. Dependencies were searched for where it was necessary and where it was not necessary, and in the end they found it in their favorite database, and in quite large numbers.
It quickly became clear that if a database object is dependent on another, then DBA_DEPENDENCIES has a record about it: OWNER.NAME depends on REFERENCED_OWNER.REFERENCED_NAME. The chain of dependencies can continue even further if someone also depends on the next object in the chain.
Independent objects are of no interest, nothing depends on them anyway. But to find the most dependent was an interesting thing. And even more interesting - to find the most "influential", on which all others depend. The name of the object on which the greatest number of other objects depends and the number of these objects written one after another is the key to deciphering the next task.
In order to protect the result from the actions of Vasya and his colleagues sitting next to you in the classroom, limit this task to the LOL scheme. It is believed that the object also depends on itself. The names of all objects in the schema are unique. The body and package specification must be counted as one object.
Topics: DBA_DEPENDENCIES, recursive queries.
Kohl Yok also sometimes worked for the needs of the company, although many thought it would be better
he did nothing. For example, he wrote the notorious report EJEKP.JMACCXBXUK,
which is constantly breaking down, and according to this report, the lunar weather center has long been
already made my own weather forecast. So once again the report stopped showing
the required data, and the general abraham was announced: no one from the workplace disperses
until the report runs. Unlike Kolya, who frantically tried
to analyze the work of the report, Vasily decided to go another way. And instead
look for who broke the EJEKP.JMACCXBXUK procedure, Vasya decided to prove his own
innocence, demonstrating that he personally did not change anything in the database,
what could affect. For this, he found a list of procedures and functions (more
subroutines) on which EJEKP.JMACCXBXUK does not depend. And all Vasin corrections
code fall into this list.
Assumptions:
Dependencies in DBA_DEPENDENCIES are detailed to the object level. In the same
We will be interested in the dependencies between the subroutines of packages. Let be
subroutine S1 of package P1 calls subroutine S2 of package P2. Then we say
that P1.S1 depends on P2.S2. We are interested in having a call in the code, not the possibility
make this call. for exampleIF FALSE THEN P2.S2; END IF;
We will treat as a challenge P2.S2.
Find all the subprograms of the KEK schema packages on which EJEKP.JMACCXBXUK is independent.
This requires analyzing the source codes of the packages using the table
DBA_SOURCE. We assume that all dependencies can be found from the source code.
packages of this scheme and that subroutines of other schemes are not called. No packages
comments. Package names and subroutines will always be located when calling
on one line with no spaces between them. The keywords PROCEDURE and FUNCTION and
their names will also be on the same line. Package and Subroutine Names
will be different from the names of all other objects. Only packages are advertised.
subroutines are forward declarations.
The answer will be a string of the names of the procedures and functions found, ordered
lexicographically and glued in one line separated by commas. Do not forget local
subroutines declared in the package body, but not in the specification!
Topics: DBA_SOURCE, recursive queries.
While Vasya worked for the needs of the company, Zhenya did something enthusiastically, occasionally casting mysterious glances at Vasya. Finally, unable to bear it, Vasya asked:
- Well, what have you got there?
Zhenya explained:
- Abacus do. Look, there are numbers in my JSMITH.ABACUS nameplate. I select one number, then arbitrarily the second (maybe even from the same record) and add them. Now I take the third number (maybe also from already taken records) and the fourth as well, and I multiply these two by each other. And sometimes such sets of four numbers come across that the sum of the first pair is equal to the product of the second pair. And you know how many such sets were found?
- I don’t know yet, but now I’ll find out. It's elementary! - said Vasya, recalling his past exploits for Klava K.
But having written the request and having twisted for half an hour the inscription "Please wait", Vasya looked into V $ SESSION_LONGOPS and other similar places and realized that he would not have time to receive an answer this way until the end of the working day. The request was stopped. Having estimated something in mind, Vasya acted differently. And after a couple of minutes, he knew the answer. Find this answer.
The number of combinations of four numbers from the JSMITH.ABACUS table, where the sum of the first pair is equal to the product of the second pair, is the key to deciphering the next Task. All possible sets are defined as follows: in the first place - each of the numbers in the table, in the second place - each of the numbers in the table, in the third and fourth places - the same. For example, if the table had 10 numbers and all of them were zero, the answer to the problem would be 10 * 10 * 10 * 10 = 10000.
Topics: creation of additional tables, optimization (the difference between nested loops and hash join).
Vasya knocked Zhenya on the shoulder and said:
- And I made my abacus! And you know what I counted on it?
- Like me in the previous problem, the number of sums coinciding with the works?
- No, - Vasya grinned, - I acted easier. I counted how many sets of four numbers there are, where the sum of the first two is equal to the sum of the last two.
- So this is even easier! .. - Eugene exclaimed.
- And you try, - Vasya grinned.
And as the water looked, it was not so simple.
The number of combinations of four numbers taken from table VPOUPKINE.ABACUS, where the sum of the first pair is equal to the sum of the second pair, is the key to the next task.
Topics: logic, optimization (rewriting a query).
In the department where Vasya works, there are many open tasks to which no one has yet
proceeded. Vasya really wants to go on vacation as soon as possible, but for this he must decide
at least K of these tasks, so that KPI does not subside. Vasya represents the approximate time
solutions to each problem and wants to take in the development of K tasks, and the remaining tasks to give
other programmers. Assessing the time required to solve these K problems, Vasya decides to agree on the start of his vacation, so that he can buy tickets now. Vasya’s boss was very prudent and decided that Vasya could not have time to complete the last of his chosen tasks, and then he would have to transfer this partially solved task to another developer, which would take additional time. Vasya’s vacation was moved forward to provide this time reserve. Tasks have priority and must be performed by the developer in descending order of priority. The TASKS table contains task data. For each task, the name is NAME, the department number is DEPARTMENT_NUMBER, the priority is PRIORITY (the higher the number, the higher the priority), the estimated EXPECTED_TIME duration and the estimated transfer time of the unresolved task to another ADDITIONAL_TIME developer. Vasya chose exactly K from these tasks so as to go on vacation as early as possible. Those. the sum of the estimated durations of these K tasks and the estimated transmission time of the last (by priority) of these K tasks was the lowest possible.
There is more than one department in the organization, and each department has its own “Vasya”. In the table
DEPARTMENTS shows data by department: department number DEPARTMENT_NUMBER and
the number of tasks K that “Vasya” from this department should solve. Notice that
departments are completely independent in the sense that each “Vasya” can choose tasks
only his department.
It is required to determine when the vacation of each of "Vas" begins, given that
they chose tasks optimally. In other words, Vasya from the DEPARTMENT_NUMBER department will go on vacation in S days, where S is equal to the sum of the durations of the tasks selected by Vasya plus the time for transferring the last task to another developer. As a key, use the found numbers S, separated by commas without spaces, in ascending order of department numbers DEPARTMENT_NUMBER. Tables are in the VACATIONS schema.
Topics: sorting, logic.
Many years later. During this time, mankind stepped over the technological singularity, and Vasya still worked in the same company. But now Vasya could solve thousands of tasks in milliseconds, since his brain was already connected to a supercomputer with artificial intelligence. Despite all the innovations, Vasya still has a problem with “leaving early for vacation”. Since that time, a lot of statistics have been accumulated, and it was found that the lower the priority of the task, the less additional time is required to transfer it to another developer. It was also found that the probability that within one department there can be two tasks with the same execution time is negligible.
You need to do the same as in the previous task, with the change,
that the data now lie in the MEGAVACATIONS scheme, in the DEPARTMENTS and TASKS tables.
Consider that Vasya can now solve millions of tasks. Also note that the input
data satisfy the condition: for tasks TASK_i != TASK_j
from one departmentTASK_i.EXPECTED_TIME != TASK_j.EXPECTED_TIME
and from TASK_i.PRIORITY <= TASK_j.PRIORITY
follows TASK_i.ADDITIONAL_TIME <= TASK_j.ADDITIONAL_TIME
.
Topics: sorting, logic.
Oops! .. And this is no longer a task. This is, in fact, the finish. Congratulations!
You can lean back on the stool, smile, and gently wave the hand to the organizers so that they too will rejoice. But, mind you, without distracting other participants!
A couple of comments on the tasks. In fact, the finalist tasks were prepared six pieces, plus two bonus of increased complexity, if someone decides the main six. Tasks were arranged in order of increasing complexity. How to measure this complexity is incomprehensible, so maybe where they missed slightly. Plus, another variation of the second bonus problem with even more increased complexity, if suddenly there is a monster who will solve all previous tasks, along with bonus ones.
That is, in the first six tasks there are absolutely no dirty tricks; everyone who writes in SQL can solve them only if they have enough time. Yes, some additional knowledge and skills allow you to solve some of these tasks faster. In bonus tasks already perverted. In the first, this is a hefty amount of creatively analyzed data, in the second, this is optimization. Moreover, the difficulty lies precisely in the fact that it is necessary to guess that optimization is needed here, because the “head-on” solution does not work, it was so tailored. In an additional bonus task, algorithmic optimization is needed already, because the same method as in the previous task no longer works in it.
Further, to ensure that no one in the allotted time passed to the end, we set two truly olympiad tasks, because they were invented and it was a pity to throw them into the basket. And the cap at number 12 is really a congratulation for supermen, if there are any. Of course, we didn’t believe this, but as mathematicians and programmers by profession, we had to consider such a case.
Total expected that the participants will solve a maximum of six problems and get stuck at 7-8-9. Or earlier. The worst thing would be if all of a sudden stuck on the very first task. To them (the participants) that - "three hours of shame, then as usual", and we would frantically rush by what criterion to determine the winners. Probably it was necessary to protect yourself and the first item to set the task to some SELECT ROWID FROM DUAL
, so that one task was exactly solved. However, the calculations were correct, the first task was already decided.
Decryption of the task is done as follows:
SELECT sql2017.decrypt('---', encrypted_text) FROM sql2017_tasks WHERE task_number = 2;
In the process of preparing tasks for official purposes, a table generator, generators of labyrinths and correct arithmetic expressions (these are for the second round), a packet generator with code and required parameters for mutual calls, a simple human name generator were written.
Something as usual did not have time. For example, I could not think of how to give a quine task in the chosen format.
Now from the labyrinths of "pure mind" back to mortal earth.
Sochi greeted us with summer heat and greenery. Check in was on Thursday, on Friday the finals themselves in all nominations. Then a weekend with a cultural program and on Monday awarding and awarding of prizes.
We brought our server, with a prepared base. The participants' places were also brought with them (though not themselves, but our colleagues). It was decided that to create the same conditions, everyone would have the same pre-configured jobs. Personally, I was afraid of possible problems with connecting any exotic zoo, if you allow participants to use their computers. There is someone Krivorukov and stubborn - the trouble will be above the roof. So they brought their identical laptops. We put trustworthy SQL * PLUS on them for the fans (in fact, SQLcl) and SQL Developer, as a free GUI development tool from the manufacturer. I don’t remember that someone from my friends actively used SQL Developer in real life, but everyone was in the same conditions. Well, you don’t need to dramatize, you can build the necessary SQL query in anything, even in Notepad, and SQL Developer can do quite well.
We brought all this economy with us on Thursday to the Olympic University on Ordzhonikidze Street, where the Olympiad took place. Then there was an organizational ambush. We were given two small rooms for conducting the SQL finals. In one we would not fit in the number. Local networkers just twisted a pair of tables. Since there were no other possibilities for accommodation, we had to hold essentially two simultaneous finals in neighboring audiences. The only thing that could be done and what I changed was the seating. We would be completely out of hand if participants had the opportunity to see each other’s monitors. It is unlikely to pry on purpose, but to stumble across a stray eye and not start looking at what is written is too much for a person who has come to a dead end when solving a problem. Therefore, I complained about the efforts already made by networkers and resolutely arranged the tables in a circle. In the photos here and there our tables, which are characteristically arranged in a circle, flash.
Wires, sockets, loading and checking of laptops, setting up a connection to the database is nothing complicated, but for a long time, as all sorts of minor issues constantly arise. That twisted pair is over, then all the holes in the hub are occupied and we need another one where to hide the server so that no one steps on it. The network turned out to be isolated, respectively, you need DHCP to distribute IP addresses. It is easier to set up a single DHCP server than to constantly run and manually enter addresses on any changes to the network configuration. We need a gateway to the Internet to access the documentation - set up, earned. In short, minor troubles lasted until late evening.
And in the morning we were waiting ...
About the official opening, I will not say anything, although this has something to do with our SQL nomination, but we were not involved in any way. I can only say that there were a lot of people in the Olympiad. At the check in the lobby was not overcrowded. It is good that we registered and received passes the day before.
After the opening ceremony, we gathered our final participants in the hall and, as pioneer leaders, took them to the audience prepared for the final. Instructing, distributing logins, checking connections to the database, explaining the rules for conducting the final, checking all of the decryption of the test problem using the example of task No. 0. At the site of the first task, there is a stub. According to the logs of the decryptor with a pre-written request, we follow who has not yet been able to correctly decipher the test task. Oh, it works!
Is everyone ready? We change the first task from the stub to the real one, reset the logs. Go!..
We keep order and the situation. It is convenient to follow situations, we make a selection from the decryption logs table, we sort them by the number of the solved task and the decryption time - and the leaders are in front of us (carefully, the “live” code):
select l.oracle_user, name, task , (select min(sent_date) from sql2017_log l2 where l.oracle_user=l2.oracle_user and l.task = task_number)+3/24 solve_time , tries from (select oracle_user , max(task_number) task , count(1) tries from sql2017_log group by oracle_user) l , sql2017_users u where u.oracle_user = l.oracle_user order by task desc, solve_time asc
We are asked a few questions. Just a few for all the finals. We started after the opening ceremony and briefing at 12:15, finished at 15:30. The end time in essence was not too important, the main thing was to designate it in advance. We proceeded from the fact that from 15 to 16 was lunch. To detain participants for half an hour is normal, it is cruel not to let them go to dinner at all.
A few facts during the final. The first problem was solved after 4 (sic!) Minutes after the start. Three times during the finals, the leader changed, and the top three changed very dynamically. , ( ). ( , ), , , . . .
. , , , , . , , , - — . . , , . , , — .
, , . , ( ) ( ).
, , . , , , . , .
, — .
And finally. . , , , .
! . — ! , , - . , - , , , - . , , - . , . , , .
Source: https://habr.com/ru/post/350528/
All Articles