📜 ⬆️ ⬇️

How we did the SQL Olympiad

At the very beginning of autumn 2016, the management set me the task of preparing the technical part of the SQL Olympiad. After discussing the situation with my colleagues, including the former, I was poked (stuck?) Into an article where the task of building the shortest way out of the maze was solved in a declarative style of SQL. Having collected the parts of the request in one pile and running it on a real base, I whispered “magic! ..” and realized that there would be an Olympiad.


I think that the typical reader of Habr at olympiads has ever been, but rather as a participant, not an organizer. I also visited different ones, and it was always surprising to me why it was interesting at some olympiads, and mortal longing at others. I can show how this theater looks from the other side of the curtain, and how I tried to make this Olympiad one of those that are interesting. Intrigues, scandals, investigations - nothing will happen. But I will tell you how the tasks were prepared, what was expected of them and what was the result.


Introductory


2016/17 Olympiad was held for the tenth time, like an anniversary. Although the Olympiad is declared international, its main language is Russian, so personally I would call it rather the All-Union. I was preparing the technical part of the contest on the SQL programming language (there were also other nominations). In theory, the second organizer of this nomination was Oracle, but I never met their representatives. So the only trace left from Oracle is that the Oracle database with all the relevant SQL specifics was used.


Here is the report on the finals of our 2016/17 Olympiad on the official website: http://world-it-planet.org/press/news/detail.php?ID=323774


At this, the credits will be considered complete, I provided the reconnaissance. A couple of common words and I will move on to action.


Since no guidance was given to me by the management (full carte blanche), I determined the popularization of the SQL language as the main goal of the Olympiad. By the way, this is really one of the few declarative programming languages ​​actually used in life. And if databases appear on the horizon, especially industrial ones, which are with servers, then there are really no alternatives at all - the de facto industrial standard. Plus, many niche solutions use SQL-like query syntax when you need to select and group something or sort it. For example, JQL in JIRA or query language in the Sphinx Search Engine.


Yes, so, the main goal was set - to popularize SQL. So that more people know about SQL and write skillfully in SQL. Then we will have everything in order with the personnel reserve. Along the way, we will certainly advertise ourselves and look for talents. It goes without saying. But the main thing is to draw attention to SQL. And once popularized, it should be interesting.


The organizers announced the following layout of the competition: two absentee rounds and a full-time final. We define the general concept as follows. The first round qualifying, you need to weed out as much as possible those who are not in the subject. The second round is the most interesting. We give tasks like "but weakly in a single SQL query to do something there"? We give more time so that there is an opportunity to think, but limitedly, there was no time to sit around. An in-person finale in the same vein cannot be held, for three (yes, even if it’s even for all eight) hours of the final, it is problematic to come up with something like that in SQL. That is, there will be a blitz, where to go. This is certainly not the best way to determine the strongest programmer, but still a little better than just asking a random number generator.


And the preparatory work began to boil.


The first round qualifying. December 2016


Here, of course, I took advantage of the work of my colleagues from the same competition, but in the previous year, for which I thank them separately. The IT-Planet has a special engine that conducts testing. A database of more than a hundred questions with the answer choices to be selected was assembled. A participant is given 30 questions. Even if the participant took part last year, more than one or two questions are unlikely to recur. And it will not make anybody special weather.


So fundamentally, I didn’t change the questions, shaken up the existing ones, corrected some of the mistakes I noticed, something I probably left for the next generations to divorce. Such questions turned out to be vigorous, the level of a typical Prometric's online test. Plus, a special difficulty was made in the fact that in each question it was necessary to choose all the correct options, but how many of them there were correct were not indicated. In this case, only questions were counted, to which all correct answers are indicated. That is just to poke the mouse and get, as in some tests, the average 25% will not work even once. As a result, I myself, while checking the correctness of downloading questions / answers, with the full passage of all the questions regularly answered incorrectly to a couple of them.


It was expected that such a sieve would well distribute participants by the level of knowledge of SQL in general and Oracle SQL in particular, the more knowledgeable ones would take a higher position. Thirty questions for each participant statistically level out the "features" of individual questions. Plus, the therapeutic effect for the participants is that everything is serious and there will be no mercy.


Practice has confirmed the correctness of the expectations; indeed, the participants were fairly well distributed among Poisson (?) With seven questions correctly answered on average. One participant even managed to answer all 30 questions correctly (lucky, what could be there!), 29 participants already had 29 correct answers. Six people went clean, never guessing once. In short, what is needed. This approach allowed us to choose the appropriate passing score to form the necessary number of participants in the second round, selecting the best. According to the plans of the organizers, about two hundred participants had to go to the second round with some quite reasonable restrictions maximizing the diversity of participants across countries and universities.


The second round, the main. March 2017


The second round was undoubtedly the most interesting event of this entire Olympiad. It was in it that you could try your hand at SQL hardcore magic.


Since it’s still impossible to re-use these tasks, they have already lit up, I’ll bring them here. In part, this publication was conceived as a reason to share the prepared tasks for the further popularization of the SQL language. Each task, figuratively speaking, begins with the words "In one SQL query ..." and then comes Something Completely Different © Monty Python. Unfortunately, it was impossible to put tasks in one phrase, it was necessary to add verbose explanations to correctly formulate the condition, and also to give test data and an example of a solution on these test data for explanation.


I tried very hard for tasks to have a pronounced wow effect such as "is it really possible in SQL", and to be away from the traditional Olympiad subject requiring quite specific skills. The complexity of each task was primarily to present (well, and continue to implement) the declarative way to solve a completely independent and non-trivial task, even for classical programming. We also asked the participants to make a text explanation for each task. On the one hand, we were interested in people who can clearly express their thoughts, and they wanted to be encouraged. On the other hand, if all the invited participants had sent answers to the tasks of the second round, such explanations would greatly help us in checking. Looking ahead, I will say that of the 200+ participants in the second round, only 34 people responded. What is not so much, but (surprisingly) is approximately equal to the number of active participants in the second round in the previous year. This allowed us to carefully in manual mode and double check all the work, and call on the finals of the most deserving.


The second round of the organizers began on March 1. We gave a week and a half to solve five problems, even a little more, until 23:59 on March 12. So that the end of the period falls on the weekend after March 8. So that the participants could work hard in the last two days and at the same time not spoil their privacy too much. Test tests on themselves and colleagues have shown that each task takes half a day. Plus some time for registration, plus tune in and swing. That is, time should be enough, but without surplus.


Well, drove about the task. I'll hide it under the spoiler so that they take up less space.


Task number 1. The calendar

Write one SQL query that generates a pocket calendar. The parameters for the task indicate the year of the calendar and the number of rows and columns to form a matrix of months. Parameters are set by the following query:


with param(year, c, r) (…) 

where respectively


  • year - the year of the calendar
  • c - the number of columns of the calendar matrix
  • r is the number of rows of the matrix.

Months are located in the cells of the calendar matrix in order from left to right and then top to bottom. The numbers in each month are arranged according to the days of the week, the first day of the week in the first column, and so on. The beginning of the week should correspond to the database localization settings at the time of launching the request The name of the month is also taken from the localization settings and is displayed in the center above the numbers. Between the months you need to leave a gap so that the numbers of the neighboring months “do not stick together”. The very first line should be centered year. Empty lines should not be.


For example, with the following parameters:


 with param(year, c, r) as (select 2016, 3, 4 from dual) 

The following query output should result:


  2016    1 2 3 1 2 3 4 5 6 7 1 2 3 4 5 6 4 5 6 7 8 9 10 8 9 10 11 12 13 14 7 8 9 10 11 12 13 11 12 13 14 15 16 17 15 16 17 18 19 20 21 14 15 16 17 18 19 20 18 19 20 21 22 23 24 22 23 24 25 26 27 28 21 22 23 24 25 26 27 25 26 27 28 29 30 31 29 28 29 30 31    1 2 3 1 1 2 3 4 5 4 5 6 7 8 9 10 2 3 4 5 6 7 8 6 7 8 9 10 11 12 11 12 13 14 15 16 17 9 10 11 12 13 14 15 13 14 15 16 17 18 19 18 19 20 21 22 23 24 16 17 18 19 20 21 22 20 21 22 23 24 25 26 25 26 27 28 29 30 23 24 25 26 27 28 29 27 28 29 30 30 31    1 2 3 1 2 3 4 5 6 7 1 2 3 4 4 5 6 7 8 9 10 8 9 10 11 12 13 14 5 6 7 8 9 10 11 11 12 13 14 15 16 17 15 16 17 18 19 20 21 12 13 14 15 16 17 18 18 19 20 21 22 23 24 22 23 24 25 26 27 28 19 20 21 22 23 24 25 25 26 27 28 29 30 31 29 30 31 26 27 28 29 30    1 2 1 2 3 4 5 6 1 2 3 4 3 4 5 6 7 8 9 7 8 9 10 11 12 13 5 6 7 8 9 10 11 10 11 12 13 14 15 16 14 15 16 17 18 19 20 12 13 14 15 16 17 18 17 18 19 20 21 22 23 21 22 23 24 25 26 27 19 20 21 22 23 24 25 24 25 26 27 28 29 30 28 29 30 26 27 28 29 30 31 31 

Task number 2. Transfusions

There are two vessels with a capacity of 3 and 5 liters and a tap with water. You can pour water from one vessel to another (until one of them is filled or empty), pour water (only completely) and fill any vessel from the tap to the top. It is required to find all possible variants of transfusions that allow to measure exactly 4 liters. To eliminate the looping, discarded options where the state of fullness of vessels is repeated.


The SQL query should display all possible variants of transfusions in the following form: for each of the variants, output a chain of transfusions in the form of pairs of numbers indicating the amount of water in each vessel at the corresponding step. Numbers are separated from each other by a minus, and pairs - by a comma and a space.


It is required to solve the problem in general form, the parameters specify the capacity of the first v1 and second v2 vessels and the result of res, which must be obtained as a result of transfusions. Example conditions:


 with param(v1, v2, res) as (select 3, 5, 4 from dual) 

Example output:


 PATH --------------------------------------------------------------------------------- 0-0, 3-0, 3-5, 0-5, 3-2, 0-2, 2-0, 2-5, 3-4 0-0, 3-0, 0-3, 3-3, 3-5, 0-5, 3-2, 0-2, 2-0, 2-5, 3-4 0-0, 3-0, 0-3, 3-3, 1-5, 0-5, 3-2, 0-2, 2-0, 2-5, 3-4 ... 

Task number 3. About the way

There is an undirected graph given by vertices and edges. Given the length of the edges and the name of the initial and final vertices. It is required by a single SQL query to find a path with the shortest distance from the initial to the final vertex. The result should be presented in the form of two values:


  1. string path, lists the names of the vertices in minus,
  2. path length (sum of edge lengths).

You want to display the three shortest options, sorted by path length in ascending order. The problem statement specifies pairs of vertices and the distance between them (the graph is undirected, the distance is true for movement from the first vertex to the second, and back from the second to the first), and the initial and final vertices for constructing the shortest path.


The number of edges in the graph is reasonably limited, no more than 50 edges.


On the following test data:


 with edges (from_node, to_node, range) as ( select '', '-', 706 from dual union all select '', '', 516 from dual union all select '', '', 1793 from dual union all select '', '', 3956 from dual union all select '', '', 1345 from dual union all select '', '', 3356 from dual union all select '', ' ', 421 from dual union all select '', '', 6274 from dual union all select '', '', 856 from dual union all select '', '', 272 from dual union all select '', '', 3723 from dual union all select '', '', 4141 from dual union all select '', '', 666 from dual union all select '', '', 524 from dual union all select '', '', 9141 from dual union all select '', '', 237 from dual union all select '', '', 265 from dual union all select '', '', 146 from dual union all select '', ' ', 856 from dual union all select '', '', 1598 from dual union all select '', '', 2923 from dual union all select '', '', 790 from dual union all select '', '', 751 from dual union all select '', '', 2139 from dual union all select '', '', 2861 from dual ) , param(begin_node, end_node) as (select '', '' from dual) 

The following result should be obtained:


 PATH LEN -------------------------------------------------- ---------- --- 10480 ---- 10485 -- 10486 

Task number 4. Calculator

An arithmetic expression is specified as a string. It is required by a single SQL query to calculate the value of the expression. The expression contains the signs of the following operations: addition, subtraction, multiplication, division, exponentiation, unary minus and parentheses. Unary minus can occur only at the beginning of the expression or after the opening bracket. Numbers can contain a decimal part and the format fully corresponds to the type NUMBER in the current language settings of the database. For clarity, expressions may contain space characters. The length of the expression is reasonably limited: no more than 50 operands, the nesting depth of the brackets is no more than 20. The expression is always correct.


The request must return a single value in the NUMBER format.


Sample test data:


 with param(expr) as ( select '(-1 + 5^(1/2) ) / 2' from dual ) 

An example of the result of the query on this data:


  VAL ------------------------------------------------- ,61803398874989484820458683436563811772 

Task number 5. Labyrinth

A labyrinth of numbered lines is defined. The walls are marked with "#", the entry point is "s" and exit "e". You cannot go beyond the boundaries, the line numbers go in order, the length of all the lines is the same. It is required to draw the shortest path from the entrance to the exit (one of them, if there are several ways) using the “*” symbol in a single SQL query.


The size of the maze is limited so that the possible number of options for going through the maze (without internal loops) does not exceed 10,000.


Sample test data:


 with maze(linenum, line) as ( select 01,'## ### ######' from dual union all select 02,'s # # ' from dual union all select 03,'#### ### # ##### ' from dual union all select 04,' # # # ' from dual union all select 05,' # ### ######### ' from dual union all select 06,' # e' from dual ) 

An example of the result of the query on this data:


 ##***### ###### s**#* # ####*### # ##### #***# # # ###*######### # **********e 

Of course, if we had more time, we would have made the tasks even better. But so it turned out well. For each of the tasks had its own surprises and pitfalls.


For example, suddenly the first task, about the calendar, turned out to be very nontrivial. The participants already had a dynamically formed matrix of difficulties for the participants, and suddenly not all managed to correctly generate all the days of the year. Taking into account their leap year, they are usually 365 or 366. But in 1582 there are no days in our Gregorian calendar from October 5 to 14. Few managed to correctly process the year 1582. Plus, the week in different locales from different days begins. With so many unexpected ambushes, no one even stepped on a rake with the alignment of the names of the months. Anyone who got to this without breaking down along the way, used the functions that give the wrong length of the word in UTF8.


Problem number 3 "about the way" is a variation on the topic of the traveling salesman problem. Cities are taken those in which there are our offices, and the distances really correspond to the length of highways from atlases. Already sending the conditions, we found that we did not explicitly stipulate the presence / absence of cycles in the paths. Somehow it didn’t even occur at once to consider that the traveling salesman problem can be solved with cycles. When checking had to consider both approaches correct. But only if the participant permits cycles, then all the correct solutions in this approach will be found. Verification tests had to be refined. From the unexpected it turned out that not all participants are familiar with such a fundamental concept in programming as recursion. In the solution of one persistent person, the recursion was made manually on nesting depth up to level 11. For the test example from the condition of the problem, this was enough, for our other tests, of course not, there we including any exotic stuff. It was amazing to see it, I assumed that students and young specialists who write in SQL (and this obviously cannot be the first programming language) should know about recursion and be able to apply it in practice. Without this, there is absolutely nothing in programming.


Problem number 5 "labyrinth" is for me personally the most magical, it was she who inspired the entire second round of the Olympiad. Although after the first four tasks, it probably does not cause that effect. Under the conditions, it was specifically stipulated that a decision head-on with a search of all the options would work. By the way, long thought how this restriction in general can be formulated. An empty 7x7 labyrinth without walls is already shortly calculated, since the number of options for passing through it is large. , (, , ) 6060. . , , . . . , .


â„–4 "" . : - ( ), . , . - , , , , , "(( (-(( ((((( - ( (-(-(-( ( - ((-,0-,0))) ) ))))) ) ) ))))) ) ) " . , SQL , .


â„–2 , - . , (, brainfuck) , , SQL, . .


brainfuck. brainfuck SQL-. , , . , . .


, , . , . -, , . -, , , . . . -, , .


. , , . . , ( ) . , .


. - . . , . , , . , , , . ? , ? , . , , , ? . .


. SQL*PLUS . , .


, -. , . "with params as (select 1,2,3 from dial)" . -:


 set pagesize 999 linesize 999 numwidth 50 trimspool on trimout on set heading off verify off feedback off set serveroutput on size 999999 whenever sqlerror exit spool taskX.log -- test dataset here with param(num) as (select 10 from dual) @taskX.sql / exit 

taskX.sql taskX.log .


- taskX.sql , , param.num :


 -- = IT Planet, SQL 2016/17 = -- = Task X (Sample) = -- -- with param(num) as (select 10 from dual) , seq(n, fact) as ( select 1, 1 from dual union all select n+1, fact*(n+1) from seq, param where n < param.num ) select fact as factorial from seq, param where n = num 

, "" .


, - ( task[1,2,3,4,5].sql ) "" . , , . , .


, , . , . , - , , . , .


, , , . -.


To be continued! , .


PS , , , ! , , , — .


')

Source: https://habr.com/ru/post/349560/


All Articles