Author: Yuri Tsyganenko, Senior QATesting of new functions is often carried out on data taken from an already functioning system. In this case, testers sometimes have to build queries for tricky cases. For example, you need to test the new functionality of an online store, and the intervals between purchases play a role. We have access to data from a working version - you can download them to a test stand and test the operation of a new version of the product. (NB !: of course, when dealing with “live” data, you need to exclude private information from them and ensure that users are interested in login).
To select user accounts that we are interested in, we need to compare the maximum intervals between purchases from different users.
')
The tester is required to build a SQL query that issues N users who have the longest intervals between order dates.
Similar tasks and their analysis - under the cut.
It is about finding a sequence of records / intervals using standard SQL tools. Aggregate functions process all the data falling into the condition of the sample, and therefore it is impossible to do with them.
As an example, take the weather sensor, periodically issuing a state of "clear" or "overcast" and the force of the wind. Consider the tasks:
1. The first part of the data is displayed in the 'Weather' table, which includes the fields:
• time // Contains the measurement time;
• clear // Contains an assessment of the purity of the sky: let 0 be overcast; 1 is clear.
We need a request that issues several (say, three or less) longest periods (intervals) with clear weather. In other words, a pair of records with clear weather, between which there will be no periods of dense clouds. The uniformity of measurements is not implied.
That is, the task is reduced to finding a set of three or less Period_1 ... Period_N values ​​in descending order.

2. In the framework of the second task, we have a table similar to the first 'Wind' table, which includes records of the wind force at particular points in time. It has two fields:
• Time
• Speed
It is required to find all the “local maxima” of the velocity — that is, the instants of time and velocity values, in comparison with which the preceding and subsequent (in time) records have a lower velocity.

On the graph, local maxima correspond to 3, 10, 14, 17. For simplicity, we will not consider the boundary point 19.
I suppose it is clear that the “direct” use of aggregate functions is indispensable: you need to state in some way that the points follow in succession. To solve both problems, we use the tricks:
1. Implicit Join a table with the same one: in the FROM field, separate the records and our table (for example, FROM Weather w1, Weather w2), separated by commas.
Just in case: when selecting SELECT w1. *, W2. * FROM Weather w1, Weather w2; All pairs of records will be displayed, including pairs that match and reverse in the reverse order. That is, with 10 entries in the table will be 100.
2. The exists () function, inside which we write a subquery.
Both of these techniques are shown in the free SQL course (video):
here and
here .
A crude solution to the first problem: select all pairs of records with bad weather, between which all records with good weather, and not less than 1 (that is, there are no bad pods in the intermediate records):
SELECT (w2.time - w1.time) as duration FROM Weather w1, Weather w2 WHERE w2.time > w1.time
Nuances: in different DBMS you may need:
• conversions for the timestamp type so that w1.time and w2.time can be subtracted;
• limit on the number of output lines - limit or top.
Problems with this solution:
• Observation period boundaries: if all the records in the table are only with fair weather, we will not receive a response. As, however, we will not receive clear weather periods at the beginning and end of observations;
• The answer does not contain accurate data on the first and final recording with good weather.
Strictly speaking, such a solution does not meet the condition of the problem (“a pair of records with clear weather, between which there is no other ...”).
Both problems are treated by the same methods.
We perfect the solution: we are looking for pairs of records with good weather, between which there are no records with bad weather. And so that the first entry in the pair would not have a previous entry with good weather (i.e., it would be a entry that starts a clear period). Accordingly, the second entry in the pair should not have a subsequent entry with good weather (i.e., the recording would close the clear period).
I suppose enough query ideas.
Solution of the second problem: we are looking for triples of consecutive records in which the wind speed at the midpoint is greater than in each of the neighboring ones. The neighborhood is checked by the exists function as in the first task: not exists (...). The order of the points is checked by comparing the time values.
This concludes the article and once again recommends the
Stanford course !
Enjoy!