📜 ⬆️ ⬇️

Experiments on the Olympiad problem

It so happened that I got into the magistracy, and as it was walking past the department, the 1C Olympiad challenge came across. Briefly, the task is: “There are sales records for each day, you need to find the largest period when the plan was executed.” And then when I was walking with a sleeping daughter, I had a question, and how many ways it can be done in SQL. Solutions will be based on MS SQL.

Create a table and begin

Text SQL script to create a table
CREATE TABLE [tmp].[forFindDate]( [date] [datetime] NOT NULL, [value] [int] NOT NULL, CONSTRAINT [PK_forFindDate] PRIMARY KEY CLUSTERED ( [date] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170401', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170402', 20) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170403', 20) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170404', 20) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170405', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170406', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170407', 30) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170408', 36) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170409', 35) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170410', 30) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170411', 30) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170412', 20) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170413', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170414', 40) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170415', 40) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170416', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170417', 50) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170418', 52) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170419', 53) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170420', 53) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170421', 50) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170422', 51) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170423', 52) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170424', 50) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170425', 50) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170426', 50) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170427', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170428', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170429', 10) INSERT INTO [tmp].[forFindDate] ([date], [value]) VALUES ('20170430', 10) GO 

The first way is through joining oneself (via join).

Join SQL queries
 declare @planValue int = 15 --  left select top 1 '  ', BeginDate, DATEADD(Day, -1, EndDate) EndDate from (select s.date BeginDate, ISNULL(min(e.date), '20170501') EndDate from [tmp].[forFindDate] s left join [tmp].[forFindDate] e on s.date < e.date and not e.value > @planValue where s.value > @planValue group by s.date) tmp order by DATEDIFF(day, BeginDate, EndDate) desc --       with ;with periodDate1 as( select s.date BeginDate, ISNULL(min(e.date), '20170501') EndDate from [tmp].[forFindDate] s left join [tmp].[forFindDate] e on s.date < e.date and not e.value > @planValue where s.value > @planValue group by s.date ) select top 1 '   (with left)' title, BeginDate, DATEADD(Day, -1, EndDate) EndDate from periodDate1 order by DATEDIFF(day, BeginDate, EndDate) desc --  full ;with periodDate2 as( select s.date BeginDate, ISNULL(min(e.date), '20170501') EndDate from [tmp].[forFindDate] s full join [tmp].[forFindDate] e on s.date < e.date and not e.value > @planValue where s.value > @planValue group by s.date ) select top 1 '   (with full)' title, BeginDate, DATEADD(Day, -1, EndDate) EndDate from periodDate2 order by DATEDIFF(day, BeginDate, EndDate) desc --  right ;with periodDate3 as( select ISNULL(max(s.date), '20170401') BeginDate, e.date EndDate from [tmp].[forFindDate] s right join [tmp].[forFindDate] e on s.date < e.date and not s.value > @planValue where e.value > @planValue group by e.date ) select top 1 '   (with right)' title, DATEADD(Day, 1, BeginDate) BeginDate, EndDate from periodDate3 order by DATEDIFF(day, BeginDate, EndDate) desc 

And for any join, we get exactly the same plan (via NESTED LOOP (left outer join)).
')
Join execution plans





The second way through the correlation query

SQL queries via correlation query
 declare @planValue int = 15 ;with periodDate4 as( select s.date BeginDate, ISNULL((select DATEADD(DAY, -1, isNull(min(d.date), '20170501')) from [tmp].[forFindDate] d where d.date > s.date and not d.value > @planValue), '20170501') EndDate from [tmp].[forFindDate] s where s.value > @planValue ) select top 1 ' ', p.BeginDate, p.EndDate from periodDate4 p order by DATEDIFF(DAY, p.BeginDate, p.EndDate) desc 

In this case, we get the NESTED LOOP (inner join)).

Correlation Query Execution Plans


The third way is still not interesting, now let's take the function APPLY (appeared in MS SQL 2005). In fact, for each line we will do on request.

SQL queries via outer apply
 declare @planValue int = 15 ;with periodDate5 as( select s.date BeginDate, ISNULL(min(e.date), '20170501') EndDate from [tmp].[forFindDate] s outer apply ( --   (  ) select top 1 ee.date from [tmp].[forFindDate] ee where s.date < ee.date and not ee.value > @planValue ) e where s.value > @planValue group by s.date ) select top 1 'sql 2005 apply' title, BeginDate, DATEADD(Day, -1, EndDate) EndDate from periodDate5 order by DATEDIFF(day, BeginDate, EndDate) desc 

In this case, we again get NESTED LOOP (left outer join), but this method is the most optimal at the moment (at least MS SQL thinks so).

Apply execution plans


So far everything was boring and mundane.

The fourth way to play with recursion: to the current line we will glue the data if the date is one day older and the sales plan is executed. Thus we will expand the interval.

Since there is not much data, the sequence length is short, as is the call stack depth (CTE recursive is possible from 2005).

SQL queries recursion
 declare @planValue int = 15 ;with periodDate6 as ( select f.date BeginDate, DATEADD(day, 1, f.date) EndDate from [tmp].[forFindDate] f where f.date = '20170417' and f.value > @planValue union all select r.BeginDate, DATEADD(day, 1, f.date) EndDate from [tmp].[forFindDate] f inner join periodDate6 r on r.EndDate = f.Date and value > @planValue ) select top 1 'recursive' title, BeginDate, DATEADD(Day, -1, EndDate) EndDate from periodDate6 order by DATEDIFF(day, BeginDate, EndDate) desc 

Correlation Query Execution Plans


The fifth way, we turn to the possibilities of MS SQL 2012 to the analytical functions LEAD (window functions). The LEAD function returns the following values ​​with a shift within the sections for sorting. We have two sections in the data: fulfillment and non-fulfillment of the plan, sorting is needed by dates, in order to find the maximum period we will act slyly: we will look for gaps in non-fulfillment of the plan, i.e. then when we move the dates of the beginning forward, and the end back, we get the segments of the implementation of the plan. In this case, we are interested only in the section not fulfilling its plan and take it, but in general, the division can be done
 OVER ( PARTITION BY iif(f.value < @planValue, 1, 0) order by f.date) 

LEAD SQL queries
 declare @planValue int = 15 with tmp as ( select '20170331' [date] union all select date from [tmp].[forFindDate] f where f.value < @planValue union all select '20170501' ), periodDate4 as ( select date beforDate, lead(f.date, 1, '20170501') OVER (order by f.date) afterDate from tmp f ) select top 1 'func 2012 t' title, DATEADD(Day, 1, beforDate) BeginDate, DATEADD(Day, -1, afterDate) EndDate from periodDate4 order by DATEDIFF(day, beforDate, afterDate) desc 

The execution plan shows that the tables do not join, with the exception of adding dates outside the segment for correct work with the ends of the segments.

LEAD execution plans
<img src = "

The sixth method And now let's pamper ourselves by multiplying the table by ourselves: we will find all possible combinations of dates (30 * 30 = 900), select those whose start date is less than the end date and there is no plan execution event in this interval.

SQL queries recursion
 declare @planValue int = 15 ;with periodDate8 as( select ISNULL(s.date, '20170401') BeginDate, ISNULL(e.date, '20170501') EndDate from [tmp].[forFindDate] s, [tmp].[forFindDate] e ) select top 1 'for Fun', p.BeginDate, p.EndDate from periodDate8 p left join [tmp].[forFindDate] b on b.date between p.BeginDate and p.EndDate and not b.value > @planValue where p.BeginDate < p.EndDate and b.date is null order by DATEDIFF(day, BeginDate, EndDate) desc 

The slowest and hardest way. But we do not pursue productivity, but for the number of solutions to the problem.

Table Multiplication Execution Plans


Seventh way Normal cursor. We iterate over the data for which the plan has been executed, and we are looking for the maximum sequential segment.

SQL queries with the cursor
 declare @planValue int = 15 declare @endDate datetime = null, @Intervat int = 0, @maxIntervat int = 0, @old_date datetime = '20170401', @cur_date datetime DECLARE date_cursor CURSOR FOR SELECT date FROM [tmp].[forFindDate] where value > @planValue OPEN date_cursor FETCH NEXT FROM date_cursor INTO @cur_date WHILE @@FETCH_STATUS = 0 BEGIN IF (@cur_date = DATEADD(DAY, 1, @old_date)) BEGIN SET @Intervat = @Intervat + 1 IF(@Intervat > @maxIntervat) BEGIN SET @maxIntervat = @Intervat SET @endDate = @cur_date END END else Begin set @Intervat = 0 end SET @old_date = @cur_date FETCH NEXT FROM date_cursor INTO @cur_date END CLOSE date_cursor; DEALLOCATE date_cursor; select 'cursor' title, DATEADD(DAY, - @maxIntervat, @endDate) BeginDate, @endDate EndDate 

So far this is all I could think of. Waiting for more options in the comments.

UPD1:: Option from stotsky71@outlook.com. Through windowed functions we will find transitions, between states it is executed / not implementation of the plan. And then the selection of intervals. There may be problems due to the fact that the interval began with the fulfillment of the plan or the fulfillment of the plan ended, then the transition will not occur and the data will not be selected correctly.

LEAD and LAG SQL queries
 declare @planValue int = 15 ;WITH Step01 AS (SELECT * FROM tmp.forFindDate WHERE value > @planValue ) , Step02 AS (SELECT date , CASE WHEN date - LAG([date], 1, '20170401') OVER(ORDER BY date) > 1 THEN 1 ELSE 0 END AS isStart , CASE WHEN LEAD([date], 1, '20170430') OVER(ORDER BY date) - date > 1 THEN 1 ELSE 0 END AS isEnd FROM Step01) , Step03 AS ( SELECT date AS rangestart , CASE WHEN isEnd = 1 THEN date ELSE LEAD(date, 1) OVER(ORDER BY date) END AS rangeend , isstart FROM Step02 WHERE isstart = 1 OR isend = 1) SELECT TOP 1 rangestart , rangeend , DATEDIFF(day, rangestart, rangeend) AS '' FROM Step03 WHERE isstart = 1 ORDER BY '' DESC 


A variant from the letters helped to optimize my LEAD request (got rid of PARTITION BY in the request)

UPD2: Option from the correspondence. From the definition of intervals on the shift of dates: successive dates have the same increment by day, like a normal increment in rows. On this property, you can get some kind of offset and if the offset for several dates coincides with the growth of the increment, then these dates go sequentially.
SQL ROW_NUMBER
 declare @planValue int = 15 ;with internals AS ( SELECT dateadd(day, -ROW_NUMBER() OVER (ORDER BY date), [date]) fake_start, [date] FROM tmp.forFindDate where value > 15 ) SELECT TOP 1 'new method' as title, MIN([date]) AS BeginDate, MAX([date]) AS EndDate FROM internals GROUP BY fake_start ORDER BY COUNT(*) DESC 

Source: https://habr.com/ru/post/327862/


All Articles