⬆️ ⬇️

Generating a sequence of dates and generate_series in PostgreSQL

Bike warning

This article may be a spherical example of cycling. If you know the standard or more elegant solution to the problem, I will be glad to see it in the comments.



Once on one of the projects we needed to make a report on financial operations for the period with grouping subtotals at the end of the month.



The task is, in general, simple, to determine the required periods within a large interval, to bind each operation to a suitable period, to group and add up the amount.



To generate periods within an interval, I habitually took the function generate_series, which I often use to generate numeric sequences. I checked the documentation about the possibility of generating a sequence of dates, looked at an example, wrote a request and was puzzled.



select gs::date from generate_series('2018-01-31', '2018-05-31', interval '1 month') as gs; 


gs
01.31.2018
02.28.2018
03.28.2018
04/28/2018
05/28/2018


The result was as unexpected as logical. The generate_series function honestly and iteratively generated a sequence of dates according to the principle of successively adding a shift to the previous value. At the same time, at each step, the correctness and correction of the received date was checked. February 31 does not happen, so the date was transformed into February 28, and a further addition of the month brought down the entire sequence by the 28th.



UPD. Explanations after questions in the comments. In general, the original task is wider - to group data on arbitrary days of the month. For example, grouped by the 20th day of each month, by the 15th day, but with such dates there are no problems with the generation. The mechanism that we are looking for should equally well build a sequence of 10 numbers of each month, 21 numbers and correctly work out the ends of the months.



I wonder how the addition operation will behave with several months at once? What will happen if we add an interval not iteratively, but in bulk?



 select '2018-01-31'::date +interval '1 mons' 28.02.2018 select '2018-01-31'::date +interval '2 mons' 31.03.2018 


In this case, the addition is made honestly.

How to use this approach to generate the necessary dates?



If the number of months is known, it is very simple:



 select '2018-01-31'::date +make_interval(0, i) as gs from generate_series(0, 4, 1) as i 


gs
01.31.2018
02.28.2018
03/31/2018
04/30/2018
05.31.2018


What to do if only the start date and end date are known?

This task can be solved quite simply by writing a stored function and a simple cycle in it, but we are interested in the implementation option when there is no possibility or desire to clutter the database structure with unnecessary objects.

Let's try to reduce the task to the previous one.



The following code is to some extent a mock-up board and does not pretend to elegance; we write the first variants of requests in the company with an emphasis on the flexibility and interchangeability of blocks.



 /*  -  ,         ,      */ with dates as ( select '2018-01-31'::date as dt1, '2018-05-31'::date as dt2 ), /*      ""  */ g_age as ( select age( (select dt2 from dates), (select dt1 from dates)) ), /*       (*12 + )   +1       */ months as ( select (extract(year from (select * from g_age))*12 + extract(month from (select * from g_age))+1)::integer ), /*  ,           -   ,       */ seq as( select ((select dt1 from dates) + make_interval(0, gs)) as gs from generate_series ( 0, (select * from months), 1 ) as gs where ((select dt1 from dates) + make_interval(0, gs)) <= (select dt2 from dates) ) /*         */ select * from seq 


gs
01.31.2018
02.28.2018
03/31/2018
04/30/2018
05.31.2018


The solution turned out to be quite cumbersome, but it is sufficiently simple for the worker and him to integrate into other requests through the with mechanism.

We have implemented the report, but the idea that this request is not only cumbersome, it is also limited in its use only in steps for whole months did not give rest.



Option 2.

After a while, it dawned on me that the sequential generation of dates is essentially a recursive procedure. Only not in its pure form, since in our case the calculation of the next date from the previous one leads to the initial problem. But at each step we can increase the interval added to the beginning of our period:



 /*    -,     timestamp */ with recursive dates as ( select '2018-01-31'::timestamp as dt1, '2018-05-31'::timestamp as dt2, interval '1 month' as interval ), /*           ,          ,   .  ,        */ pr AS( select 1 as i, (select dt1 from dates) as dt union select i+1 as i, ( (select dt1 from dates) + ( select interval from dates)*i)::timestamp as dt from pr where ( ((select dt1 from dates) + (select interval from dates)*i)::timestamp) <=(select dt2 from dates) ) select dt as gs from pr; 


gs
01.31.2018
02.28.2018
03/31/2018
04/30/2018
05.31.2018


This query works correctly with any input time intervals and intervals.



')

Source: https://habr.com/ru/post/421969/



All Articles