Recursive (hierarchical) queries in PostgreSQL

Following Oraklom with its 'connet by prior', all other DBMS introduce their own implementation of hierarchical queries (IZ). I would like to tell a wide audience how it is done in PostgreSQL.

Story

It all started like this

From: Evgen Potemkin <evgent@ns.terminal.ru>
Subject: Proposal of hierachical queries (a la Oracle)
Date: 2002-11-14 11:52:28 GMT
Hi there!
I want to order the hierarchical queries
posibility. It allows you to construct queries for Oracle for ex:
SELECT a, b FROM t CONNECT BY a PRIOR b START WITH cond; B
I've seen this type of query often made by adding a new type, which
stores position of row in the tree. But sorting such tree are very
tricky (i think).
Patch allows you to sorted, ie, subnodes of each node
be sorted by ORDER BY clause.
with regards, evgen

then

From: Tatsuo Ishii <ishii@postgresql.org>
Subject: RFP: Recursive query in 8.4
Date: 2008-02-19 08:36:00 GMT (1 year, 12 weeks, 6 days ago)
Hi,
I’ve promised
recursive query as defined in PostgreSQL 8.4.
Sumitomo Electric Information Systems Co., Inc.
Ltd. (http://www.sei-info.co.jp/) and SRA OSS, Inc. Japan
(http://www.sraoss.co.jp).

Well And since version 8.4 Postgresql began to support the recursive request.

Theory

')
FROM in PostgreSQL is implemented on the basis of standard SQL clause WITH. Let's get a little deeper: non-recursive WITH allows you to save on repeated subqueries, split a complex query into several smaller ones, is a convenient shortcut for referring to a subquery and is convenient in terms of saving time when writing code. as in the example below, it was possible to avoid using the subquery in WHERE by applying a temporary table top_regions formed specifically for this query.

WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .
WITH regional_sales AS ( SELECT region, SUM (amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > ( SELECT SUM (total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM (quantity) AS product_units, SUM (amount) AS product_sales FROM orders WHERE region IN ( SELECT region FROM top_regions) GROUP BY region, product; * This source code was highlighted with Source Code Highlighter .

Adding an optional RECURSEVE statement allows a Postgre query to access its output. The query algorithm must consist of other parts. The first part is the base, usually returning one line from the starting point of the hierarchy or part of the hierarchy. Ie a place in the hierarchy where the countdown will start (for example, the root). and the second recursive part that will be associated with the temporary table we declared after WITH. The first and second parts are combined with the UNION or UNION ALL operator. so that we would set an example for this chaotic set of words.
Create a table that describes the structru of one company

CREATE TABLE KPO
(
"ID" character varying(55),
"DESCRIPTION" character varying(255),
"PARENT" character varying(55
) ;

after entering the data there:
Select * from kpo

ID	DESCRIPTION	PARENT
== ======	================================	=======
KPO	KARACHAGANAK PETROLEUM OPERATING	{null}
AKSAY	AKSAY	KPO
URALSK	KPO	KPO
LONDON	LONDON	KPO
KPC	KPC	AKSAY
U2	UNIT-2	AKSAY
U3	UNIT-3	AKSAY
PROD	PRODACTION	KPC
MAINT	MAINTENANCE	AKSAY
CMMS	CMMS TEAM	MAINT

Now the recursive query itself:

WITH RECURSIVE temp1 ("ID", "PARENT", "DESCRIPTION", PATH , LEVEL ) AS (
SELECT T1. "ID", T1. "PARENT", T1. "DESCRIPTION", CAST (T1. "ID" AS VARCHAR (50)) as PATH , 1
FROM KPO T1 WHERE T1. "PARENT" IS NULL
union
select T2. "ID", T2. "PARENT", T2. "DESCRIPTION", CAST (temp1. PATH || '->' || T2. "ID" AS VARCHAR (50)), LEVEL + 1
FROM KPO T2 INNER JOIN temp1 ON (temp1. "ID" = T2. "PARENT"))
select * from temp1 ORDER BY PATH LIMIT 100
* This source code was highlighted with Source Code Highlighter .

The first part (rows 2-3) returns to the temporary table the first row in this case, the root entry of our structure, from which the countdown will begin in our hierarchy. the second part (lines 4-5) adds to the same temporary table the records associated with the string already contained in temp1 via JOIN (ID = PARENT) and so on until the end until all the leaves of our ROOTa are in temp1.
Also in this example, the Orakolavskaya function sys_connect_by_path was simulated.

"ID"	"PARENT"	"DESCRIPTION"	"Path"	"Level"
KPO		KARACHAGANAK PETROLEUM OPERATING	KPO	one
AKSAY	KPO	AKSAY	KPO-> AKSAY	2
KPC	AKSAY	KPC	KPO-> AKSAY-> KPC	3
PROD	KPC	PRODAUCTION	KPO-> AKSAY-> KPC-> PROD	four
MAINT	AKSAY	MAINTENANCE	KPO-> AKSAY-> MAINT	3
CMMS	MAINT	CMMS TEAM	KPO-> AKSAY-> MAINT-> CMMS	four
U2	AKSAY	UNIT-2	KPO-> AKSAY-> U2	3
U3	AKSAY	UNIT-3	KPO-> AKSAY-> U3	3
LONDON	KPO	LONDON	KPO-> LONDON	2
URALSK	KPO	URALSK	KPO-> URALSK	2

Postgre does not have a built-in looping test, so if we received the data from the guys who were directly involved in creating the structure in Excel, we should check this structure for integrity. sometimes it is enough to use UNION instead of UNION ALL, but this is only sometimes. If you set the starting point in the hierarchy in the first part, and if there is even a break in the hierarchy, in principle, having started the above-mentioned error there will not be an error, just the lines “tweaks” will be ignored. But we need to know where the error is, and we can implement this by introducing an additional check before performing UNION.

WITH RECURSIVE temp1 ("ID", "PARENT", "DESCRIPTION", PATH , LEVEL , cycle ) AS (
SELECT T1. "ID", T1. "PARENT", T1. "DESCRIPTION", cast ( array [T1. "ID"] as varchar (100) []), 1, FALSE
FROM KPO T1
union all
select T2. "ID", T2. "PARENT", T2. "DESCRIPTION", cast (temp1. PATH || T2. "ID" as varchar (100) []), LEVEL + 1,
T2. "ID" = any (temp1. PATH )
FROM KPO T2 INNER JOIN temp1 ON (temp1. "ID" = T2. "PARENT") AND NOT CYCLE )
select * from temp1 WHERE CYCLE IS TRUE LIMIT 100;
* This source code was highlighted with Source Code Highlighter .

Here, as you see, the same Path field is created but all previous parents are organized in an array, which makes it possible for us to compare each new “ID” to a duplicate. we don’t use this line to search for descendants anymore, so looping is avoided (union all ... WHERE ... AND NOT CYCLE).

Advice from the official site: use the LIMIT operator as I did in the examples. This will help you keep your nerves.

Practical examples

Horsch's theory of course, but where all this can be applied in practice, tebolle in light of the cost of such requests.
In addition, it would be nice to draw a hierarchy in one query, and to find for example all the leaves of the current “ID” there are also other tasks.
Often you need to make changes, such as for example changing the phone code to all employees of a particular department in the telephone directory. Of course, you can use the column in the table of workers or even prefix the “ID”. but all this makes the base non-flexible and non-scalable. It is much better to make a separate Workers table that will display a hierarchy with three columns “ID”, “PARENT“, “HIERARCHID”. the “HIERARCHID” field will allow you to make more than one hierarchical structure. The table is called for an example ANCESTOR which will also consist of three columns “ID”, “ANCESTOR”, “HIERARCHID”. in the “ANCESTOR” field all ancestors will be contained, remember the “Path” array from Example 3. So this table can be filled with a recursive query.

insert into ancestor (“ID” "ANCESTOR", "HIERARCHID")
WITH RECURSIVE temp1 ("ID", "PARENT", "DESCRIPTION", PATH , LEVEL , cycle ) AS (
SELECT T1. "ID", T1. "PARENT", T1. "DESCRIPTION", cast ( array [T1. "ID"] as varchar (100) []), 1, FALSE
FROM KPO T1
union all
select T2. "ID", T2. "PARENT", T2. "DESCRIPTION", cast (temp1. PATH || T2. "ID" as varchar (100) []), LEVEL + 1,
T2. "ID" = any (temp1. PATH )
FROM KPO T2 INNER JOIN temp1 ON (temp1. "ID" = T2. "PARENT") AND NOT CYCLE
)
select "ID" AS "LOCATION", PATH [1] AS "ANCESTOR", 'DEPT' AS "DID" from temp1
* This source code was highlighted with Source Code Highlighter .

such a sign will turn out

LOCATION	ANCESTOR	HIERARCHID
=========	=========	==========
AKSAY	AKSAY	DEPT
AKSAY	KPO	DEPT
CMMS	KPO	DEPT
CMMS	MAINT	DEPT
CMMS	CMMS	DEPT
CMMS	AKSAY	DEPT
KPC	AKSAY	DEPT
KPC	KPC	DEPT
KPC	KPO	DEPT
KPO	KPO	DEPT
LONDON	LONDON	DEPT
LONDON	KPO	DEPT
MAINT	AKSAY	DEPT
MAINT	MAINT	DEPT
MAINT	KPO	DEPT
PROD	KPO	DEPT
PROD	AKSAY	DEPT
PROD	KPC	DEPT
PROD	PROD	DEPT
U2	AKSAY	DEPT
U2	KPO	DEPT
U2	U2	DEPT
U3	U3	DEPT
U3	AKSAY	DEPT
U3	KPO	DEPT
URALSK	KPO	DEPT
URALSK	URALSK	DEPT

Now all our selects and updates will be done using this table. for example, with the same phone numbers, it would look something like this:

Update EMPLOYEE SET “TELNUM” = '545-454' FROM ANCESTOR WHERE “ANCESTOR”.”ANCESTOR” = 'AKSAY' AND EMPLOYEE.”LOCATION” = ANCESTOR.”LOCATION” AND EMPLOYEE.ISDPRT = 'N' ;

Moreover, it will be necessary to foresee a trigger for updating the table when adding a new one or deleting an entry in the KPO table.

Continue

Suppose that there is a table in which logs are recorded, imagine that you need to count how many records for the previous month were made and grouped by day. for this you will need a reference calendar for the past month, for example, so as not to miss the day when there were no records. such a calendar (list of days in one column) can be obtained by such a request.

WITH RECURSIVE t (n) AS (
VALUES (1)
UNION ALL
SELECT n + 1 FROM t WHERE n <( SELECT cast ( extract ( 'day' from date_trunc ( 'month' , current_date ) - interval '24 hour ' ) as integer ))
)
SELECT to_date (n || '-' || extract ( 'month' from date_trunc ( 'month' , current_date ) - interval '24 hour ' )
|| '-' || extract ( 'year' from date_trunc ( 'month' , current_date ) - interval '24 hour ' ), ' dd-mm-yyyy ' )
FROM t;
* This source code was highlighted with Source Code Highlighter .

Another not very useful example for a snack. The original idea of Graeme Joba (Graeme Job). the result is better to look at the text file

WITH RECURSIVE z (ix, iy, cx, cy, x, y, i) AS (
SELECT ix, iy, x :: float , y :: float , x :: float , y :: float , 0
FROM ( select -1.88 + 0.016 * i, i from generate_series (0,150) as i) as xgen (x, ix),
( select -1.11 + 0.060 * i, i from generate_series (0.36) as i) as ygen (y, iy)
UNION ALL
SELECT ix, iy, cx, cy, x * x - y * y + cx, y * x * 2 + cy, i + 1
FROM z
WHERE x * x + y * y <16 :: float
AND i <27
)
SELECT array_to_string (array_agg ( substring ( '. ,,, ----- ++++ %%%% @@@@ ####' ,
greatest (i, 1), 1)), '' )
FROM (
SELECT ix, iy, max (i) AS i
FROM z
GROUP BY iy, ix
ORDER BY iy, ix
) AS zt
GROUP BY iy
ORDER BY iy
* This source code was highlighted with Source Code Highlighter .

Well, such a post to start ;-)

Source: https://habr.com/ru/post/73700/

All Articles

Recursive (hierarchical) queries in PostgreSQL

Story

Theory

Practical examples

More articles: