📜 ⬆️ ⬇️

Recursive queries in PostgreSQL (WITH RECURSIVE)


Oddly enough, in order to understand recursion, PostgreSQL does not need to understand recursion. Because WITH RECURSIVE, which is present in the program (and in other serious databases), is rather a calculation of something by iteration before some condition is fulfilled.
Nevertheless, this is a very useful base functionality, which can be used, for example, to display all subcategories of a given category if the table is specified as (id, parent_id, ...)


Let's try to calculate factorial for simplicity. In javascript, we would do it like this:

function factorial(i) { if (i > 1) { return i * factorial(i-1); } else { return 1; } } console.log(factorial(10)); 

')
This is calculated completely differently. In the recursive part of the CTE, there must necessarily be a starting part and a recursive part, separated by the word UNION.

 WITH RECURSIVE r AS ( --    (.. "anchor") SELECT 1 AS i, 1 AS factorial UNION --   SELECT i+1 AS i, factorial * (i+1) as factorial FROM r WHERE i < 10 ) SELECT * FROM r; i | factorial ----+----------- 1 | 1 2 | 2 3 | 6 4 | 24 5 | 120 6 | 720 7 | 5040 8 | 40320 9 | 362880 10 | 3628800 (10 rows) 


Please note that if you read this query, so to speak, intuitively, as it is, then it seems that the letter should go into an eternal cycle and do not understand what to count.
The fact is that this FROM r does not execute the entire query again, but it works like this: for the first time it takes what is in the starting part of the recursion (anchor), and in the next iterations it takes the results of the previous iteration.

Algorithm like this:
  1. Take the starting data
  2. substitute in the "recursive" part of the query.
  3. look what happened:
    • if the exhaust of the recursive part is not empty, then we add it to the resulting sample, and also use this exhaust as data for the next call to the recursive part, i.e. goto 2
    • if empty, we complete processing

In general, the example with factorial is not very indicative, because postgresql already knows how to calculate factorials, even very large ones (and generally works well with large numbers):

 SELECT 10000! --    ,        


Take the promised selection of wood.

 CREATE TABLE geo ( id int not null primary key, parent_id int references geo(id), name varchar(1000) ); INSERT INTO geo (id, parent_id, name) VALUES (1, null, ' '), (2, 1, ' '), (3, 1, '  '), (4, 2, ''), (5, 4, ''), (6, 4, ''), (7, 5, ''), (8, 5, '-'), (9, 6, ''); 


Choose everything that relates to Europe:

 WITH RECURSIVE r AS ( SELECT id, parent_id, name FROM geo WHERE parent_id = 4 UNION SELECT id, parent_id FROM geo WHERE parent_id IN ( SELECT id FROM r ) ) SELECT * FROM r; ERROR: recursive reference to query "r" must not appear within a subquery 


Oops, the next constraint, recursion should not be used in subqueries.
Ok, rewrite to join:

 WITH RECURSIVE r AS ( SELECT id, parent_id, name FROM geo WHERE parent_id = 4 UNION SELECT geo.id, geo.parent_id, geo.name FROM geo JOIN r ON geo.parent_id = r.id ) SELECT * FROM r; id | parent_id | name ----+-----------+----------------- 5 | 4 |  6 | 4 |  7 | 5 |  8 | 5 | - 9 | 6 |  (5 rows) 


Another example. You can, for example, give everything that relates to Europe together with Europe itself, and also count the level of nesting

 WITH RECURSIVE r AS ( SELECT id, parent_id, name, 1 AS level FROM geo WHERE id = 4 UNION ALL SELECT geo.id, geo.parent_id, geo.name, r.level + 1 AS level FROM geo JOIN r ON geo.parent_id = r.id ) SELECT * FROM r; id | parent_id | name | level ----+-----------+-----------------+------- 4 | 2 |  | 1 5 | 4 |  | 2 6 | 4 |  | 2 7 | 5 |  | 3 8 | 5 | - | 3 9 | 6 |  | 3 (6 rows) 


If you know something interesting on this topic, please write in the comments. Where in practice have you successfully used recursion in SQL? What are the pitfalls?

Source: https://habr.com/ru/post/269497/


All Articles