
A table as a parameter in PostgreSQL

One often sees complaints that parameters "do not work." How exactly do they not work?

Like this:

select * from $1 where ...; 
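You can see the failure for yourself by trying to prepare such a statement. Below is a minimal sketch; the exact error text may vary between versions, but the point is that it fails already at parse time:

 -- a parameter cannot stand in table position, so preparation fails outright
 PREPARE q(text) AS select * from $1;
 -- ERROR:  syntax error at or near "$1"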

And indeed it does not work: the table must be known to the server at the time the query is prepared. So it turns out to be absolutely impossible to pass a table as a parameter? And if not as a parameter, then how? And what if the column values in one of my tables are the names of other tables, what to do then? It is not all that scary: you can use functions. Let's create a simple and straightforward function that executes a query and returns the result to us:
 create or replace function doSelect(query text) returns setof record as $code$
 begin
     return query execute query;
 end;
 $code$ language plpgsql;

In this case we pass not the name of a table but the whole query at once. And in fact we may not need all of the table's rows, only some of them, so why not make the server select exactly the rows we need right away? When selecting from a function, the server has to filter all of the returned rows, keeping only the ones required, whereas inside the query it can use, for example, indexes, not to mention that the query may be more complex than a plain scan of the whole table.

Check:

 work=# select * from doSelect('select relname::text from pg_class') as ds(name text) where name='aa';
  name
 ------
  aa
 (1 row)

Works.
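To see why it pays to push the filter into the dynamic query itself, as described above, compare the two calls below. This is a minimal sketch assuming a hypothetical table big with an index on its column n:

 -- hypothetical setup: a large table with an index
 create table big as select n from generate_series(1, 1000000) as gs(n);
 create index big_n_idx on big(n);

 -- filter inside the dynamic query: the inner plan can use the index
 select * from doSelect('select n from big where n = 42') as r(n int);

 -- filter outside: doSelect returns all million rows,
 -- and the server then discards all but the matching ones
 select * from doSelect('select n from big') as r(n int) where n = 42;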

In principle, you can go further and add parameters:

 create or replace function doSelect(query text, p1 text) returns setof record as $code$
 begin
     return query execute query using p1;
 end;
 $code$ language plpgsql;

 create or replace function doSelect(query text, p1 text, p2 text) returns setof record as $code$
 begin
     return query execute query using p1, p2;
 end;
 $code$ language plpgsql;

… and so on.
Despite some clumsiness (you cannot pass an array of parameters to execute ... using: arrays are collections of elements of one type, while the parameters may, generally speaking, have different types), this all works fine:

 work=# select * from doSelect(format('select table_catalog::text, table_schema::text, table_name::text from %s where table_schema=$1 limit $2::bigint', 'information_schema.tables'), 'public', 1::text) as i(table_catalog text, table_schema text, table_name text);
  table_catalog | table_schema | table_name
 ---------------+--------------+------------
  work          | public       | aa
 (1 row)

In order not to create many similar functions, you can simply create a function with a variable number of parameters:

 create or replace function doSelect(query text, variadic param text[]) returns setof record as $code$
 begin
     return query execute query using param;
 end;
 $code$ language plpgsql;

True, when forming the query string you will have to use a somewhat awkward syntax, $1[N], where N is the ordinal number of the parameter passed. So the query above becomes:

 work=# select * from doSelect(format('select table_catalog::text, table_schema::text, table_name::text from %s where table_schema=$1[1] limit $1[2]::bigint', 'information_schema.tables'), 'public', 1::text) as i(table_catalog text, table_schema text, table_name text);
  table_catalog | table_schema | table_name
 ---------------+--------------+------------
  work          | public       | aa
 (1 row)

The essence of the problem is that Postgres cannot have arrays whose elements are of different types, so in the functions above everything is reduced, as you can see, to the text type, which in turn requires explicit casts to the needed types in the body of the query (curiously, limit requires a bigint). Nevertheless, it all works. Most interestingly, these functions can be used not only with parameters passed from the application: they can also be used with columns of another table, for example:

 work=# select table_name, cnt.cnt from information_schema.tables t, doSelect(format('select count(*) from %s', t.table_name)) as cnt(cnt bigint) where table_schema='public';
   table_name   |   cnt
 ---------------+----------
  aa            | 10000000
  ttq           |        3
  report        |       12
  colltest      |   100000
  t2            |   100000
  tpair         |    10000
  call          |        0
  XXXXXXX_locks |        1
  t             |        2
  sbr           |   273370
  stest         |      954
  house         |    21000
  ttn           |  1000000
  addrobj       |    21000
  tt1           |        1
  tt2           |        1
  ttt           |   100000
  tt            |        1
  t1            |    10000
 (19 rows)

Note that in the function call we refer to a column of the table located to its left in the from list.
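This works because a function in from may refer to columns of preceding from items, an implicit lateral reference. Since PostgreSQL 9.3 the same join can also be written with the LATERAL keyword, which makes the dependency explicit; a sketch of the equivalent query:

 select table_name, cnt.cnt
 from information_schema.tables t,
      lateral doSelect(format('select count(*) from %s', t.table_name)) as cnt(cnt bigint)
 where table_schema = 'public';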

So a table can be used as a parameter quite well after all. It is worth noting, however, that during execution a separate plan will be built for the dynamic query on each function call, which, of course, costs some resources; on the other hand, this can often turn out to be a perfectly desirable side effect.
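If you want to actually see those per-call plans, the auto_explain module can log the plans of statements executed inside functions. A minimal sketch, assuming you have the rights to LOAD the module (this typically requires superuser):

 LOAD 'auto_explain';
 SET auto_explain.log_min_duration = 0;        -- log the plan of every statement
 SET auto_explain.log_nested_statements = on;  -- include statements run inside functions
 select * from doSelect('select relname::text from pg_class') as ds(name text);
 -- the plan built for the dynamic query now shows up in the server log on each call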
How much does performance suffer with this approach? Oddly enough, at least in simple cases, very little:

 create table tableasparameter as select n from generate_series(1,1000) as gs(n);

 work=# explain analyze select tt.* from generate_series(1,10000), tableasparameter tt;
                                                             QUERY PLAN
 --------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..12526.50 rows=1000000 width=4) (actual time=1.919..1347.240 rows=10000000 loops=1)
    ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=0) (actual time=1.896..2.508 rows=10000 loops=1)
    ->  Materialize  (cost=0.00..19.00 rows=1000 width=4) (actual time=0.000..0.042 rows=1000 loops=10000)
          ->  Seq Scan on tableasparameter tt  (cost=0.00..14.00 rows=1000 width=4) (actual time=0.017..0.219 rows=1000 loops=1)
  Planning time: 0.068 ms
  Execution time: 1648.586 ms
 (6 rows)

 work=# explain analyze
 work-# select tt.* from generate_series(1,10000), doSelect('select * from tableasparameter') as tt(val int);
                                                           QUERY PLAN
 ----------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.25..20010.25 rows=1000000 width=4) (actual time=1.294..1401.768 rows=10000000 loops=1)
    ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=0) (actual time=1.033..1.590 rows=10000 loops=1)
    ->  Function Scan on doselect tt  (cost=0.25..10.25 rows=1000 width=4) (actual time=0.000..0.047 rows=1000 loops=10000)
  Planning time: 0.039 ms
  Execution time: 1705.056 ms
 (5 rows)

As you can see, there is some loss in performance, but on the whole it is not very significant.

Source: https://habr.com/ru/post/278019/

