📜 ⬆️ ⬇️

PostgreSQL: materialized views and FDW



You probably know that Postgres has materialized views and foreign data wrappers (FDW) . Materialized views allow you to materialize queries and update them on demand. Third-party data wrappers provide the functionality to load data from external sources, such as, for example, NoSQL storage or other Postgres servers.


It is likely that you have not yet considered using the materialized views together with wrappers for third-party data. Materialized views speed up data access: query results are saved and there is no need to perform them again. Access to third-party data through FDW can be quite slow, as they are on other systems. By combining these functions, you can eventually get quick access to third-party data.


Let's confirm this with practice! First, create a foreign table (foreign table):


CREATE DATABASE fdw_test; \connect fdw_test; CREATE TABLE world (greeting TEXT); \connect test CREATE EXTENSION postgres_fdw; CREATE SERVER postgres_fdw_test FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'fdw_test'); CREATE USER MAPPING FOR PUBLIC SERVER postgres_fdw_test OPTIONS (password ''); CREATE FOREIGN TABLE other_world (greeting TEXT) SERVER postgres_fdw_test OPTIONS (table_name 'world'); \det List of foreign tables Schema | Table | Server --------+-------------+------------------- public | other_world | postgres_fdw_test 

fill it with data:


 INSERT INTO other_world SELECT * FROM generate_series(1, 100000); 

and create a materialized view based on a third-party table:


 CREATE MATERIALIZED VIEW mat_view (first_letter, count) AS SELECT left(greeting, 1), COUNT(*) FROM other_world GROUP BY left(greeting, 1); 

Now we can compare the sampling time from third-party tables and materialized views:


 \timing SELECT left(greeting, 1) AS first_letter, COUNT(*) FROM other_world GROUP BY left(greeting, 1); first_letter | count --------------+------- 1 | 11112 2 | 11111 3 | 11111 4 | 11111 5 | 11111 6 | 11111 7 | 11111 8 | 11111 9 | 11111 Time: 354.571 ms SELECT * FROM mat_view; first_letter | count --------------+------- 1 | 11112 2 | 11111 3 | 11111 4 | 11111 5 | 11111 6 | 11111 7 | 11111 8 | 11111 9 | 11111 Time: 0.783 ms 

The materialized view turned out to be much faster, but not everything is so rosy, because its refresh takes almost as much time as sampling from a third-party table:


 REFRESH MATERIALIZED VIEW mat_view; Time: 364.889 ms 

The above commands were executed in Postgres 9.6. However, already in the tenth version there appeared such an improvement:


Perform aggregate functions on FDW servers whenever possible (Jeevan Chalke, Ashutosh Bapat).

Thanks to it, it is possible to reduce the amount of data transmitted from the FDW server, as well as to relieve the aggregation load from the requesting server. This optimization is implemented in a wrapper of third-party data postgres_fdw , which is also able to perform joins on third-party servers (using extensions).


In Postgres 10, aggregations of third-party tables are faster than in 9.6, but still slower than samples from materialized views:


 SELECT left(greeting, 1) AS first_letter, COUNT(*) FROM other_world GROUP BY left(greeting, 1); first_letter | count --------------+------- 1 | 11112 2 | 11111 3 | 11111 4 | 11111 5 | 11111 6 | 11111 7 | 11111 8 | 11111 9 | 11111 Time: 55.052 ms 

Using aggregates in materialized views is not necessary at all - you can simply copy the entire table and update the corresponding view as needed (but the logical replication in Postgres 10 is even better for this):


 CREATE MATERIALIZED VIEW mat_view2 AS SELECT * FROM other_world; 

Now we can compare the speed of the query to a third-party table and its local copy:


 \o /dev/null SELECT * FROM other_world; Time: 317.428 ms SELECT * FROM mat_view2; Time: 34.861 ms 

In conclusion, the materialized views and wrappers of third-party data work fine together. With the help of materialized views, you can create local copies (caches) of entire external tables or aggregated data (samples) from these tables. Refreshing such a cache is very simple: refresh materialized view . At the same time, improvements appeared in Postgres 10 that speed up queries with aggregate functions to third-party tables.


References:


  1. Original: Materialized Views and Foreign Data Wrappers .

')

Source: https://habr.com/ru/post/337632/


All Articles