
Release PostgreSQL 11 will not be soon, only in October. But feature freezes have already arrived, which means we know which features were included in this release, and we can test them by
collecting PostgreSQL from the master branch. Particularly noteworthy feature called
INCLUDE-indices .
The patch was originally written by
Anastasia Lubennikova , and then doped by
Alexander Korotkov and
Fyodor Sigaev . Pushing it into PostgreSQL took “only” something for about three years.
Let's try to figure out what kind of indices for such. To begin with we will create a tablet for experiences:
create table test (k serial primary key, v text, ts timestamp); insert into test (v, ts) select 'key_' || s , now() from generate_series(1, 10000) as s;
... and build on it the usual btree-index:
')
create index on test (v);
Take a look at the execution plan for the following query:
=# explain select v, ts from test where v > 'key_1337' and v < 'key_2337'; QUERY PLAN ----------------------------------------------------------------------------- Bitmap Heap Scan on test (cost=31.57..112.09 rows=1101 width=16) Recheck Cond: ((v > 'key_1337'::text) AND (v < 'key_2337'::text)) -> Bitmap Index Scan on test_v_idx (cost=0.00..31.29 rows=1101 width=0) Index Cond: ((v > 'key_1337'::text) AND (v < 'key_2337'::text)) (4 rows)
See what happens. Since the index is built on the v column, and in the query we select v and ts, PostgreSQL is forced to execute the query in two steps. First, it goes by the index and finds the rows that satisfy the condition. Then he has to go to the table to get ts.
The idea of ​​INCLUDE-indexes is to include all the data necessary for the execution of the query directly into the index (but not index them). Thus, the request becomes possible to execute in one index scan.
Let's check:
drop index test_v_idx; create index on test (v) include (ts); explain select v, ts from test where v > 'key_1337' and v < 'key_2337';
Result:
Index Only Scan using test_v_ts_idx on test (cost=0.29..46.30 rows=1101 width=16) Index Cond: ((v > 'key_1337'::text) AND (v < 'key_2337'::text)) (2 rows)
Due to the fact that now we do not go to the table, the query should work faster. However, it is worth noting that in practice everything depends on your data. Each case is unique, so I deliberately do not cite any
synthetic benchmarks here . It may turn out that on your volumes of data, index only scan with include indexes works as fast as in the case of regular indexes. And even the accumulated statistics tell PostgreSQL that the query is faster to make a heap scan. This can happen, for example, if your query selectivity is low.
Anyway, knowing about this feature is useful, and I am sincerely glad that it will appear in PostgreSQL 11.