
On April 8th, the
2018-03 commit fest ended . Those patches that are not fixed on it (and on 3 previous comit fests) will no longer get into the release of PostgreSQL 11: feature freeze has occurred. Time to sum up.
The main news of the last komitfest (and version 11, respectively):
- weighty patch set for partitioning.
- JIT compilation is dedicated to only one patch, but this is a step in a direction that will most likely develop intensively in the future.
- "Covering" indices (INCLUDE-indices). This topic is already actively discussed and continues in the development.
- A series of patches in a group of procedural languages. They are also important for compatibility with SQL standards and migration from Oracle.
- Interesting, but not so resonant patches.
Let's start in random order.
')
JIT compilation
JIT compiling expressions & tuple deformingWithout JIT, executing a query is an interpretation of the execution plan. With JIT, that is, with just-in-time compilation, or on the fly, individual parts of the query are compiled, and thus the query is executed faster. A typical example is JIT compilation of expressions in SQL queries. In this
patch , the author of which
Andres Freund from EnterpriseDB, there is also a psychological component: for JIT, the
LLVM compiler is now involved - it is even called “compiler infrastructure”. It is often used specifically for JIT, it is convenient, allows you to embed functions in the code (inline), optimizes the code, and it is quite versatile from the point of view of the target platforms. When deforming records (expanding lines in memory), LLVM is also used, and this also improves performance.
INCLUDE indexes
Covering B-tree indexes (aka INCLUDE)INCLUDE-indices are a big
patch of Russian walkthrough,
Anastasia Lubennikova began to tear it up 2 years ago and
Alexander Korotkov and Fedor Sigaev continued (all of them from Postgres Professional - yes, we are not indifferent to the domestic contribution to the community). INCLUDE indexes are sometimes called covering indexes, but, strictly speaking, the covering index for a particular query is an index that contains all the columns of the table that are needed in the query. And the meaning of INCLUDE-indexes is that the index can become covering not by adding additional columns to it, but by storing additional information (non-indexed values). For example, in this way you can extend a unique index, which will remain unique.
CREATE UNIQUE INDEX newidx ON newt USING btree (c1, c2) INCLUDING (c3, c4);
They allow you to increase performance, since index only scan is usually much faster, and compared to other methods, go to index only scan, they are less cumbersome: you can do with one index where you need 2 or more, which means less time and resources are spent on update indexes when inserting and updating. The B-tree indexes Covering Patch (aka INCLUDE) is also the first step because support for covering indexes for other types will follow: for example, work has been started on supporting GiST.
Examples are in the
blog of Alexander Alexeev (Postgres Pro) and the
blog of Alexey Lesovsky (Data Egret).
New in procedural languages ​​(on Icentraut)
SQL procedures,Transaction control in procedures,PL / pgSQL nested CALL with transactions,SET TRANSACTION in PL / pgSQL ,
INOUT parameters in proceduresThese patches of authorship of
Peter Izentraut from 2ndQuadrant make the procedural languages ​​of PostgreSQL procedural in the literal sense: in addition to the stored functions, there will now be full-fledged stored procedures. Patch
SQL procedures adopted at the end of last year. Since then, the interpreter has understood the syntax with the CREATE / ALTER / DROP PROCEDURE commands, the call to the CALL procedure, and also the ROUTINE. In January, the most valuable thing was added - transaction management in procedures:
Transaction control in procedures. But this
patch INOUT
allows you to create procedures in this way:
CREATE PROCEDURE foo(INOUT a int) LANGUAGE plpgsql AS $$ begin SELECT 1 into a; end;$$;
Previously, SET TRANSACTION was possible only in SQL, but not in plpgsql.
With the patch, this is already possible, and this is also a step towards meeting migrants from Oracle.
Nested function calls (and DO) with transactions are now also possible
.Sectioning
The NTT team, Amit Langote, developers from 2ndQuadrant and others worked on this large series of patches. Back in November of last year, a
patch was added, adding hash partitioning. Now all the main types of partitioning in PostgreSQL is.
But the most important news is different: a series of patches allows you to create unique indexes, PRIMARY KEY globally on the entire partitioned table (you can find some information about this, for example:
unique indexes on partitioned tables . Therefore, you can create tables that refer to a partitioned table:
foreign keys and partitioned tables . It will be possible to update a partitioned table without thinking about whether a record will remain in the same section (version 10 would give an error):
UPDATE of partition key .
ON CONFLICT DO UPDATE for partitioned tables appeared.That is, you can handle a partitioned table almost as usual. Almost because opportunities will work only if the unique index includes the fields that make up the partition key. But it is a huge step anyway.
As for
Add support for tuple routing to foreign partitions , this patch, which automatically forwards the inserted records to external sections, is also important for those who will create systems with sharding based on new partitioning capabilities.
The most important skill of the optimizer is to effectively exclude from the plan sections in which there is obviously no data (patch
faster partition pruning in planner ). In the case where there are a lot of sections (and in real projects there are thousands, or even tens of thousands), the exclusion of sections (pruning) can seriously reduce the time for executing a request. It will be possible to eliminate undesirable ones at the execution stage (patch
Runtime Partition Pruning ), when the condition for hitting a particular section is not known in advance. This happens, for example, in queries with subqueries.
Partition-wise join for declarative partitioned tables is an implementation of algorithms for joining two partitioned tables. The connection occurs separately in sections, and then gets together. In many cases, this is faster than joining parent tables. Similarly, with
Partition-wise aggregation / grouping, aggregation covers the individual sections first, then the results are collected.
Among the amenities will be
the default section (Default Partition for Range) , which would fall all the records that fall outside the boundaries of the specified sections, so as not to stop every time due to an error. Automatic creation of sections for data whose range is not known in advance is not even planned (it can do the
pg_pathman extension).
JSON (B)
In this direction, efforts have been made for over 2 years by the Postgres Pro development team. Because patches are weighty and affect many postgres mechanisms, they are taken slowly by the community. PostgreSQL 11 includes 3 patches:
Jsonb transform for pl / perl and
Jsonb transform for pl / python by Anton Bykov (effective transformations of binary JSON by transferring them to Perl and Python functions) and
Cast jsonb to numeric, int, float, bool Anastasia Lubennikova (type conversion). But such key patches as
SQL / JSON support in PostgreSQL, or
SQL / JSON: jsonpath , or
SQL / JSON: functionsstill waiting. But this is support for the standard SQL / JSON.
In the context of JSON, we can mention the
Konstantin Knizhnik patch , useful for surjective functions that work with JSON, for example (info - >> 'name'). but may be useful for other purposes.
Parallel Gather and Sort when Creating B-tree Indexes
Gather speed-up works more efficiently with queues in memory, speeds up requests, especially simple ones.
Parallel tuplesort (for parallel B-Tree index creation) . This is also the January patch - parallel sorting of records for B-tree indexes.
index-only count (*) for indexes supporting bitmap scans (
A.Kuzmenkov , Postgres Professional) adopted at the end of last year. Queries of the form SELECT (*) ... WHERE ..., where the information necessary for the query is contained in the indexes in the expression, can now be significantly accelerated.
VACUUM
Not a fundamental change, but still: now you can run VACUUM of several tables with one command:
Allow users to specify multiple tables in VACUUM commands . The patch was adopted at the end of last year. At the same time, the most important patches concerning the
priorities of evacuation of various tables, while waiting for the evacuation
schedule .
Logical replication
It is not much advanced in vanilla PostgreSQL. TRUNCATE support added:
Logical decoding of TRUNCATEChecksums
Verify Checksums during Basebackups . Now you can check the checksums in the backup process (if checksums are included).
The contribution to version 11 of domestic developers is significant. But this is a topic for another story. In the meantime, thanks to all the developers (and reviewers) of the upcoming release!
[photo of the author. in the photo for a freeze, the
hero of the film “Leviathan” - the city of Kirovsk, the Kola Peninsula.]