PostgreSQL 9.5: what's new? Part 1. INSERT ... ON CONFLICT DO NOTHING / UPDATE and ROW LEVEL SECURITY

Part 2. TABLESAMPLE
Part 3. GROUPING SETS, CUBE, ROLLUP
In the 4th quarter of 2015, PostgreSQL 9.5 is expected to be released. As always, the new version, ~~in addition to new bugs,~~ brings new features and "buns". In this article, two of them will be considered, namely INSERT ... ON CONFLICT DO NOTHING / UPDATE and Row-level security. The second alpha version has already been released, so the most impatient can install it and try new functionality.
You can download it here.

INSERT ... ON CONFLICT DO NOTHING / UPDATE

He is colloquially UPSERT. Allows in case of a conflict when inserting to update the fields or ignore the error.

What was previously proposed to be implemented using the stored function will now be available out of the box. You can use the ON CONFLICT DO NOTHING / UPDATE clause in an INSERT statement . In this case, the expression indicates separately conflict_target (by which field / condition the conflict will be considered) and conflict_action (what to do when the conflict occurred: DO NOTHING or DO UPDATE SET ).
')
The complete syntax for an INSERT statement is:

[ WITH [ RECURSIVE ] with_query [, ...] ] INSERT INTO table_name [ AS alias ] [ ( column_name [, ...] ) ] { DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query } [ ON CONFLICT [ conflict_target ] conflict_action ] [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ] where conflict_target can be one of: ( { column_name_index | ( expression_index ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ] ON CONSTRAINT constraint_name and conflict_action is one of: DO NOTHING DO UPDATE SET { column_name = { expression | DEFAULT } | ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) | ( column_name [, ...] ) = ( sub-SELECT ) } [, ...] [ WHERE condition ]

For us, the fun begins after ON CONFLICT.

Let's look at examples. Create a table in which the credentials of certain persons will lie:

 CREATE TABLE account ( id bigserial, name varchar, surname varchar, address varchar, PRIMARY KEY (id), CONSTRAINT unique_person UNIQUE (name, surname, address) ); Query returned successfully with no result in 31 ms.

Execute the insert request

 INSERT INTO account (id, name, surname, address) VALUES (1, '', '', ', ') ON CONFLICT (id) DO NOTHING; Query returned successfully: one row affected, 12 ms execution time. SELECT * FROM ACCOUNT;

id	name	surname	address
one	Vasya	Pupkin	Moscow Kremlin

Here, conflict_target is (id) and conflict_action is DO NOTHING .
If you try to execute this query a second time, then the insertion will not occur, and it will not give you any error message:

 Query returned successfully: 0 rows affected, 12 ms execution time.

If we did not specify ON CONFLICT (id) DO NOTHING , we would get an error:

 INSERT INTO account (id, name, surname, address) VALUES (1, '', '', ', '); ********** Error ********** ERROR: duplicate key value violates unique constraint "account_pkey" SQL state: 23505 Detail: Key (id)=(1) already exists.

The same behavior (as for ON CONFLICT (id) DO NOTHING ) will be for the query:

 INSERT INTO account (id, name, surname, address) VALUES (DEFAULT, '', '', ', ') ON CONFLICT (name, surname, address) DO NOTHING; Query returned successfully: 0 rows affected, 12 ms execution time.

In it, we already take the id value by default (from a sequence), but indicate another conflict_target — in the three fields that are subject to a unique constraint.

As mentioned above, you can also specify a conflict_target using the ON CONSTRAINT construct , specifying the constraint name itself:

 INSERT INTO account (id, name, surname, address) VALUES (DEFAULT, '', '', ', ') ON CONFLICT ON CONSTRAINT unique_person DO NOTHING; Query returned successfully: 0 rows affected, 11 ms execution time.

This is especially useful if you have an exclusion constraint, which you can refer to only by name, and not by a set of columns, as is the case with the restriction of uniqueness.

If you have built a partial unique index, then this can also be specified in the condition. Let only our people named Vasya will have unique combinations of the surname + address in our table:

 ALTER TABLE account DROP CONSTRAINT unique_person; CREATE UNIQUE INDEX unique_vasya ON account (surname, address) WHERE name='';

Then we can write this query:

 INSERT INTO account (id, name, surname, address) VALUES (DEFAULT, '', '', ', ') ON CONFLICT (surname, address) WHERE name='' DO NOTHING; Query returned successfully: 0 rows affected, 12 ms execution time.

And finally, if you want DO NOTHING to be triggered by any uniqueness / exception conflict on insertion, this can be written as follows:

 INSERT INTO account (id, name, surname, address) VALUES (DEFAULT, '', '', ', ') ON CONFLICT DO NOTHING; Query returned successfully: 0 rows affected, 12 ms execution time.

It is worth noting that it is impossible to specify multiple conflict_action , so if one of them is specified and the other is triggered, then there will be an error when pasting:

 INSERT INTO account (id, name, surname, address) VALUES (DEFAULT, '', '', ', ') ON CONFLICT (id) DO NOTHING; ********** Error ********** ERROR: duplicate key value violates unique constraint "unique_person" SQL state: 23505 Detail: Key (name, surname, address)=(, , , ) already exists.

Let us turn to the possibilities of DO UPDATE SET .

For DO UPDATE SET , unlike DO NOTHING, an indication of conflict_action is required.

The DO UPDATE SET construct updates the fields that are listed in it. The values of these fields can be set explicitly, set by default, obtained from a subquery, or taken from the special expression EXCLUDED , from which you can take data that was originally proposed for insertion.

 INSERT INTO account (id, name, surname, address) VALUES (1, '', '', ', ') ON CONFLICT (id) DO UPDATE SET name='', surname=''; Query returned successfully: one row affected, 11 ms execution time. SELECT * FROM ACCOUNT;

id	name	surname	address
one	Petya	Petrov	Moscow Kremlin

 INSERT INTO account AS a (id, name, surname, address) VALUES (1, '', '', ', ') ON CONFLICT (id) DO UPDATE SET name=EXCLUDED.name || ' ( ' || a.name || ')', surname=EXCLUDED.surname || ' ( ' || a.surname || ')'; Query returned successfully: one row affected, 13 ms execution time. SELECT * FROM ACCOUNT;

id	name	surname	address
one	Petya (former Vasya)	Petrov (former Pupkin)	Moscow Kremlin

 INSERT INTO account (id, name, surname, address) VALUES (1, '', '', ', ') ON CONFLICT (id) DO UPDATE SET name=DEFAULT, surname=DEFAULT; Query returned successfully: one row affected, 11 ms execution time. SELECT * FROM ACCOUNT;

id	name	surname	address
one	Null	Null	Moscow Kremlin

 INSERT INTO account (id, name, surname, address) VALUES (1, '', '', ', ') ON CONFLICT (id) DO UPDATE SET name=(SELECT some_field FROM other_table LIMIT 1);

The WHERE clause can also be used. For example, we want the name field not to be updated if the address field in the table row already contains the text “Kremlin”, otherwise it would be added:

 INSERT INTO account AS a (id, name, surname, address) VALUES (1, '', '', ', ') ON CONFLICT (id) DO UPDATE SET name=EXCLUDED.name WHERE a.name not like '%%'; Query returned successfully: 0 rows affected, 12 ms execution time.

And if we want the name field not to be updated, if the address field in the inserted data contains the text "Kremlin", otherwise, it was added:

 INSERT INTO account AS a (id, name, surname, address) VALUES (1, '', '', ',  ') ON CONFLICT (id) DO UPDATE SET name=EXCLUDED.name WHERE EXCLUDED.name not like '%%'; Query returned successfully: one row affected, 11 ms execution time. SELECT * FROM ACCOUNT

id	name	surname	address
one	Vasya	Null	Moscow Kremlin

ROW LEVEL SECURITY

Row-level security or row-level security is a mechanism for delimiting access to information to the database, which allows users to restrict access to individual rows in the tables.

This functionality may be interesting to those who use databases with a large number of users.

It works as follows: the rules for a specific table are described, according to which access to specific rows is limited when executing certain commands, using the CREATE POLICY expression. Each rule contains a certain logical expression that must be true for the string to be visible in the query. The rules are then activated using the ALTER TABLE ... ENABLE ROW LEVEL SECURITY statement. Then when trying to access, for example, when a SELECT query is made, it is checked whether the user has the right to access a specific row and if not, they are not shown to him. The superuser can see all lines by default, since his default flag is BYPASSRLS, which means that no checks will be performed for this role.

The syntax for the CREATE POLICY statement is :

 CREATE POLICY name ON table_name [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ] [ TO { role_name | PUBLIC | CURRENT_USER | SESSION_USER } [, ...] ] [ USING ( using_expression ) ] [ WITH CHECK ( check_expression ) ]

Rules are created for specific tables, so there can be several rules in the database with the same name for different tables.
After the FOR statement, it is indicated for which requests the rule is applied, the default is ALL , that is, for all requests.

After TO - for which roles, by default - PUBLIC , that is, for all roles.

Further, in the USING expression, a Boolean expression is specified, which must be true in order for the specific string to be visible to the user in queries that use already existing data ( SELECT , UPDATE , DELETE ). If the boolean expression returns null or false, then the string will not be visible.

The WITH CHECK statement specifies a Boolean expression that must be true for the query to add or modify data ( INSERT or UPDATE ) to succeed. If the boolean expression returns null or false, then there will be an error. The WITH CHECK clause is executed after BEFORE triggers (if present) and before any other checks. Therefore, if a BEFORE trigger modifies a string so that the condition does not return true, there will be an error. For UPDATE to succeed, it is necessary that both conditions return true, including if a conflict occurs in the INSERT ... ON CONFILCT DO UPDATE request and the request attempts to modify the data. If the WITH CHECK clause is omitted, the condition from the USING clause will be substituted instead.
Under conditions, aggregate or window functions cannot be used.

Usually, it is required to control access based on which database user requests the data, so we can use functions that return System Information ( System Information Functions ).

Let us turn to examples:

Add a db_user field to the account table, fill this field for an already existing record and add new records:

 ALTER TABLE account ADD COLUMN db_user varchar; Query returned successfully with no result in 16 ms. UPDATE account SET db_user='pupkin' WHERE surname=''; INSERT INTO account (name, surname, address, db_user) VALUES ('', '', ',  ', 'petrov'), ('', '', '-,  ', 'sidorov'); Query returned successfully: 2 rows affected, 31 ms execution time.

Create roles:

 CREATE ROLE pupkin WITH LOGIN PASSWORD 'pupkin'; CREATE ROLE petrov WITH LOGIN PASSWORD 'petrov'; Query returned successfully with no result in 31 ms.

Create a rule and enable RLS on the table:

 CREATE POLICY select_self ON account FOR SELECT USING (db_user=current_user); ALTER TABLE account ENABLE ROW LEVEL SECURITY; Query returned successfully with no result in 12 ms.

In this query, we created a rule according to which, the user in the SELECT query will see only those rows in which the value of the db_user field coincides with the name of the current database user.

Run a request from the postgres user:

 SELECT * FROM account

id	name	surname	address	db_user
one	Vasya	Pupkin	Moscow Kremlin	pupkin
five	Peter	Petrov	Moscow Red Square	petrov
6	Ivan	Sidorov	St. Petersburg, Winter Palace	sidorov

Perform the same request from the pupkin user:

id	name	surname	address	db_user
one	Vasya	Pupkin	Moscow Kremlin	pupkin

Let's create a rule according to which only the pupkin user can insert lines with the last name "Pupkin":

 CREATE POLICY insert_update_pupkin ON account WITH CHECK (surname<>'' OR current_user='pupkin')

Let's try to fulfill the request from the pupkin user:

 INSERT INTO account (name, surname, address) VALUES ('', '', ', ') Query returned successfully: one row affected, 13 ms execution time.

Check:

 select * from account;

id	name	surname	address	db_user
one	Vasya	Pupkin	Moscow Kremlin	pupkin

Op-pa! We forgot to specify the db_user field and the record that we inserted, we will not see. Well, let's fix this logic with the help of a trigger in which we fill in the db_user field with the name of the current user:

 CREATE OR REPLACE FUNCTION fill_db_user() RETURNS TRIGGER AS $BODY$ BEGIN NEW.db_user = current_user; RETURN NEW; END; $BODY$ LANGUAGE plpgsql VOLATILE; CREATE TRIGGER fill_db_user BEFORE INSERT ON account FOR EACH ROW EXECUTE PROCEDURE fill_db_user();

We try again:

 INSERT INTO account (name, surname, address) VALUES ('', '', ', '); select * from account;

id	name	surname	address	db_user
one	Vasya	Pupkin	Moscow Kremlin	pupkin
21	Ivan	Pupkin	Kiev, Maidan	pupkin

Let's try to change the data on Ivan Pupkin user petrov:

 UPDATE account SET db_user='petrov' WHERE id=21 Query returned successfully: 0 rows affected, 13 ms execution time.

As you can see, the data has not changed, it happened because the USING condition from the select_self rule was not fulfilled.

If several rules correspond to one request, they are combined via OR .

It is worth noting that the rules apply only when explicit requests are made to the tables and do not apply when the system performs checks (constaints, foreign keys, etc.). This means that the user, using queries, determine that any value exists in the database. For example, if a user can insert into a table that refers to another table, from which he cannot make a SELECT . In such a case, it may try to make an INSERT in the first table and by the result (it was inserted or an error occurred while checking referential integrity) to determine if a value exists in the second table.

There are many ways to use row-level security:

the same database is used by several applications with different functionality
multiple instances of the same application with different rights
access by roles or user groups
etc.

In the following parts, I plan to consider such new features of PostgreSQL 9.5 as:

Part 2. TABLESAMPLE
SKIP LOCKED
BRIN indices
GROUPING SETS, CUBE, ROLLUP
New features for JSONB
IMPORT FOREIGN SCHEMA
other

Source: https://habr.com/ru/post/264281/

All Articles

PostgreSQL 9.5: what's new? Part 1. INSERT ... ON CONFLICT DO NOTHING / UPDATE and ROW LEVEL SECURITY

INSERT ... ON CONFLICT DO NOTHING / UPDATE

ROW LEVEL SECURITY

More articles: