On September 15, Avito hosted a mitap, where we talked about scaling applications on PostgreSQL. Today I want to share materials from him - videos, presentations from speakers, show photos. Also under the cut publish the analysis of the quiz questions that we conducted here on Habré, in front of the mitap. And I talk about my impressions of the meeting.
Stas told about distributed transactions and time travel.
Reviews:
Stas and his team did an excellent job! I hope their decision will be approved by the community and we will see this solution in the new Postgres version.
I gave a talk about scaling the application on PostgreSQL in Avito and shared our tips and tricks.
Reviews:
An interesting approach: in which Kostantin very fascinatingly and intelligibly explained what problems can be encountered when working with data in the microservice architecture, and also suggested ways to solve when scaling an IP. Saga remembered :)
Michael prepared a report on logical replication and PostgreSQL transaction isolation levels.
Reviews:
Michael highlighted the subtle moments of transactions that are not immediately visible not only to beginners. Everyone needs to know about it.
Sergey told the audience about the structure of OZO, an asynchronous type-safe header-only PostgreSQL client library for C ++ 17, and invited a contributor to it.
Reviews:
The author, in my opinion, managed in a very short time to sufficiently review problems in existing libraries and solutions in new C ++ libraries. Therefore, I will be glad if these libraries will be developed in opensource, all the more basic things have already been implemented, which is good news.
Before the mitap, we suggested you answer questions about Postgres. Today I want to show the correct answers. They are under the spoilers (just in case).
There is an empty table with no users ("UserId" int, "balance" int) records. What will be returned as a result of the request?
with ins as ( insert into users select gs, gs * 10 from generate_series(1, 4) gs where gs%2 = 0) select * from users;
Nothing.
That returns the query select * from users where UserId = 10;
when accessing the users table after the previous task?
ERROR: column "userid" does not exist.
Enum CREATE TYPE status AS ENUM ('wait', 'init', 'run', 'stop') is defined; Which command can remove the value of 'init'?
There is no standard way to remove a value from enum.
How can I get a list of functions in PostgreSQL?
select * From pg_proc;
What will be returned as a result of the request?
select null = null, null is null, 1::smallint::boolean is true, null::bigint > 1
ERROR: cant be cast type smallint to boolean.
Junior developer Vasya was instructed to write a query that displays all entries from the parent
table for which there are no entries in the child
table.
Data scheme:
create table parent (parent_id serial primary key, payload text); create table child (child_id serial primary key, parent_id integer unique references parent (parent_id));
Vasya tried very hard and did not want to lose his face, so he invented eight different requests to solve the problem:
-- 0 select p.parent_id, p.payload from parent p where not exists(select from child c where c.parent_id = p.parent_id); -- 1 select p.parent_id, p.payload from parent p where not (array[p.parent_id] && array(select c.parent_id from child c)); -- 2 select distinct p.parent_id, p.payload from parent p full join child c on (c.parent_id = p.parent_id) where c.parent_id is null; -- 3 select p.parent_id, p.payload from parent p where p.parent_id not in (select c.parent_id from child c); -- 4 select p.parent_id, p.payload from parent p left join child c on (c.parent_id = p.parent_id) where c.parent_id is null; -- 5 with w_child_with_parents as ( select c.parent_id, ( select count(*) from parent p where c.parent_id = p.parent_id) = 1 as parent_exists from child c) select p.parent_id, p.payload from parent p where p.parent_id in (select pc.parent_id from w_child_with_parents pc where not pc.parent_exists); -- 6 select p.parent_id, p.payload from parent p full join child c on (c.parent_id = p.parent_id) group by p.parent_id, p.payload having count(c) = 0; -- 7 select p.parent_id, p.payload from parent p where p.parent_id in ( select p2.parent_id from parent p2 except all select c2.parent_id from child c2);
Vasya presented his options to you so that you could help him choose the best one. He claims that all requests work in the same way: tables are placed in memory and the performance difference is not significant (or even invisible). However, you, as a more experienced developer, have noticed that perhaps not all requests solve the problem. List the requests that do not solve the problem (and explain why).
The assigned task is not solved by requests 2, 3 and 5 (in some cases also request 1).
The “incorrectness” of behavior manifests itself when there are entries in the child table with parent_id is null.
insert into parent (parent_id, payload) values (1, 'payload 1'), (2, 'payload 2'), (3, 'payload 3'), (4, 'payload 4'), (5, 'payload 5'); insert into child (child_id, parent_id) values (1, 1), (2, 3), (3, null), (5, 5);
Request 1: if the intarray extension is installed in the database ( https://www.postgresql.org/docs/current/static/intarray.html ), the request crashes with the error "ERROR: array must not contain nulls". This behavior is due to the fact that the extension overrides the standard operators and changes the behavior for arrays containing null-elements.
The documentation says the following:
It should be noted that the operators &&, @> and <@ This restriction makes it possible for many operators.
Request 2: due to full join, an extra string (null, null) appears in the result.
Query 3: returns an empty resultset due to the fact that in the set formed by the subquery there are null elements.
Documentation ( https://www.postgresql.org/docs/current/static/functions-subquery.html#FUNCTIONS-SUBQUERY-NOTIN ):
If you’re not a lane, you’ll not be able to make it. This is in accordance with the SQL rules for the Boolean combinations of null values.
Request 5: returns an empty resultset because the w_child_with_parents section is run, the child and parent_id are empty or are not reflected in the section at all from the child table.
Three quiz questions correctly answered all the quiz questions. One we handed the prize at the mitap, two more sets of souvenirs went by mail.
More than a hundred people came to the mitap. It was very nice to meet such an audience. According to the survey, more than 60% of guests of the mitap have more than five years of experience working with databases. And it is very nice when reports receive such a lively response from the audience:
On the sidelines of the meeting they talked a lot about the fact that PostgreSQL is becoming an increasingly common tool. It really is. Taking this opportunity, I will say that we in Avito plan to expand the DBA team, and if you are interested in ambitious tasks on a large project, take a look at the vacancy on My Circle or write to me.
And in conclusion, I want to thank my colleagues from Yandex, Postgres Professional and, of course, Avito, for the wonderful presentations we have heard. Thanks to the guests who came to us on this Saturday day and live viewers. And of course, the #RuPostgres community for their trust.
Playlist with all reports here .
Photo reports we posted on Facebook and VKontakte .
See you again!
Source: https://habr.com/ru/post/424249/
All Articles