How a PostgreSQL query had to be optimized, and what came out of it all.
Why did it have to be done? Because for the previous four years everything had been working quietly and calmly, like a ticking clock.
As an epigraph, and based on real events: all names have been changed, and any coincidences are random.

When a certain result is achieved, it is always interesting to recall what gave the initial impetus and where it all began.
The end result was briefly described in the article "Synthesis as one of the methods for improving PostgreSQL performance". Here it will probably be fun to reconstruct the chain of events that led to it.
History has preserved the exact start date: 2018-09-10 18:02:48.
It has also preserved the query that started it all.
The problem query:

SELECT
    p."PARAMETER_ID" AS parameter_id,
    pd."PD_NAME" AS pd_name,
    pd."CUSTOMER_PARTNUMBER" AS customer_partnumber,
    w."LRM" AS LRM,
    w."LOTID" AS lotid,
    w."RTD_VALUE" AS RTD_value,
    w."LOWER_SPEC_LIMIT" AS lower_spec_limit,
    w."UPPER_SPEC_LIMIT" AS upper_spec_limit,
    p."TYPE_CALCUL" AS type_calcul,
    s."SPENT_NAME" AS spent_name,
    s."SPENT_DATE" AS spent_date,
    extract(year from "SPENT_DATE") AS year,
    extract(month from "SPENT_DATE") AS month,
    s."REPORT_NAME" AS report_name,
    p."STPM_NAME" AS stpm_name,
    p."CUSTOMERPARAM_NAME" AS customerparam_name
FROM wdata w,
     spent s,
     pmtr p,
     spent_pd sp,
     pd pd
WHERE s."SPENT_ID" = w."SPENT_ID"
  AND p."PARAMETER_ID" = w."PARAMETER_ID"
  AND s."SPENT_ID" = sp."SPENT_ID"
  AND pd."PD_ID" = sp."PD_ID"
  AND s."SPENT_DATE" >= '2018-07-01' AND s."SPENT_DATE" <= '2018-09-30'
  AND s."SPENT_DATE" = (SELECT MAX(s2."SPENT_DATE")
                        FROM spent s2,
                             wdata w2
                        WHERE s2."SPENT_ID" = w2."SPENT_ID"
                          AND w2."LRM" = w."LRM");
The problem description was predictably standard: "Everything is bad. Tell us what the problem is."
It immediately brought to mind a joke from the days of 3.5-inch floppy drives:
A lamer comes to a hacker:
"Nothing works for me, tell me where the problem is."
"In your DNA..."
But performance incidents cannot be resolved that way, of course. "They might not understand us" (c). We need to understand.
Well, let's dig. Maybe we will dig something up.

The investigation begins
So, here is what can be seen immediately with the naked eye, without even resorting to EXPLAIN:
- JOIN syntax is not used. This is bad, especially when there is more than one join.
- Worse still, there is a correlated subquery, and with aggregation on top of that. This is very bad.
Of course, this is bad. But only on the one hand. On the other hand, it is very good, because the task definitely has a solution and the query can be improved.
No need to go to a fortune-teller (c). The query plan is not that complicated, but it is quite telling.
The most interesting and useful parts, as usual, are at the beginning and the end:

Nested Loop  (cost=935.84..479763226.18 rows=3322 width=135) (actual time=31.536..8220420.295 rows=8111656 loops=1)
...
Planning time: 3.807 ms
Execution time: 8222351.640 ms

Run time: more than 2 hours.
False hypotheses that ate up time

Hypothesis 1: the optimizer is mistaken and builds a wrong plan

To visualize the execution plan we used the site https://explain.depesz.com/. However, the site showed nothing interesting or useful: at first and second glance, nothing that could really help, except perhaps the full scans. Moving on.
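For reference, the text-format plan that the site consumes can be obtained along these lines (a generic sketch; note that the ANALYZE option actually executes the query, so on a query this slow it takes the full run time):

EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the problem query from above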
Hypothesis 2: autovacuum is hitting the database, and we need to get rid of the brakes

But the autovacuum daemons behave well; there are no long-hanging processes and no serious load from them. We need to look for something else.
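The check itself is simple; something along these lines is enough (a sketch using the standard pg_stat_activity view):

-- Are any autovacuum workers running right now, and for how long?
SELECT pid, state, now() - xact_start AS running_for, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';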
Hypothesis 3: the statistics are outdated, and everything needs to be recalculated

Again, no. The statistics are up to date, which, given the absence of autovacuum problems, is not surprising.
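Verifying that is also a quick look at the standard statistics view (and, if anything is stale, a manual refresh):

-- When were the tables last analyzed?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('wdata', 'spent', 'pmtr', 'spent_pd', 'pd');

-- If needed, statistics can be refreshed by hand:
ANALYZE wdata;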
We begin to optimize

The main table 'wdata' is certainly not small: almost 27 million records.
And it is this table that gets a full scan.
Hash Cond: ((w."SPENT_ID" = s."SPENT_ID") AND ((SubPlan 1) = s."SPENT_DATE"))
  ->  Seq Scan on wdata w  (cost=0.00..574151.49 rows=26886249 width=46) (actual time=0.005..8153.565 rows=26873950 loops=1)
We act in the standard way: "come on, let's add an index and everything will fly".
An index was created on the "SPENT_ID" column.
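The DDL itself is not shown in the original; assuming the index went on the sequentially scanned table wdata, it would have looked something like this (the index name is illustrative):

-- On a live system, CREATE INDEX CONCURRENTLY avoids blocking writes.
CREATE INDEX ix_wdata_spent_id ON wdata ("SPENT_ID");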
As a result:

Query plan with the index

Well, did it help?
It was: 8,222,351.640 ms (a little more than 2 hours).
It became: 6,985,431.575 ms (almost 2 hours).
In general, the same apples, viewed from the side. Remember the classic: "Do you have the same one, but without wings? We'll keep looking."
In principle, this could be called a result: not a good one, but an acceptable one. At the very least, enough to hand the customer a big report describing how much was done and why it was good.
Still, the final solution is far away. Very far.
And now the most interesting part: we continue the optimization and polish the query.

Step one: use JOIN

The rewritten query now looks like this (at the very least, it is prettier):
The JOIN query:

SELECT
    p."PARAMETER_ID" AS parameter_id,
    pd."PD_NAME" AS pd_name,
    pd."CUSTOMER_PARTNUMBER" AS customer_partnumber,
    w."LRM" AS LRM,
    w."LOTID" AS lotid,
    w."RTD_VALUE" AS RTD_value,
    w."LOWER_SPEC_LIMIT" AS lower_spec_limit,
    w."UPPER_SPEC_LIMIT" AS upper_spec_limit,
    p."TYPE_CALCUL" AS type_calcul,
    s."SPENT_NAME" AS spent_name,
    s."SPENT_DATE" AS spent_date,
    extract(year from "SPENT_DATE") AS year,
    extract(month from "SPENT_DATE") AS month,
    s."REPORT_NAME" AS report_name,
    p."STPM_NAME" AS stpm_name,
    p."CUSTOMERPARAM_NAME" AS customerparam_name
FROM wdata w
    INNER JOIN spent s ON w."SPENT_ID" = s."SPENT_ID"
    INNER JOIN pmtr p ON p."PARAMETER_ID" = w."PARAMETER_ID"
    INNER JOIN spent_pd sp ON s."SPENT_ID" = sp."SPENT_ID"
    INNER JOIN pd pd ON pd."PD_ID" = sp."PD_ID"
WHERE
    s."SPENT_DATE" >= '2018-07-01' AND s."SPENT_DATE" <= '2018-09-30'
    AND s."SPENT_DATE" = (SELECT MAX(s2."SPENT_DATE")
                          FROM wdata w2
                              INNER JOIN spent s2 ON w2."SPENT_ID" = s2."SPENT_ID"
                          WHERE w2."LRM" = w."LRM");
Planning time: 2.486 ms
Execution time: 1223680.326 ms

So, the first result.
It was: 6,985,431.575 ms (almost 2 hours).
It became: 1,223,680.326 ms (a little more than 20 minutes).
A good result. In principle, we could have stopped here again. But that would be boring: we cannot stop.

Step two: get rid of the correlated subquery

The modified query text:
The query without the correlated subquery:

SELECT
    p."PARAMETER_ID" AS parameter_id,
    pd."PD_NAME" AS pd_name,
    pd."CUSTOMER_PARTNUMBER" AS customer_partnumber,
    w."LRM" AS LRM,
    w."LOTID" AS lotid,
    w."RTD_VALUE" AS RTD_value,
    w."LOWER_SPEC_LIMIT" AS lower_spec_limit,
    w."UPPER_SPEC_LIMIT" AS upper_spec_limit,
    p."TYPE_CALCUL" AS type_calcul,
    s."SPENT_NAME" AS spent_name,
    s."SPENT_DATE" AS spent_date,
    extract(year from "SPENT_DATE") AS year,
    extract(month from "SPENT_DATE") AS month,
    s."REPORT_NAME" AS report_name,
    p."STPM_NAME" AS stpm_name,
    p."CUSTOMERPARAM_NAME" AS customerparam_name
FROM wdata w
    INNER JOIN spent s ON s."SPENT_ID" = w."SPENT_ID"
    INNER JOIN pmtr p ON p."PARAMETER_ID" = w."PARAMETER_ID"
    INNER JOIN spent_pd sp ON s."SPENT_ID" = sp."SPENT_ID"
    INNER JOIN pd pd ON pd."PD_ID" = sp."PD_ID"
    INNER JOIN (SELECT w2."LRM", MAX(s2."SPENT_DATE")
                FROM spent s2
                    INNER JOIN wdata w2 ON s2."SPENT_ID" = w2."SPENT_ID"
                GROUP BY w2."LRM"
               ) md ON w."LRM" = md."LRM"
WHERE
    s."SPENT_DATE" >= '2018-07-01' AND s."SPENT_DATE" <= '2018-09-30';
Planning time: 2.291 ms
Execution time: 165021.870 ms

It was: 1,223,680.326 ms (a little more than 20 minutes).
It became: 165,021.870 ms (a little more than 2 minutes).
This is already quite good.
However, as the English say, "there is always a but". A result that is too good should automatically arouse suspicion: something is wrong here.
The hypothesis of rewriting the query to get rid of the correlated subquery is correct. But the rewrite needs a small correction for the final result to be right: the derived table must be joined not only on "LRM" but also on the date, otherwise rows with non-maximum dates survive the join.
To sum up, here is the first intermediate result:
The corrected query without the correlated subquery:

SELECT
    p."PARAMETER_ID" AS parameter_id,
    pd."PD_NAME" AS pd_name,
    pd."CUSTOMER_PARTNUMBER" AS customer_partnumber,
    w."LRM" AS LRM,
    w."LOTID" AS lotid,
    w."RTD_VALUE" AS RTD_value,
    w."LOWER_SPEC_LIMIT" AS lower_spec_limit,
    w."UPPER_SPEC_LIMIT" AS upper_spec_limit,
    p."TYPE_CALCUL" AS type_calcul,
    s."SPENT_NAME" AS spent_name,
    s."SPENT_DATE" AS spent_date,
    extract(year from s."SPENT_DATE") AS year,
    extract(month from s."SPENT_DATE") AS month,
    s."REPORT_NAME" AS report_name,
    p."STPM_NAME" AS stpm_name,
    p."CUSTOMERPARAM_NAME" AS customerparam_name
FROM wdata w
    INNER JOIN spent s ON s."SPENT_ID" = w."SPENT_ID"
    INNER JOIN pmtr p ON p."PARAMETER_ID" = w."PARAMETER_ID"
    INNER JOIN spent_pd sp ON s."SPENT_ID" = sp."SPENT_ID"
    INNER JOIN pd pd ON pd."PD_ID" = sp."PD_ID"
    INNER JOIN (SELECT w2."LRM", MAX(s2."SPENT_DATE") AS "SPENT_DATE"
                FROM spent s2
                    INNER JOIN wdata w2 ON s2."SPENT_ID" = w2."SPENT_ID"
                GROUP BY w2."LRM"
               ) md ON md."SPENT_DATE" = s."SPENT_DATE" AND md."LRM" = w."LRM"
WHERE
    s."SPENT_DATE" >= '2018-07-01' AND s."SPENT_DATE" <= '2018-09-30';
Planning time: 3.192 ms
Execution time: 208014.134 ms
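As an aside, the "latest SPENT_DATE per LRM" pattern in the derived table is often written with a window function instead of GROUP BY. This is a generic alternative sketch of the derived-table part only, not something used in this story:

SELECT DISTINCT "LRM", "SPENT_DATE"
FROM (SELECT w2."LRM", s2."SPENT_DATE",
             MAX(s2."SPENT_DATE") OVER (PARTITION BY w2."LRM") AS max_date
      FROM spent s2
          INNER JOIN wdata w2 ON s2."SPENT_ID" = w2."SPENT_ID") t
WHERE "SPENT_DATE" = max_date;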
So, what we have in the end is the first acceptable result, one we are not ashamed to show the customer.

Started with: 8,222,351.640 ms (more than 2 hours).
Achieved: 1,223,680.326 ms (a little more than 20 minutes).
Total (intermediate): 208,014.134 ms (a little more than 3 minutes).
An excellent result.

Summary

We could have stopped at that point.
BUT...
Appetite comes with eating. The road is mastered by the one who walks it. Any result is only intermediate. Stop, and you die. And so on.
So, let's continue the optimization.
A great idea, especially since the customer was not at all against it, and was even strongly for it.
So, it is time to change the database design. The structure of the query itself can hardly be optimized any further (although, as it turned out later, there is an option that makes everything really fly). But taking up the optimization of the database design is a very promising, and most importantly interesting, idea. And it brings back youthful memories: after all, I did not become a DBA right away; I grew out of programming (BASIC, assembler, C, C++, Oracle, PL/SQL). An interesting topic, of course, worthy of separate memoirs ;-).
However, let's not get distracted. So:

And maybe partitioning will help us?

Spoiler: "Yes, it helped, including with performance optimization."
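Looking ahead a little: since PostgreSQL 10, range partitioning is declarative. Purely as an illustration (the table layout and column types here are assumed, not taken from the real schema), it looks something like this:

-- Parent table partitioned by date; columns and types are illustrative.
CREATE TABLE spent_part (
    "SPENT_ID"    bigint,
    "SPENT_NAME"  text,
    "SPENT_DATE"  date,
    "REPORT_NAME" text
) PARTITION BY RANGE ("SPENT_DATE");

-- One partition per quarter, e.g. for the range used in the query above:
CREATE TABLE spent_2018_q3 PARTITION OF spent_part
    FOR VALUES FROM ('2018-07-01') TO ('2018-10-01');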
But that is another story... To be continued...