To begin with: all our ads live in a PostgreSQL database, and to this day the lion's share of the business logic is hidden in stored procedures, whose behavior is not always easy to observe.

For us, stored procedures are convenient primarily because they spare us from shipping gigabytes of data between the database and the application: we can perform several actions on different tables inside the database and report back to the application only that everything completed successfully. This is genuinely convenient, but it also brings a number of problems. Part of the business logic is hidden in the database, and the debugging and monitoring tools we rely on for PHP / Go / Python / etc. are not applicable on the DBMS side. PostgreSQL does have excellent tools of its own, for example pg_stat_statements, but sometimes they cannot fully answer the question of exactly which piece of code in a large and complex stored procedure misbehaves. The solution we propose does not claim to be a silver bullet, but it can help you quickly measure the average execution time of code fragments inside a stored procedure that runs thousands of times per second, and do so without creating extra load. Interested? Welcome!
At some point we urgently needed to track the execution time of certain code fragments in a large stored procedure. The procedure was complex, heavily branched and called very often. We wanted a quick and elegant solution without lengthy refactoring and unnecessary work, and we wanted the result to be as visual as possible: ideally, graphs of call frequency over time, the volume of data processed, and the average and percentile execution times, all in real time. Our company uses the Brubeck / Graphite / Grafana stack to display such data, so the first question was how to get our metrics there.
The head-on solution, rewriting the stored procedure in PL/Python, was ruled out immediately: spinning up a Python interpreter just to spit out a string over UDP is clear overkill. The next option was to emit RAISE NOTICE and then tediously grind through the logs; besides looking like a perversion, that also loses all the real-time quality. There was also the option of saving the data into an unlogged table and shipping it to Brubeck with a script, but adding writes to a loaded system, especially when the call is otherwise read-only, is not an option either.
The solution came suddenly and unexpectedly: we remembered the LISTEN / NOTIFY mechanism and realized that this was it! In half an hour we wrote a daemon in Python that connects to PostgreSQL, listens on a channel and forwards everything it receives to Brubeck, while inside the stored procedure we simply write to that channel.
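A minimal sketch of such a daemon, assuming the psycopg2 driver and hypothetical connection parameters (the real avito-tech/pg_metricus_python project differs in detail):

```python
import select
import socket

def is_statsd_line(payload: str) -> bool:
    """Crude sanity check: 'name:value|type' lines, as produced by the stored procedure."""
    name_value, _, mtype = payload.rpartition("|")
    return ":" in name_value and mtype.strip() in ("ms", "c", "g")

def run(dsn: str, channel: str, brubeck_addr: tuple) -> None:
    # Imported here so the sketch can be read without a live database.
    import psycopg2
    conn = psycopg2.connect(dsn)
    # LISTEN must take effect outside any open transaction.
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    cur.execute("LISTEN {};".format(channel))
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        # Block until PostgreSQL wakes us up, then drain all notifications.
        if select.select([conn], [], [], 60) == ([], [], []):
            continue  # timeout, loop again
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            if is_statsd_line(note.payload):
                sock.sendto(note.payload.encode(), brubeck_addr)

# Example invocation (hypothetical DSN and aggregator address):
# run("dbname=mydb user=metrics", "channel_test", ("10.4.5.177", 8224))
```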
And now, actually, an example of use:
x1 = clock_timestamp();

...your long running query...

x2 = clock_timestamp();
GET DIAGNOSTICS c = ROW_COUNT;

perform pg_notify('channel_test', format(E'%s.%s:%s|%s\n',
    'db.sql.metric',
    'long_query.duration',
    extract(millisecond from (x2 - x1))::bigint::text,
    'ms'));
perform pg_notify('channel_test', format(E'%s.%s:%s|%s\n',
    'db.sql.metric',
    'long_query.count',
    c::text,
    'c'));
Save the timestamp before the query, save it after, subtract one from the other, convert to milliseconds and send it to the channel (the payload follows the StatsD line format, name:value|type, which Brubeck understands). Our Python daemon catches the message and forwards it on to Brubeck.
This solution has several advantages:
- The solution was conceived, implemented and deployed to production in two hours.
- No server restarts were needed.
- The only external dependency is a single Python daemon.
- Minimal overhead: at our N thousand transactions per second, we have never noticed it.
This solution also has a number of peculiarities that are important to know about. They all stem from how the PostgreSQL LISTEN / NOTIFY mechanism works:
- Notifications are transactional: if you send something to the channel and the transaction is rolled back, nothing arrives on the channel.
- LISTEN / NOTIFY does not work on standby servers.
- NOTIFY has other subtleties, for example, identical notifications sent within the same transaction may be delivered only once. They are all best looked up in the LISTEN / NOTIFY documentation.
All of these, of course, are small things we can live with perfectly well.
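The first point is easy to see for yourself in two psql sessions (an illustrative transcript, not part of the project):

```sql
-- session 1
LISTEN channel_test;

-- session 2
BEGIN;
SELECT pg_notify('channel_test', 'db.sql.metric.long_query.count:1|c');
ROLLBACK;   -- session 1 receives nothing

BEGIN;
SELECT pg_notify('channel_test', 'db.sql.metric.long_query.count:1|c');
COMMIT;     -- only now does the notification reach session 1
```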
But that is not all!
A week before this article was written, pg_metricus, a PostgreSQL extension in C, was designed and written.
This extension does not use LISTEN / NOTIFY; it sends metrics to the aggregator's socket immediately at the moment of the call. As a result, the data no longer depends on the fate of the transaction: even if the transaction is rolled back, your data still reaches the aggregator.
How to use it? Yes, everything is simple!
Specify the host and port of the aggregator in postgresql.conf:
pg_metricus.host = '10.4.5.177'
pg_metricus.port = 8224
Reload the configuration:
select pg_reload_conf();
Build and install the extension from source:
make install
Install the extension:
create schema metricus;
create extension pg_metricus schema metricus;
And that's it! You can send data!
x1 = clock_timestamp();

...your long running query...

x2 = clock_timestamp();
GET DIAGNOSTICS c = ROW_COUNT;

perform metricus.send_metric(format(E'%s.%s:%s|%s\n',
    'db.sql.metric',
    'long_query.duration',
    extract(millisecond from (x2 - x1))::bigint::text,
    'ms'));
perform metricus.send_metric(format(E'%s.%s:%s|%s\n',
    'db.sql.metric',
    'long_query.count',
    c::text,
    'c'));
In conclusion, this approach allowed us to solve every problem we set out with. We tested it, deployed it to production, and five minutes later we already had beautiful graphs in Grafana and a first picture of what runs inside the logic under study, how often, and for how long.
Thank you for your attention! We look forward to your questions and feedback.
Projects are available at the links below:
GitHub:
→ https://github.com/avito-tech/pg_metricus_python
→ https://github.com/avito-tech/pg_metricus_c
PGXN:
→ https://pgxn.org/dist/pg_metricus