Sorry, I broke your recovery.conf

In PostgreSQL, starting from very old times, version 8.0, which was released back in 2005, used a special configuration file, recovery.conf , to restore to a specific point in time. The same file was later used for standby mode and streaming replication.

However, starting from the next PostgreSQL 12 release, more recovery.conf will not work: I broke it.
But why?

The recovery.conf had one feature: it was read only at the start of the DBMS. And if for point-in-time recovery which is needed less often than once a year, you can still accept this, then the need to restart the entire database to change the address of the replication server upstream is somewhat depressing. I saw different ways of perversions to circumvent this restriction, such as using L3 routing, schemes with cascading replications (in order to restart not all replicas, but at least only a part of them), and even (at least in production did not meet) walbouncer .

After the next scheduled restart of the replicas, it was only because of the need to change the replication parameters, I decided to pick it up, and what would it cost to teach PostgreSQL to re-read primary_conninfo by SIGHUP? ~~It turned out all bad.~~ In principle, it is necessary to change only one variable in the startup process and from there request the restart of WalReceiver - and that's it, replication will continue with the new address correctly. It remains to implement it correctly. A few weeks later, I finished the patch with the implementation of re-reading recovery.conf on the SIGHUP signal, while the patch did not break the existing behavior of the database.
')
Then, plucking up courage, sent him to the PostgreSQL developer mailing list. To which Michael Paquier answered rather quickly:

It doesn’t need to be a question.

Oops, it turned out that I asked the search engine the wrong question. It was necessary to ask not about re-reading recovery.conf, but about turning the parameters from a separate recovery.conf to the GUC (grand unified configuration) infrastructure used for all other DBMS parameters. That is definitely not, we don’t need such a patch, we don’t want that. Let's first transfer all these settings from recovery.conf to our standard configuration infrastructure.

On this humorless news, I grieved and thought: “But let's transfer!”. I read the archived discussions on the correct search query, opened the last found discussion of the settings transfer (the link was kindly provided by Michael Paquier in his answer, for which he was thankful for him, as well as for the quick reply). At that time in May 2018, the patch has been abandoned for more than a year. So, from here we start. Or is it more correct to say “continue”? According to the entertainment plan:

read and compile a list of edits to the latest published version of the patch
parse changes in the patch and transfer necessary to the current version of the code base
fix all references to recovery.conf and its parameters in the documentation
fix tests
send a new patch to the mailing list
get any feedback
fix something according to the wishes and return to point 5
receive again the refusal to accept the patch (after a year and a half)

Looks like an action plan? Well, and move on it!

Long, shortly, I got to point number five and on June 21, 2018 published a new version of the patch, in a new thread of discussions . Then 3 months in the oppressive silence of the chilling silence of the Baskervilles fish. At the end of September, Peter Eisentraut, one of the Core developers with the right to commit, was unexpectedly interested in the patch. After several iterations of edits, while I quietly and peacefully left for a few days for a walk-see Budapest and look at the sights, suddenly comes a discouraging letter from Peter Eisentraut:

Bunch of small refinements. I have also updated the documentation more extensively. The attached patch is committable to me.

That is, Peter Eisentraut corrected some more trivialities at his own discretion, updated the documentation, burned the entire recovery-config.sgml section, and believes that in this form the patch can already be accepted. Oh, and I thought it would happen only for postgresql 13, even if you are so lucky that the patch will even reach the state of readiness for commit.

And a few days later, namely on November 25 of this year 2018, Peter Eisentraut really accepts this patch :

recovery.conf settings are now set in postgresql.conf (or other GUC sources). Currently, all the affected settings are PGC_POSTMASTER; this could be refined in the future case.
Recovery is now initiated by a file recovery.signal. Standby mode is initiated by a file standby.signal. The standby_mode setting is gone. If a recovery.conf file is found, an error is issued.
The trigger_file setting has been renamed to promote_trigger_file as part of the move.
The documentation chapter "Recovery Configuration" has been integrated into "Server Configuration".
pg_basebackup -R now appends settings to postgresql.auto.conf and creates a standby.signal file.
Author: Fujii Masao <masao (dot) fujii (at) gmail (dot) com>
Author: Simon Riggs <simon (at) 2ndquadrant (dot) com>
Author: Abhijit Menon-Sen <ams (at) 2ndquadrant (dot) com>
Author: Sergei Kornilov <sk (at) zsrv (dot) org>

Two weeks have passed and this commit has not been rolled back. Amazing. And it seems they are not even going. Amazing. It is not known whether the community decides to change the behavior in any direction again, especially before the feature freeze release postgresql 12 in April, there is still some time. But it seems that we will not have more recovery.conf anyway.

So what has changed

First and foremost, the DBMS will refuse to start if it finds the recovery.conf file - this was done on purpose so that the user using one of the many old instructions did not wonder why the database ignores the configuration in this file.

The old standby_mode setting is missing. Now it, as well as the fact of the presence of recovery.conf to enable recovery mode, replace two files (of any content, usually empty):

$ PGDATA / recovery.signal is the ideological successor of standby_mode = off, recovery from the archive will be performed to the point specified in the configs;
$ PGDATA / standby.signal - respectively, standby_mode = on. This file we will see on all replicas

If the startup process of the database found both files, then we believe that we are in standby mode.

If for clarity, to reduce the changes in the parameters in the plate:

old recovery.conf	PostgreSQL 12+ postgresql.conf
primary_conninfo	primary_conninfo
primary_slot_name	primary_slot_name
trigger_file	promote_trigger_file
recovery_min_apply_delay	recovery_min_apply_delay
recovery_target	recovery_target
recovery_target_name	recovery_target_name
recovery_target_time	recovery_target_time
recovery_target_xid	recovery_target_xid
recovery_target_lsn	recovery_target_lsn
recovery_target_inclusive	recovery_target_inclusive
recovery_target_timeline	recovery_target_timeline
recovery_target_action	recovery_target_action
restore_command	restore_command
archive_cleanup_command	archive_cleanup_command
recovery_end_command	recovery_end_command

It can be noted that changed a little less than nothing. At the moment (with the sole exception of the prefix of promote_ for promote_trigger_file), all new parameters are named just like the old ones and take the same possible values. Although in reality there is still a change regarding the settings of the target for recovery. I don’t know if anyone used this before, but it was a documented behavior and you could specify multiple recovery_target, recovery_target_lsn, recovery_target_name, recovery_target_time or recovery_target_xid at the same time. For example,

recovery_target_lsn = '1/1D9FEA00' recovery_target_xid = '5238954'

Actually used to restore the last line from recovery.conf. More so it is impossible, the goal for recovery should be indicated at most one. However, due to the GUC infrastructure logic, you can still specify the same parameter several times:

 recovery_target_lsn = '1/1D9FEA00' recovery_target_lsn = '1/16AC7D0'

This is valid, we will recover to the setting value specified last.

And, in general, this is all that needs to be said about the changes visible from outside PostgreSQL. pg_basebackup -R (--write-recovery-conf) remained in its place and does what it was intended for, only now it will add the parameters to postgresql.auto.conf and create a file standby.signal instead of recovery.conf

Unfortunately, speaking of the fact that these are all the changes that are currently visible, this is what I mean. All new parameters are still set as PGC_POSTMASTER, that is, they can only be changed when starting PostgreSQL. As already mentioned, this was a requirement from the developer community: first transfer all the settings, and only then see if they can be changed on the running database. So now PostgreSQL is in a wonderful development stage, when the old behavior has already been broken, and the changes for the better have not yet been delivered.

I have already published a patch that will allow me to change primary_conninfo and primary_slot_name on the fly. Let's see what happens.

Sorry, I broke your recovery.conf

Source: https://habr.com/ru/post/432918/

All Articles

Sorry, I broke your recovery.conf

So what has changed

More articles: