Failover Evolution in PostgreSQL

We are actively preparing for PG Day'17 , we are expanding the theme of the conference, so soon you will find a large number of interesting posts not only about PostgreSQL, but also about other widely used databases. Today we want to bring to your attention a translation of the article Gulcin Yildirim, which served as the basis for its report on PG Conf Europe'16.

“No matter how paradoxical it may sound, but the truth is that the more we learn, the more ignorant we become in the absolute sense, for it is only through enlightenment that we realize our limitations. More specifically, one of the most gratifying results of intellectual evolution is the constant discovery of new and broader perspectives. ” Nikola Tesla

')
PostgreSQL is a terrific project that is evolving with amazing speed. In this series of posts, we will focus on the evolution of fault tolerance in PostgreSQL throughout its versions.

PostgreSQL in a nutshell

PostgreSQL is fault tolerant by nature. First, it is an advanced open source database management system, and this year will celebrate its 20th anniversary [app. Lane: anniversary was held in 2016, a solemn greeting from the Russian community was held at the final evening of PG Day'16 Russia]. Therefore, it is a proven technology with an active community, thanks to which it is actively developing.

PostgreSQL is SQL compatible (SQL: 2011) and fully meets the requirements of ACID (atomicity, consistency, isolation, reliability).

Note : A (tomicity) C (onsistency) I (solation) D (urability) in PostgreSQL

Atomicity ensures that the results of the transaction will be visible completely or not at all visible within other transactions, but the transaction must not be atomic with respect to itself. PostgreSQL is consistent , and system-defined constraints dictated by the principle of consistency apply to the results of transactions. That is, after the completion of the transaction data must remain in a consistent state. Transactions performed simultaneously do not affect each other, which speaks of isolation (we will talk about transaction isolation levels a bit later in this post). When a transaction is completed, its results will not be lost, regardless of subsequent failures, and this makes PostgreSQL reliable .

PostgreSQL allows you to create physical and logical replicas and has built-in physical and logical solutions for this. We will talk about replication methods in PostgreSQL in terms of fault tolerance in the following posts.

PostgreSQL allows you to perform synchronous and asynchronous transactions, PITR (Point-in-time Recovery - point-in-time recovery) and MVCC (Multiversion concurrency control - Multi-Version Control). All of these concepts relate to a certain degree to fault tolerance, and I will try to describe their impact in the process of explaining the basic concepts and their applications in PostgreSQL.

PostgreSQL is reliable!

All actions in the database are performed within transactions that are protected by a transaction log that will perform automatic disaster recovery in the event of software failure.

Optionally, you can create databases with data blocks with checksums ( data block checksums ) to diagnose hardware problems. There are many backup mechanisms with full and detailed PITR in case detailed restoration is required. Various diagnostic tools are also available.

Database replication is initially supported. Synchronous replication , if properly configured and managed, provides a higher degree of availability and data protection than “5 nines” (99.999%) .

Taking into account all the above facts, it is easy to say that PostgreSQL is reliable!

PostgreSQL fault tolerance: WAL

WAL - write ahead logging - is the main fault tolerance system for PostgreSQL.

WAL consists of a series of binary files recorded in the pg_xlog subdirectory of the PostgreSQL data directory. Every change made to the database is first recorded in the WAL, hence the name “proactive” log, consonant with the “transaction log”. When a transaction is completed, the default behavior — and the most secure — will force WAL writes to disk.

In the event of a failure, PostgreSQL WAL will be replayed, which will return the database at the time of the completion of the last transaction and, thus, ensure the safety of any changes to the database.

Transaction? Commit?

Changes in the database themselves are not written to disk at the time of completion of the transaction. They are recorded after some time by the background processes of the writer & checkpointer on a well-tuned server. (See the description of WAL above)

Transactions are the fundamental concept of all database systems. A distinctive feature of a transaction is that it links several steps into a single all-or-nothing operation.

Note : Transactions in PostgreSQL

PostgreSQL actually treats each SQL query as being executed within a transaction. If you do not write the BEGIN command, then each individual request will have implicit BEGIN commands and (if successful) COMMIT at the beginning and at the end, respectively. A group of requests wrapped in BEGIN and COMMIT is sometimes called a transaction block.

Intermediate states between steps are not visible to other transactions running in parallel, so if any failure occurs that interferes with the transaction, no step will affect the database as a whole. PostgreSQL does not support dirty reads ( dirty-reads — a transaction reads data written by a parallel pending transaction ).

Note: Transaction Isolation

The SQL standard defines 4 levels of transaction isolation: Read incomplete, Read complete, Reread, Serialize.

Table 1: Standard SQL transaction isolation levels
Isolation level "Dirty" reading Unique read Phantom read Serialization anomaly
Reading incomplete Allowed, but not in PG maybe maybe maybe
Read complete Is impossible maybe maybe maybe
Re-read Is impossible Is impossible Allowed, but not in PG maybe
Serialization Is impossible Is impossible Is impossible Is impossible

The most stringent is the Serialization level, which is defined by the standard in the paragraph that any simultaneous execution of serializable transactions is guaranteed to have the same effect as their sequential execution in the same order.

For more information on this topic, see the Postgrese transaction isolation documentation.

Isolation level	"Dirty" reading	Unique read	Phantom read	Serialization anomaly
Reading incomplete	Allowed, but not in PG	maybe	maybe	maybe
Read complete	Is impossible	maybe	maybe	maybe
Re-read	Is impossible	Is impossible	Allowed, but not in PG	maybe
Serialization	Is impossible	Is impossible	Is impossible	Is impossible

Check Point

Disaster recovery reproduces WAL, but from what point does recovery begin?

Recovery begins with points in the WAL, known as checkpoints . The duration of disaster recovery depends on the number of changes in the transaction log since the last checkpoint. The checkpoint is a known, safe start point for recovery, since it ensures that previous database changes have already been written to disk.

Checkpoint can be immediate or scheduled. Immediate control points appear due to any actions of a super user (superuser), for example, with the command CHECKPOINT or others. Scheduled test points are set automatically by Postgres.

Conclusion

In this post, we have listed the important PostgreSQL functions related to its fault tolerance. Proactive logging, transaction, isolation levels, test points, and disaster recovery were mentioned. In the next post, we’ll continue the theme with a story about replication in PostgreSQL.

Source: https://habr.com/ru/post/317132/

All Articles