
Example of restoring PostgreSQL tables using the new mega-feature of pg_filedump



Let me tell you about a cool feature that my colleagues from Postgres Pro recently added to the pg_filedump utility. This feature allows you to partially recover data from a database, even one damaged so badly that a PostgreSQL instance can no longer be started with it. Of course, I would like to believe that the need for such functionality arises extremely rarely. But just in case, I would like to have something like this on hand. Read on to see what this feature looks like in action.

Partial data recovery was introduced in commit 52fa0201:
commit 52fa0201f97808d518c64bcb9696f2a350678aa5
Author: Teodor Sigaev <teodor@sigaev.ru>
Date: Tue Jan 17 16:01:12 2017 +0300

Partial data recovery (-D flag).

This feature allows to partially recover data from a given segment file
in format suitable for using in COPY FROM statement. List of supported
data types is currently not full and TOAST is not yet supported, but
it's better than nothing. Hopefully data recovery will be improved in
the future.

Implemented by Aleksander Alekseev, reviewed by Dmitry Ivanov, tested
by Dmitry Ivanov and Grigoriy Smolkin.


Suppose there is some kind of table:

 create table tt (x int, y bool, z text, w timestamp); 

... filled with some data:

 insert into tt values(123, true, 'Text test test', now());
 insert into tt values(456, null, ' ', null);
 checkpoint;

Here I issue a checkpoint so that the data is sure to reach the disk. Otherwise it would get into the WAL, but the buffer manager would keep it in memory until the tuples (the rows of the table) are evicted in favor of newer and/or more frequently used ones, or until a checkpoint fires on checkpoint_timeout or on accumulating max_wal_size of WAL. I think this is the most common scenario for pages being synced to disk (note by Stas Kelvich).
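If you are curious, you can actually watch this happen: the pg_buffercache extension (shipped in contrib) shows which of the table's pages are sitting dirty in the buffer cache. A minimal sketch, assuming the extension is available in your installation:

 -- install the extension (ships in contrib)
 create extension if not exists pg_buffercache;
 -- list buffered pages of table 'tt' and whether they are dirty
 select b.relblocknumber, b.isdirty
   from pg_buffercache b
   join pg_class c on b.relfilenode = c.relfilenode
  where c.relname = 'tt';

Run it before and after the checkpoint: the isdirty flags should flip to false once the pages hit the disk.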

We also find out the name of the segment corresponding to the table:

 select relfilenode from pg_class where relname = 'tt'; 

In my case, the table's relfilenode was 16393. Find the segment (or segments, if the table is larger than 1 GB) on disk:

 find /path/to/db/ -type f | grep 16393 
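By the way, while the instance is still alive, the built-in pg_relation_filepath() function spares you the manual search and returns the path relative to the data directory (the output below is just an illustration and will differ in your setup):

 select pg_relation_filepath('tt');
 -- e.g. base/16384/16393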

Copy it somewhere and imagine that we want to restore the data, having only the segment file.

To do this, build the latest version of pg_filedump:

 git clone git://git.postgresql.org/git/pg_filedump.git
 cd pg_filedump
 make

Usually the database schema is known, since there is an application at hand that works with it. So we know the column types of the table and can decode the data like this:

 ./pg_filedump -D int,bool,text,timestamp /path/to/db/base/16384/16393 

Example output:

 *******************************************************************
 * PostgreSQL File/Block Formatted Dump Utility - Version 9.6.0
 *
 * File: /home/eax/work/postgrespro/postgresql-install/data-master/base/16384/16393
 * Options used: -D int,bool,text,timestamp
 *
 * Dump created on: Tue Jan 17 16:28:07 2017
 *******************************************************************

 Block    0 ********************************************************
 <Header> -----
  Block Offset: 0x00000000         Offsets: Lower      32 (0x0020)
  Block: Size 8192  Version    4            Upper    8080 (0x1f90)
  LSN:  logid      0 recoff 0x0301e4c0      Special  8192 (0x2000)
  Items:    2                      Free Space: 8048
  Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()
  Length (including item array): 32

 <Data> ------
  Item   1 -- Length:   56  Offset: 8136 (0x1fc8)  Flags: NORMAL
 COPY: 123	t	Text test test	2017-01-17 16:25:03.448488
  Item   2 -- Length:   52  Offset: 8080 (0x1f90)  Flags: NORMAL
 COPY: 456	\N	 	\N

 *** End of File Encountered. Last Block Read: 0 ***

There is quite a lot of output here, since pg_filedump prints information about every page in the segment and decodes the header of every tuple. Fortunately, it is quite easy to separate the wheat from the chaff, for example like this:

 pg_filedump -D ........ | grep COPY | perl -lne 's/^COPY: //g; print;' > /tmp/copy.txt
 cat /tmp/copy.txt
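If perl is not at hand, sed can do the same filtering (an equivalent alternative, not something the utility requires):

 pg_filedump -D ........ | sed -n 's/^COPY: //p' > /tmp/copy.txt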

Content of the copy.txt file:

 123	t	Text test test	2017-01-17 16:25:03.448488
 456	\N	 	\N

This is our table data in a format suitable for use in a COPY FROM query. Let's check:

 create table tt2 (x int, y bool, z text, w timestamp);
 copy tt2 from '/tmp/copy.txt';
 select * from tt2;

Result:

   x  | y |       z        |              w
 -----+---+----------------+----------------------------
  123 | t | Text test test | 2017-01-17 16:25:03.448488
  456 |   |                |
 (2 rows)

As you can see, all data was successfully restored.
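A small practical note: COPY ... FROM reads the file on the server side, which requires appropriate rights (superuser in the PostgreSQL versions current at the time of writing). If that is inconvenient, psql's client-side \copy achieves the same result by reading the file from the client machine:

 \copy tt2 from '/tmp/copy.txt'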

Naturally, this was a somewhat simplified example, and in practice things are more complicated. First, the list of supported types is currently somewhat limited:

 static ParseCallbackTableItem callback_table[] = {
     { "smallserial", &decode_smallint },
     { "smallint", &decode_smallint },
     { "int", &decode_int },
     { "serial", &decode_int },
     { "bigint", &decode_bigint },
     { "bigserial", &decode_bigint },
     { "time", &decode_time },
     { "timetz", &decode_timetz },
     { "date", &decode_date },
     { "timestamp", &decode_timestamp },
     { "float4", &decode_float4 },
     { "float8", &decode_float8 },
     { "float", &decode_float8 },
     { "bool", &decode_bool },
     { "uuid", &decode_uuid },
     { "macaddr", &decode_macaddr },

     /* internally all string types are stored the same way */
     { "char", &decode_string },
     { "varchar", &decode_string },
     { "text", &decode_string },
     { "json", &decode_string },
     { "xml", &decode_string },

     { NULL, NULL },
 };

Secondly, TOAST is not supported yet. If a value is stored in-place on the page, either uncompressed or compressed, pg_filedump will successfully restore it (provided the compressed data has not been corrupted). However, if the value was moved out to an external TOAST table, instead of the value you simply get "(TOASTED)". In principle, TOAST support is not an unsolvable task: you only need to teach pg_filedump to parse the system catalog and find the corresponding TOAST table. It is just that no one has done it yet. Perhaps TOAST support will be added in future versions of pg_filedump.
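To get a feel for what such catalog parsing would need to reproduce, here is what the lookup looks like on a live instance (just an illustration using the regular system catalog; pg_filedump would have to extract the same information from the raw catalog files):

 -- find the TOAST table backing 'tt' via pg_class.reltoastrelid
 select t.relname, t.relfilenode
   from pg_class c
   join pg_class t on t.oid = c.reltoastrelid
  where c.relname = 'tt';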

Finally, in practice the database schema sometimes changes: columns appear in and disappear from tables. Dropped columns are not that big a problem, since physically the column remains in the tuple, it is just always null. Added columns are a bit more difficult, because thanks to them tuples within a single table can have a variable number of attributes. If the number of attributes in a tuple does not match the number of types specified by the user, pg_filedump simply prints a warning along with the partially decoded data and proceeds to the next tuple. In practice this means that parsing the output of pg_filedump will be a little more complicated, or that you will have to run it several times with different lists of attributes, as sketched below.
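For example, if a fifth column of type text had been added to tt after some rows were already written, old tuples would carry four attributes and new ones five, so the segment could be decoded in two passes (a hypothetical illustration; the segment path is a placeholder):

 ./pg_filedump -D int,bool,text,timestamp      /path/to/segment | grep '^COPY:'
 ./pg_filedump -D int,bool,text,timestamp,text /path/to/segment | grep '^COPY:'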

In my opinion, and not only mine, as a last-resort means of data recovery it is better to have at least this than nothing at all :) If you have ideas for further improving the functionality presented here, or any other comments and additions, it will be extremely interesting for me to read them in the comments!

Continuation: another new pg_filedump feature, restoring the PostgreSQL system catalog.

Source: https://habr.com/ru/post/319770/
