Another new feature pg_filedump: restore the PostgreSQL directory

In the last article, we learned how using the pg_filedump utility you can recover data, or at least some part of it, from a completely killed PostgreSQL database. It was assumed that we know the numbers of the segments corresponding to the table from somewhere. If we know part of the contents of the table, its segments are really not difficult to find, for example, with a simple grep. However, in a more general case, this is not so easy to do. In addition, it was assumed that we know the exact scheme of the tables, which is also far from a fact. So, recently, my colleagues and I have made a new patch for pg_filedump, which allows us to solve these problems.

So let's say we want to restore the table named test. If we do not remember the name of the table, it is not scary, so using the method described below, you can get the names of all the tables in the database. Information about the tables is stored in the pg_class catalog table, the segment of which is always numbered 1259.

Using the latest version of pg_filedump, we can read pg_class as follows:

./pg_filedump -D name,oid,oid,oid,oid,oid,oid,~ /path/to/base/16384/1259 | grep COPY | grep test

Notice the list of types for decoding that we pass to pg_filedump:

 name,oid,oid,oid,oid,oid,oid,~

Here we first give the type names of the first seven columns of the table (the pg_class scheme is known and described in the documentation ), and the tilde says to ignore the other columns. In this case, they are not interesting to us anyway, there is no need to list them all.

Example output:

 COPY: test 2200 16387 0 10 0 16385 COPY: test 2200 16387 0 10 0 16385 COPY: test_pkey 2200 0 0 10 403 16391

The last column is relfilenode, that is, the segment number. He is what we need! Remember, 16385.

But wait, we don’t know the schema of the table. The pg_attribute catalog table will help us to know it, the relfilenode of which is hardcocked and equal to 1249. By the way, you can look at the relfilenode of all catalog tables in the pg_class.h file .

Open the dock by pg_attribute , decode:

 ./pg_filedump -D oid,name,oid,int,smallint,~ /path/to/base/16384/1249 | grep COPY | grep 16385

Example output:

 COPY: 16385 k 23 -1 4 COPY: 16385 v 25 -1 -1 COPY: 16385 ctid 27 0 6 COPY: 16385 xmin 28 0 4 COPY: 16385 cmin 29 0 4 COPY: 16385 xmax 28 0 4 COPY: 16385 cmax 29 0 4 COPY: 16385 tableoid 26 0 4

As you can see, the table has two columns with the names k and v (the remaining columns are systemic, they are needed for the MVCC to work, and that’s all). Here, 23 and 25 are atttypid, that is, column types. But how to understand what these types are?

The answer is contained in the pg_type catalog table (relfilenode = 1247, dock ):

 ./pg_filedump -i -D name,~ /path/to/base/16384/1247 | grep -A5 -E 'OID: (23|25)'

Example output:

  XMIN: 1 XMAX: 0 CID|XVAC: 0 OID: 23 Block Id: 0 linp Index: 8 Attributes: 30 Size: 32 infomask: 0x0909 (HASNULL|HASOID|XMIN_COMMITTED|XMAX_INVALID) t_bits: [0]: 0xff [1]: 0xff [2]: 0xff [3]: 0x07 COPY: int4 -- XMIN: 1 XMAX: 0 CID|XVAC: 0 OID: 25 Block Id: 0 linp Index: 10 Attributes: 30 Size: 32 infomask: 0x0909 (HASNULL|HASOID|XMIN_COMMITTED|XMAX_INVALID) t_bits: [0]: 0xff [1]: 0xff [2]: 0xff [3]: 0x07 COPY: text

So, now we have all the necessary information on hand. The table is called test, it has relfilenode 16385 and contains two columns - k with int4 type and v with text type. Now we can dump its contents as described in the previous article .

I hope that in practice you will never need this knowledge :) If you have questions or additions, I will be glad to read them in the comments!

Source: https://habr.com/ru/post/323644/

All Articles

Another new feature pg_filedump: restore the PostgreSQL directory

More articles: