📜 ⬆️ ⬇️

Reinitialize the PostgreSQL database cluster

The subject of the problem



When working with a database in PostgreSQL, you should not forget in which locale the database cluster was initialized - this is the name of the directory (usually /var/lib/pgsql/data ) in which the databases of all PostgreSQL installations are stored.


')

Problem



Today I ran into such a problem. In the sample query, when using the lower() function, lower() the Cyrillic text did not occur, while the English values ​​readily "decreased".

The first attempt to solve the problem



A google search on the Internet half-day gave information that it would be nice if the database we were looking for were encoded in UTF-8 (in my case, by mistake, it was in default SQL_ASCII ).

OK. No sooner said than done! Relatively quickly, instructions were found on how to re-create the database in a new encoding without losing data.

  [bash]
 # su - postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydatabase
 ~ pg_dump mydatabase -Ft -v -U postgres -f /tmp/mydatabase.tar
 ~ dropdb mydatabase --username postgres
 ~ createdb --encoding UNICODE mydatabase --username postgres
 ~ pg_restore /tmp/mydatabase.tar |  psql --dbname mydatabase --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydatabase


The audit showed that the data of this database did not deteriorate, but the required actions (lowering to lower case with the lower() function still did not occur.

Problem solving becomes interesting



Along with continued Google reading, it was understood that the database cluster of this PostgreSQL server was initialized in the “C” locale, and for lower() and upper() this means a lot!

I had to figure out how to re-initialize the cluster of databases without destroying the already existing databases and data in them. At the same time, this server is production - some data dumps merge into it per cron ( crontab ) once an hour. The good thing is that it is not so much a production that there would be no free “window” for re-initialization.

"Window" in 60 minutes and half an hour to prepare



Until the end of the working day, there was 1.5 hours left and one free “window” with a duration of 60 minutes.

I decided to start with a warm-up on a local laptop. It is worth mentioning the difference in operating systems: laptop - Ubuntu 8.10, server - CentOS 5. Having prepared three terminal windows and another text editor window, I started the preparatory work.

First, the first method needed to be split in two — dump existing data and restore it after the cluster was re-initialized.

The dump of bases took place without strong complaints, only a couple of times pg_drop swore at the connected users to the same databases (it was decided to close pgAdmin ).

Then the command was found ( in the case of Ubunt ) hidden in a rather unusual place ( /usr/lib/postgresql/8.3/bin ) command ( initdb ) and executed with the necessary parameters.

  [bash]
 # su - postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydatabase
 ~ pg_dump mydatabase -Ft -v -U postgres -f /tmp/mydatabase.tar
 ~ dropdb mydatabase --username postgres
 ~ initdb --locale = en_RU.utf8 data /



Damn! Error ...

Hmm, it turned out that you need to manually delete the contents of the database cluster directory.

A warning! Do not immediately rm -rf data/* . I realized this after I did it on the laptop, and after the recovery, I lost the user access rights to the server (which are stored in pg_hba.conf ).


You need to make a copy of the pg_hba.conf somewhere for the duration of the change.

  [bash]
 ~ cp data / pg_hba.conf /home/cr0t/pg_hba.2009.03.24_1654.conf


After deleting the contents of the directory and stopping the PostgreSQL daemon without error, the cluster was re-initialized.

  [bash]
 ~ exit
 # /etc/init.d/postgresql stop
 # su - postgres
 ~ rm -rf data / *
 ~ initdb --locale = en_RU.utf8 data /


It only remained to restart the server and restore from the dumps the old databases to the already new cluster, initialized in the “correct” locale.

  [bash]
 ~ exit
 # /etc/init.d/postgresql start
 # su - postgres
 ~ createdb --encoding UNICODE mydatabase --username postgres
 ~ pg_restore /tmp/mydatabase.tar |  psql --dbname mydatabase --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydatabase



Success. Results



After these successful actions, the lower() function began to correctly “press” Cyrillic characters. Everyone is happy. But I didn’t even think about the user roles of PostgreSQL (this is how users in the 8.x version became known). They are gone. Good thing I needed to create a whole couple of them. But who has a lot of them, be careful, do not repeat my mistakes!

PS Steps for re-initialization if there are multiple databases



In my case, it was necessary to backfire and subsequently restore 3 bases.
To solve this problem, it is necessary to add only additional repetitions of some actions for dumping and subsequent recovery ( with a strong desire, you can even write an automation script;) ).

  [bash]
 # su - postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydb1
 ~ pg_dump mydb1 -Ft -v -U postgres -f /tmp/mydb1.tar
 ~ dropdb mydb1 --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydb2
 ~ pg_dump mydb2 -Ft -v -U postgres -f /tmp/mydb2.tar
 ~ dropdb mydb2 --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydb3
 ~ pg_dump mydb3 -Ft -v -U postgres -f /tmp/mydb3.tar
 ~ dropdb mydb3 --username postgres
 ~ cp data / pg_hba.conf ./
 ~ exit
 # /etc/init.d/postgresql stop
 # su - postgres
 ~ rm -rf data / *
 ~ initdb --locale = en_RU.utf8 data /
 ~ cp pg_hba.conf data /
 ~ exit
 # /etc/init.d/postgresql start
 # su - postgres
 ~ createdb --encoding UNICODE mydb1 --username postgres
 ~ pg_restore /tmp/mydb1.tar |  psql --dbname mydb1 --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydb1
 ~ createdb --encoding UNICODE mydb2 --username postgres
 ~ pg_restore /tmp/mydb2.tar |  psql --dbname mydb2 --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydb2
 ~ createdb --encoding UNICODE mydb3 --username postgres
 ~ pg_restore /tmp/mydb3.tar |  psql --dbname mydb3 --username postgres
 ~ vacuumdb --full --analyze --username postgres --dbname mydb3


Cross-post from my blog Summer code

The post is similar to the recently published Patchim UTF-8 Collation under FreeBSD , but it seems to me that the FreeBSD-specific solution is described there, but I cite for Ubuntu. When I solved my problem I didn’t even use this information - only google-reading.

Thanks for the karma! , transferred to the PostgreSQL blog.

Source: https://habr.com/ru/post/55620/


All Articles