How to recover a deleted config, or Never give up!

Sysadmins are divided into those who do not make backups yet and those who already do =)

Plenty of articles have been written about restoring files from ext3 / ufs, so I won't repeat them. Instead I'll describe some lesser-known ways to restore configs on a production server.

How did this happen?

A call in the evening from an old friend who now works at a web studio. On the other end: complete panic and confusion.
- !"№;%: Aaaa! Everything is down, nothing works. I'm done for, save me!
After fifteen minutes of calming him down and finding out what had actually happened, the following became clear:

Their studio not only builds websites but also hosts them.
The nginx config is generated by a script that pulls locations and rewrites from a service MySQL database.
The database lives on good servers with RAID-1 and master-slave replication.
No backups are made, since “the chance that both disks die on both servers is zero” (c) the studio's sysadmin.

About backups

True enough: four disks dying at the same time is possible but statistically unlikely (© “Charlie” Eppes, Numb3rs). For some reason, though, people do not stop to think that rm -rf /* run on RAID-1 wipes the data on both disks, and they forget that DROP TABLE is replicated from one server to the other. Few also suspect that one day the office could burn down in a fire, drown in a flood, collapse in an earthquake, or be carted off by the OBEP (the economic crimes police). Very few people make off-site backups at all... And that's a pity: at least once a month you can dump everything into a password-protected .rar on a flash drive and take it home, even by hand, without much fuss.

Neither ZFS snapshots, nor RAID, nor replication is a replacement for backups. All of them reduce the chance of losing data, and it is very good to have them, but off-site backups must always exist!

Closer to the point

According to Murphy's Law, whatever can go wrong will go wrong. So on that ill-fated evening, an error in an UPDATE SQL query filled the service table that the nginx config is generated from with '', and an error in the script then overwrote nginx.conf with an empty file. Fortunately, nginx is a smart thing and checks a config for correctness before reloading it, so it refused to use the new one.
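
The same check nginx performs on reload can be run by the generator script before it touches the live file. Below is a minimal sketch of that idea, not the studio's actual script: the function name install_config and the CONFIG_CHECK variable are hypothetical, and the only nginx feature assumed is its test mode (nginx -t -c FILE).

```shell
#!/bin/sh
# Hypothetical helper: install a freshly generated config only if it is
# non-empty and passes a syntax check. All names here are illustrative.
install_config() {
    candidate=$1   # file written by the generator script (use an absolute path)
    target=$2      # live config, for example /usr/local/etc/nginx/nginx.conf
    # Command used to validate the candidate; defaults to nginx's test mode.
    check=${CONFIG_CHECK:-nginx -t -c}

    # An empty candidate is exactly the failure described above: refuse it.
    if [ ! -s "$candidate" ]; then
        echo "refusing to install empty config: $candidate" >&2
        return 1
    fi

    # Let nginx parse the candidate without touching the running server.
    if ! $check "$candidate" >/dev/null 2>&1; then
        echo "config check failed: $candidate" >&2
        return 1
    fi

    # Keep the previous version so a bad generation can be rolled back.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    mv "$candidate" "$target"
}
```

A generator would write to a temporary file, call install_config with that file and the live path, and reload nginx only if the call succeeded.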

How to recover the overwritten config?

My old friend gave me access to the frontend running nginx.
Everything there is ordinary: a FreeBSD machine, gmirror across two disks, and nginx, nothing more.
First, I stopped gmirror so that my changes would not overwrite the files on the second disk. Then I started thinking about how to restore the clobbered file from disk, but after looking at the server's uptime and remembering my friend saying that the configuration changes quite rarely, I decided to try another method.

First, I looked at how much swap we have.

# swapinfo
Device 1K-blocks Used Avail Capacity
/dev/ad4s1b 2063152 94612 1968540 5%

The fact that it is currently 5% used does not mean that only 5% of it holds data; most likely there is much more in there =)

Save its current contents:
# cat /dev/ad4s1b > /usr/SWAP

Now, knowing some line from the config, we can grep the dump for it. Since most people tune both FreeBSD and nginx "by Sysoev", the config most likely contains the line "reset_timedout_connection on", so let's test my luck and grep for it:

# cat /usr/SWAP | grep -a -A10 reset_timedout_connection
ǈ ǈ ǈ ǈ ǈ ǈ$ ǈ0 ǈ8 ǈ< ǈX ǈ\ ǈd ǈp ǈ ǈ ǈ ǈ ǈ ǈ ǈ ǈ ǈ ǈ ǈ8 ǈP ǈp ǈ ǈ ǈ ǈ ǈ ǈX ǈ
ǈ ǈ m [Ȉh ǈxȈҰǈ@ . ` ` 0u 0u2 d d ǈ Ȉ<4 @TȈ Ȉ
--
reset_timedout_connection on;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
send_lowat 12000;

keepalive_timeout 65;

gzip on;
gzip_min_length 2048;
gzip_types text/css text/js text/xml;
^C

And there, voilà, is a piece of the config. All that remains is to play with the -A and -B values to extract the config in full, and to pick the newest / least damaged of the copies (there may be several of them in the swap).

# cat /usr/SWAP | grep -a -A400 -B12 "reset_timedout_connection on;"
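
Since -A and -B count lines, and a "line" of binary data can be of any length, it can be handy to cut a fixed number of bytes around every hit and save each copy to its own file. A rough sketch of that variation follows. The function name carve_matches and the candidate.N file names are made up for illustration, and it assumes grep -a -b -o prints the byte offset of each match (as GNU grep does) and that head -c is available.

```shell
#!/bin/sh
# Hypothetical helper, not from the original article.
# Usage: carve_matches IMAGE MARKER BEFORE AFTER OUTDIR
# Saves BEFORE bytes ahead of every MARKER hit plus AFTER bytes from the
# start of the hit into OUTDIR/candidate.1, OUTDIR/candidate.2, ...
carve_matches() {
    image=$1 marker=$2 before=$3 after=$4 outdir=$5
    mkdir -p "$outdir" || return 1
    n=0
    # -a: treat binary as text, -b: print byte offset, -o: only the match,
    # -F: fixed string. Output looks like "30:reset_timedout_connection".
    LC_ALL=C grep -a -b -o -F -- "$marker" "$image" | cut -d: -f1 |
    while read -r offset; do
        start=$((offset - before))
        if [ "$start" -lt 0 ]; then start=0; fi
        length=$((offset - start + after))
        n=$((n + 1))
        # tail -c +N is 1-based, hence start + 1.
        tail -c "+$((start + 1))" "$image" | head -c "$length" \
            > "$outdir/candidate.$n"
    done
}
```

For the dump above this would be called as something like carve_matches /usr/SWAP "reset_timedout_connection on;" 2000 20000 /usr/candidates, with the byte counts tuned by eye in the same way as -A and -B.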

The whole config is in our hands. It appears to be intact and up to date. Now it can be restored into the MySQL table.
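
When several copies turn up, a quick mechanical filter can weed out the obviously damaged ones before reading them by eye. The sketch below is only a heuristic under stated assumptions: the name looks_intact is hypothetical, it treats any byte outside printable ASCII, tab, CR and LF as damage, and it counts braces naively (braces inside comments or quoted strings are counted too). The authoritative check remains nginx -t on the recovered file.

```shell
#!/bin/sh
# Hypothetical heuristic, not from the original article.
# Returns 0 if FILE looks like an undamaged, complete config fragment.
looks_intact() {
    file=$1
    # An empty file is the very thing we are recovering from.
    [ -s "$file" ] || return 1

    # Count bytes that are not tab, LF, CR or printable ASCII.
    junk=$(LC_ALL=C tr -d '\11\12\15\40-\176' < "$file" | wc -c | tr -d ' ')
    [ "$junk" -eq 0 ] || return 1

    # Every "{" must be closed, and no "}" may appear before its "{".
    LC_ALL=C awk '
        {
            for (i = 1; i <= length($0); i++) {
                c = substr($0, i, 1)
                if (c == "{") depth++
                else if (c == "}") { depth--; if (depth < 0) broken = 1 }
            }
        }
        END { if (broken || depth != 0) exit 1 }
    ' "$file"
}
```

A copy that was cut off in the middle of a server block, or that has swap garbage spliced into it, fails the check; a copy that passes still needs a human look.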

This method is not a panacea or a silver bullet; the fact that it worked in my case is the exception rather than the rule. But maybe it will help some of you recover important data.

If there is no swap and the file cannot be restored from disk

There is also a second, less preferable way to recover the information, provided the nginx process is still running on the server.

To begin with, we look for the nginx master process:

# ps -auxww | grep nginx
root 1197 0,0 0,1 13216 2488 ?? Is 18 0:00,02 nginx: master process /usr/local/sbin/nginx
www 29484 0,0 2,3 57248 47576 ?? I 7:58 0:00,06 nginx: worker process (nginx)

Next we take a core dump of it:
# gcore 1197
And then we pick through it however we like, either like this:
# cat core.1197 | strings | grep -B10 -A10 reset_timedout_connection
or like this:
# cat core.1197 | grep -a -B10 -A10 reset_timedout_connection
... and are horrified at how hard it is to assemble the config piece by piece.
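
The difference between the two variants is what grep sees as a line: with strings, runs of binary junk are dropped first, so the -A / -B context consists of text only. Where strings is not installed, tr can stand in for it. This is a sketch under that assumption, with a hypothetical function name search_dump. Note that it also squeezes empty lines out of the output.

```shell
#!/bin/sh
# Hypothetical helper, not from the original article.
# Usage: search_dump DUMP MARKER [CONTEXT]
# Prints MARKER hits from a binary dump with CONTEXT text lines around them.
search_dump() {
    dump=$1 marker=$2 context=${3:-10}
    # Replace every byte that is not a tab or printable ASCII with a newline
    # and squeeze runs of newlines (-s), which leaves only the text fragments.
    LC_ALL=C tr -cs '\11\40-\176' '\n' < "$dump" |
        grep -F -B "$context" -A "$context" -- "$marker"
}
```

For example, search_dump core.1197 reset_timedout_connection 10 gives roughly the same view as the strings variant above.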

Conclusion

People, don't be your own worst enemies: make frequent, well-protected, automatic backups of your data. And remember that even from the deepest ass there are at least two ways out %)

Instead of an epilogue

The MySQL database was eventually restored as well. The admin, without realizing it, had enabled --log-bin from the very beginning of the database's life (by the way, by the time I started restoring the database, the binlogs already occupied 89% of /var, and in a couple of months MySQL would have stopped working). Since nobody had ever deleted them, Point-in-Time Recovery was possible.
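
For reference, a point-in-time recovery from binary logs usually boils down to replaying them through mysqlbinlog up to the moment just before the bad statement. The sketch below only prints such a pipeline for review instead of running it. The function name pitr_command is hypothetical, the timestamp and binlog paths in the usage note are placeholders rather than values from this incident, and the mysql credentials are an assumption.

```shell
#!/bin/sh
# Hypothetical dry-run helper, not from the original article.
# Usage: pitr_command "YYYY-MM-DD HH:MM:SS" BINLOG [BINLOG...]
# Prints a replay pipeline that stops before the given moment.
pitr_command() {
    stop=$1
    shift
    # --stop-datetime makes mysqlbinlog stop reading at the first event
    # whose timestamp is equal to or later than the given moment.
    printf 'mysqlbinlog --stop-datetime="%s"' "$stop"
    # Binlogs must be listed in the order they were written.
    for binlog in "$@"; do
        printf ' %s' "$binlog"
    done
    printf ' | mysql -u root -p\n'
}
```

Calling pitr_command "2009-08-28 19:59:59" /var/db/mysql/mysql-bin.000001 /var/db/mysql/mysql-bin.000002 prints the command to run against an empty server, assuming the logs really do go back to the creation of the database.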

P.S. It would be nice if nginx could, on request, print its current config, or a diff between the running config and what is in the file on disk =)

Source: https://habr.com/ru/post/68185/
