Greetings, fellow Habr user!

It was evening, there was nothing to do, and then I remembered that I had been meaning to share some recent battle experience with the community.
My task was to automate the backup procedure for a Graylog server and to put together a procedure for restoring it.
The server was new to me; I had never dealt with one before.
Well, I sat down and read up on it, and thought: nothing complicated here. However, Googling showed that this kind of task does not come up every day, because there was practically no information on it.
"We've gotten through worse," I thought; it should all be extremely simple: copy the configuration files and voilà.
Let me make a small digression to describe the Graylog server and its components.
What is a Graylog server?
Graylog2 is an open-source system for collecting and analyzing log statistics that lets you process data quite flexibly. It uses syslog as its transport: data is sent from the nodes via syslog and aggregated by the Graylog server.
MongoDB is used as the database for storing content and settings.
And the bulkiest part of the server is ElasticSearch, a powerful tool for indexing and searching data.
Backup process
The assignment began to take shape: I needed to copy the contents of MongoDB and the ElasticSearch indices, as well as the configuration files of each part of Graylog.
Having first stopped the graylog-server and elasticsearch services, I proceeded with the backup.
/etc/init.d/graylog-server stop
/etc/init.d/elasticsearch stop
/etc/init.d/chef-client stop
In my case, the MongoDB database was called graylog2. To get a copy of it, I created a dump of the database with the following command:
logger -s -i "Dumping MongoDB" mkdir -p path-to-backup mongodump -h 127.0.0.1 -d graylog2 -o path-to-backup/
Thus, in the path-to-backup directory, a dump of the “graylog2” database located on localhost is created (you can also specify a remote node).
The next step was to back up and compress the ElasticSearch indices. In our case, about 12 GB of indices had accumulated over 7 months of operation. By default their compression was not configured, although it could have reduced storage requirements severalfold.
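If your ElasticSearch version supports it, stored-field compression can be enabled in elasticsearch.yml; the pre-1.0 releases of that era had an index.store.compress.stored setting for this, but treat the line below as a sketch and check the documentation for your version:
index.store.compress.stored: true    # compress stored fields for newly created indices (version-dependent setting)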
In our case, the directory holding the indices was located on a mounted partition. The path.data parameter in /etc/elasticsearch/elasticsearch.yml specifies the location of the indices. Another important parameter (without it nothing will work at all) is the cluster name, set in the same configuration file by the cluster.name parameter.
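For reference, the relevant fragment of /etc/elasticsearch/elasticsearch.yml might look like this; the values are examples, not taken from the setup described here:
cluster.name: graylog2                 # must match the cluster name graylog-server expects
path.data: /mnt/elasticsearch/data     # the index directory that gets backed up below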
To back up the indices, I used the following command, which packed and compressed the contents of the index directory:
logger -s -i "Dumping MongoDB" tar -zcf path-to-backup/elasticsearch.tar.gz --directory=path-to-indices graylog2
As a result, out of 12 GB of original data, the archive came to 1.8 GB. Not bad at all...
Next, it remained to copy the configuration files of Graylog, MongoDB and ElasticSearch. It should be noted that the ElasticSearch configuration file, elasticsearch.yml, also contained the node.name parameter, which is the hostname of our server. This matters if the Graylog server is restored on a node with a different hostname. Similarly, the Graylog configuration file, graylog2.conf, contained the settings for our specific MongoDB database: the user and password used for access.
I mention all of this because mindlessly copying the configuration files will lead to no good, and that is "not our method, Shurik" (c)
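To make this concrete, these are the kinds of values worth reviewing before reusing the files on another node. The parameter names are the usual ones for Graylog2 and ElasticSearch of that era; the values are placeholders:
# /etc/graylog2.conf (fragment)
mongodb_host = 127.0.0.1
mongodb_database = graylog2
mongodb_user = grayloguser       # adjust for the MongoDB on the target node
mongodb_password = secret
# /etc/elasticsearch/elasticsearch.yml (fragment)
node.name: "graylog-host-01"     # the old hostname; update it if the new node's hostname differs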
After all the configuration files were packed and copied, it remained to transfer them to the backup server. Here everyone is free to do as they like and as their infrastructure requires.
In my case, the copying was done with scp using key-based authentication:
logger -s -i "Copying backups to Backup server" scp -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -r -i /root/.ssh/id_rsa path-to-backup backup-user@backup-server:
logger -s -i "Copying backups to Backup server: DONE"
Summing up the backup process, here are the steps to take (a combined script sketch follows the list):
- Stop the Graylog and ElasticSearch services
- Create a dump (copy) of the MongoDB database
- Archive and copy the ElasticSearch index directory
- Copy the configuration files
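Put together, a minimal backup script along these lines covers all four steps. It is only a sketch assembled from the commands above; path-to-backup, path-to-indices, the backup host and the config copy step are placeholders to adapt to your setup:
#!/bin/bash
# Sketch of the full backup sequence; adjust paths and hosts to your environment.
BACKUP_DIR=path-to-backup
INDEX_DIR=path-to-indices

/etc/init.d/graylog-server stop
/etc/init.d/elasticsearch stop
/etc/init.d/chef-client stop    # keep configuration management from restarting the services

logger -s -i "Dumping MongoDB"
mkdir -p "$BACKUP_DIR"
mongodump -h 127.0.0.1 -d graylog2 -o "$BACKUP_DIR/"
# pack the dump so the restore script can pick up graylog2-mongodump.tar.gz
tar -zcf "$BACKUP_DIR/graylog2-mongodump.tar.gz" --directory="$BACKUP_DIR" graylog2
rm -rf "$BACKUP_DIR/graylog2"   # keep only the archive

logger -s -i "Archiving ElasticSearch indices"
tar -zcf "$BACKUP_DIR/elasticsearch.tar.gz" --directory="$INDEX_DIR" graylog2

# copy the configuration files alongside the archives
cp /etc/graylog2.conf /etc/elasticsearch/elasticsearch.yml "$BACKUP_DIR/"

logger -s -i "Copying backups to Backup server"
scp -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -r -i /root/.ssh/id_rsa \
    "$BACKUP_DIR" backup-user@backup-server:
logger -s -i "Copying backups to Backup server: DONE"

/etc/init.d/elasticsearch start
/etc/init.d/graylog-server start
/etc/init.d/chef-client start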
Graylog server recovery process
Not surprisingly, the recovery process is a mirror image of the backup process.
Below is a small bash script that restores the Graylog server:
/etc/init.d/graylog-server stop
/etc/init.d/elasticsearch stop
scp -r user@backup-server:graylog-backup/* ./
tar zxf graylog2-mongodump.tar.gz
tar zxf elasticsearch.tar.gz
mongorestore -d graylog2 ./graylog2
mv ./elasticsearch/* /opt/elasticsearch/data/
mv ./graylog2.conf /etc/
mv ./elasticsearch.yml /etc/elasticsearch/elasticsearch.yml
/etc/init.d/graylog-server start
/etc/init.d/elasticsearch start
The script copies the archives from the backup server and unpacks them; then the graylog2 database is restored into MongoDB and the ElasticSearch indices are moved to the default directory. The configuration files of ElasticSearch and the Graylog server are also put back in place. After that, the ElasticSearch and graylog-server services are started.
To verify that the recovery succeeded, you can do the following:
- go to the server's web interface and make sure that all Messages, Hosts, Streams and settings are in an identical state
- compare the output of curl -XGET "localhost:9200/graylog2_0/_mapping" before and after the restore (an example follows the list)
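A quick way to do that comparison is to save the mapping on the original server before the backup and diff it against the restored one (graylog2_0 is the index name from the example above):
curl -XGET "localhost:9200/graylog2_0/_mapping" > mapping-before.json   # on the original server
curl -XGET "localhost:9200/graylog2_0/_mapping" > mapping-after.json    # on the restored server
diff mapping-before.json mapping-after.json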
The process is simple and has been tested on several instances, but it is poorly documented. It is also worth noting that with the release of ElasticSearch 1.0 it becomes simpler thanks to the introduction of a procedure for taking snapshots of indices, but that does not change the essence.
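For reference, the snapshot API introduced in ElasticSearch 1.0 looks roughly like this; the repository name and location below are made-up examples, and the version described in this article predates it:
# register a filesystem repository for snapshots
curl -XPUT "localhost:9200/_snapshot/graylog_backup" -d '{
  "type": "fs",
  "settings": { "location": "/mnt/backups/elasticsearch" }
}'
# take a snapshot of all indices into that repository
curl -XPUT "localhost:9200/_snapshot/graylog_backup/snapshot_1?wait_for_completion=true"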
I hope this article helps someone. Thanks for your attention.
P.S. Special thanks to my colleague Siah, who made this script beautiful and amenable to automation. Well, I'm just a lazy topic starter :)