Greetings, fellow Habr user!

It was evening, there was nothing to do, and then I remembered that I had been meaning to share some recent battle experience with the community.
My task was to automate the backup procedure for a Graylog server and to put together a procedure for restoring it.
The server was new to me; I had never dealt with one before.
Well, I sat down and read up on it, and thought: nothing complicated here. However, Googling showed that this kind of task does not come up every day, because there was practically no information on it.
"We've gotten through worse," I thought; it should all be extremely simple: copy the configuration files and voilà.
Let me make a small digression to describe the Graylog server and its components.
What is a Graylog server?
Graylog2 is an open-source system for collecting and analyzing log statistics that lets you process data quite flexibly. It uses syslog as its transport: data is sent from the nodes via syslog and aggregated by the Graylog server.
MongoDB is used as the database for storing content and settings.
And the bulkiest part of the server is ElasticSearch, a powerful tool for indexing and searching data.
Backup process
The assignment began to take shape: I needed to copy the contents of MongoDB and the ElasticSearch indices, as well as the configuration files of each part of Graylog.
Having first stopped the graylog-server and elasticsearch services, I proceeded with the backup.
/etc/init.d/graylog-server stop
/etc/init.d/elasticsearch stop
/etc/init.d/chef-client stop
In my case, the MongoDB database was called graylog2. To get a copy of it, I created a dump of the database with the following command:
logger -s -i "Dumping MongoDB" mkdir -p path-to-backup mongodump -h 127.0.0.1 -d graylog2 -o path-to-backup/
Thus, in the path-to-backup directory, a dump of the “graylog2” database located on localhost is created (you can also specify a remote node).
The next step was to back up and compress the ElasticSearch indices. In our case, about 12 GB of indices had accumulated over 7 months of operation. By default their compression was not configured, although it could have reduced storage requirements severalfold.
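If your ElasticSearch version supports it, stored-field compression can be enabled in elasticsearch.yml; the pre-1.0 releases of that era had an index.store.compress.stored setting for this, but treat the line below as a sketch and check the documentation for your version:
index.store.compress.stored: true    # compress stored fields for newly created indices (version-dependent setting)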
In our case, the directory holding the indices was located on a mounted partition. The path.data parameter in /etc/elasticsearch/elasticsearch.yml specifies the location of the indices. Another important parameter (without it nothing will work at all) is the cluster name, set in the same configuration file by the cluster.name parameter.
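For reference, the relevant fragment of /etc/elasticsearch/elasticsearch.yml might look like this; the values are examples, not taken from the setup described here:
cluster.name: graylog2                 # must match the cluster name graylog-server expects
path.data: /mnt/elasticsearch/data     # the index directory that gets backed up below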
To back up the indices, I used the following command, which packed and compressed the contents of the index directory:
logger -s -i "Dumping MongoDB" tar -zcf path-to-backup/elasticsearch.tar.gz --directory=path-to-indices graylog2
As a result, out of 12 GB of original data, the archive came to 1.8 GB. Not bad at all...
Next, it remained to copy the configuration files of Graylog, MongoDB and ElasticSearch. It should be noted that the ElasticSearch configuration file, elasticsearch.yml, also contained the node.name parameter, which is the hostname of our server. This matters if the Graylog server is restored on a node with a different hostname. Similarly, the Graylog configuration file, graylog2.conf, contained the settings for our specific MongoDB database: the user and password used for access.
I mention all of this because mindlessly copying the configuration files will lead to no good, and that is "not our method, Shurik" (c)
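To make this concrete, these are the kinds of values worth reviewing before reusing the files on another node. The parameter names are the usual ones for Graylog2 and ElasticSearch of that era; the values are placeholders:
# /etc/graylog2.conf (fragment)
mongodb_host = 127.0.0.1
mongodb_database = graylog2
mongodb_user = grayloguser       # adjust for the MongoDB on the target node
mongodb_password = secret
# /etc/elasticsearch/elasticsearch.yml (fragment)
node.name: "graylog-host-01"     # the old hostname; update it if the new node's hostname differs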
After all the configuration files were packed and copied, it remained to transfer them to the backup server. Here everyone is free to do as they like and as their infrastructure requires.
In my case, the copying was done with scp using key-based authentication:
logger -s -i "Copying backups to Backup server" scp -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -r -i /root/.ssh/id_rsa path-to-backup backup-user@backup-server:
logger -s -i "Copying backups to Backup server: DONE"
Summing up the backup process, here are the steps to take (a combined script sketch follows the list):
- Stop the Graylog and ElasticSearch services
- Create a dump (copy) of the MongoDB database
- Archive and copy the ElasticSearch index directory
- Copy the configuration files
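Put together, a minimal backup script along these lines covers all four steps. It is only a sketch assembled from the commands above; path-to-backup, path-to-indices, the backup host and the config copy step are placeholders to adapt to your setup:
#!/bin/bash
# Sketch of the full backup sequence; adjust paths and hosts to your environment.
BACKUP_DIR=path-to-backup
INDEX_DIR=path-to-indices

/etc/init.d/graylog-server stop
/etc/init.d/elasticsearch stop
/etc/init.d/chef-client stop    # keep configuration management from restarting the services

logger -s -i "Dumping MongoDB"
mkdir -p "$BACKUP_DIR"
mongodump -h 127.0.0.1 -d graylog2 -o "$BACKUP_DIR/"
# pack the dump so the restore script can pick up graylog2-mongodump.tar.gz
tar -zcf "$BACKUP_DIR/graylog2-mongodump.tar.gz" --directory="$BACKUP_DIR" graylog2
rm -rf "$BACKUP_DIR/graylog2"   # keep only the archive

logger -s -i "Archiving ElasticSearch indices"
tar -zcf "$BACKUP_DIR/elasticsearch.tar.gz" --directory="$INDEX_DIR" graylog2

# copy the configuration files alongside the archives
cp /etc/graylog2.conf /etc/elasticsearch/elasticsearch.yml "$BACKUP_DIR/"

logger -s -i "Copying backups to Backup server"
scp -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -r -i /root/.ssh/id_rsa \
    "$BACKUP_DIR" backup-user@backup-server:
logger -s -i "Copying backups to Backup server: DONE"

/etc/init.d/elasticsearch start
/etc/init.d/graylog-server start
/etc/init.d/chef-client start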
Graylog server recovery process
Not surprisingly, the recovery process is a mirror image of the backup process.
Below is a small bash script that restores the Graylog server:
/etc/init.d/graylog-server stop
/etc/init.d/elasticsearch stop
scp -r user@backup-server:graylog-backup/* ./
tar zxf graylog2-mongodump.tar.gz
tar zxf elasticsearch.tar.gz
mongorestore -d graylog2 ./graylog2
mv ./elasticsearch/* /opt/elasticsearch/data/
mv ./graylog2.conf /etc/
mv ./elasticsearch.yml /etc/elasticsearch/elasticsearch.yml
/etc/init.d/graylog-server start
/etc/init.d/elasticsearch start
The script copies the archives from the backup server and unpacks them; then the graylog2 database is restored into MongoDB and the ElasticSearch indices are moved to the default directory. The configuration files of ElasticSearch and the Graylog server are also put back in place. After that, the ElasticSearch and graylog-server services are started.
To verify that the recovery succeeded, you can do the following:
- go to the server's web interface and make sure that all Messages, Hosts, Streams and settings are in an identical state
- compare the output of curl -XGET "localhost:9200/graylog2_0/_mapping" before and after the restore (an example follows the list)
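A quick way to do that comparison is to save the mapping on the original server before the backup and diff it against the restored one (graylog2_0 is the index name from the example above):
curl -XGET "localhost:9200/graylog2_0/_mapping" > mapping-before.json   # on the original server
curl -XGET "localhost:9200/graylog2_0/_mapping" > mapping-after.json    # on the restored server
diff mapping-before.json mapping-after.json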
The process is simple and has been tested on several instances, but it is poorly documented. It is also worth noting that with the release of ElasticSearch 1.0 it becomes simpler thanks to the introduction of a procedure for taking snapshots of indices, but that does not change the essence.
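For reference, the snapshot API introduced in ElasticSearch 1.0 looks roughly like this; the repository name and location below are made-up examples, and the version described in this article predates it:
# register a filesystem repository for snapshots
curl -XPUT "localhost:9200/_snapshot/graylog_backup" -d '{
  "type": "fs",
  "settings": { "location": "/mnt/backups/elasticsearch" }
}'
# take a snapshot of all indices into that repository
curl -XPUT "localhost:9200/_snapshot/graylog_backup/snapshot_1?wait_for_completion=true"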
I hope this article helps someone. Thanks for your attention.
P.S. Special thanks to my colleague Siah, who made this script beautiful and amenable to automation. Well, I'm just a lazy topic starter :)