Low search engine speed
While working on the social information search engine (ark.com), we opted for Elasticsearch: it was easy to configure and use, it had excellent search capabilities and, in general, looked like manna from heaven. And so it was, until our index grew to a fairly respectable size of ~1 billion documents and, with replicas, exceeded 1.5 TB.
Even a trivial Term query could take tens of seconds. There is less ES documentation than we would like, and googling this problem turned up two-year-old results for versions completely irrelevant to ours (we run 0.90.13, which is not all that ancient, but at the moment we cannot afford to take the whole cluster down, upgrade it and restart it; only rolling restarts are an option).
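For a sense of what is meant by a "banal" Term query, here is a minimal example against the search API; the index and field names are purely illustrative, not our real schema:

curl -XGET 'localhost:9200/my_index/_search' -d '{
  "query" : {
    "term" : { "user_id" : "12345" }
  }
}'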
Low indexing speed
The second problem: we index more documents per second (about 100k) than Elasticsearch can handle. Timeouts, huge Write IO load, queues of some 400 processes. It all looks quite frightening in Marvel.
How we solved these problems is described under the cut.
Scaling the Elasticsearch cluster
Initial situation:
- 5 data nodes, http enabled:
  - 100 GB RAM
  - 16 cores
  - 4 TB HDD (7200 RPM, Seagate)
- Indices:
  - from 500 million to 1 billion documents, 5 indices in total
  - number of primary shards from 50 to 400 (we tested different indexing strategies here; this setting is very important)
  - replicas: from 2 to 5
  - index size up to 1.5 TB
Increasing indexing speed in Elasticsearch

This problem was not particularly complicated, and there is a bit more information about it on the Internet.
Checklist of settings to review:
- refresh_interval - how often the search data is refreshed; the more often, the more Write IO you need
- index.translog.flush_threshold_ops - after how many operations the data is flushed to disk
- index.translog.flush_threshold_size - how much data must be added to the index before it is flushed to disk
Detailed documentation here:
www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html

First of all, we increased refresh_interval to 30 seconds, which raised throughput to almost 5,000 documents per second. Later we set flush_threshold_ops to 5,000 operations and the flush size to 500 MB. If you want, you can also play around with the number of replicas, shards and so on, but it will not make that much difference. Also pay attention to the threadpool settings if you need to increase the number of parallel requests to the database, although most often this is not required.
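For illustration, this is roughly how those values can be applied to a live index through the update-settings API linked above; the index name my_index is a placeholder:

curl -XPUT 'localhost:9200/my_index/_settings' -d '{
  "index" : {
    "refresh_interval" : "30s",
    "translog.flush_threshold_ops" : 5000,
    "translog.flush_threshold_size" : "500mb"
  }
}'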
Increasing query speed in Elasticsearch

Now for the difficult part. Knowing the size of our index and the constant need to restart the cluster (version updates, machine maintenance), and also taking into account posts like this one:
gibrown.wordpress.com/2014/02/06/scaling-elasticsearch-part-2-indexing
we decided that a shard in our index should not exceed 1-2 GB. With RF3 and an index planned for 1.5 billion documents (0.5 billion of our documents take up about 300 GB excluding replicas), we created 400 shards in the index and assumed everything would be fine: restart speed would be high enough, since we would not need to read 50-60 GB blocks of data or replicate them, blocking the recovery of smaller shards, and search over small shards is faster.
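As a sketch, the index was created with something along these lines (the index name is a placeholder; with RF3, number_of_replicas is 2, since replicas are counted on top of the primary, and the shard count cannot be changed after creation):

curl -XPUT 'localhost:9200/my_index' -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 400,
      "number_of_replicas" : 2
    }
  }
}'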
At first, the number of documents in the index was small (100-200 million) and queries took only 100-200 ms. But as soon as almost all the shards were filled with at least some documents, query performance dropped significantly. Combined with the high IO load from constant indexing, things became almost unworkable.
Here we made two mistakes:
1. We created too many shards (the ideal ratio is 1 core to 1 shard).
2. Our data nodes also acted as balancers with http enabled; serialization and deserialization of data take a lot of time.
Therefore, we began to experiment.
Adding balancers to Elasticsearch

The first and obvious step for us was to add so-called balancer nodes to Elasticsearch. They aggregate the results of queries from other shards, they never get overloaded on IO, since they neither read from nor write to disk, and they take load off our data nodes.
For deployment we use chef with the corresponding elasticsearch cookbook, so we only had to create a couple of additional roles with the following settings:
name "elasticsearch-balancer" description "Installs and launches elasticsearch" default_attributes( "elasticsearch" => { "node" => { "master" => false, "data" => false } } ) run_list("services::elasticsearch")
We launched 4 balancers without incident. The picture improved a little: we no longer saw overloaded nodes with smoking hard drives, but query speed was still low.
Increasing the number of data nodes in Elasticsearch

At this point we remembered that the number of shards we had (400) does not improve performance at all, but only makes things worse, because too many shards end up on each machine. A simple calculation (16 cores per node) shows that 5 machines can adequately support only 80 shards, while, counting replicas, we have 1,200 of them in total.
Since our overall fleet of machines (80 nodes) allows us to add quite a few more nodes, and their main limitation is HDD size (128 GB in total), we decided to add about 15 machines at once. That lets another 240 shards be handled efficiently.
In addition, we stumbled upon a few interesting settings:
* index.store.type - the default is niofs, which benchmarks slower than mmapfs, so we switched to mmapfs (the default in 1.x)
* indices.memory.index_buffer_size - we increased it to 30% and, conversely, reduced the Java heap to 30 GB (it had been 50% of RAM), since with mmapfs much more RAM is needed for the operating system cache
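As a sketch, after these changes the relevant part of a node's configuration looked roughly like this; note that the heap is set through the ES_HEAP_SIZE environment variable rather than in elasticsearch.yml, and the values are simply the ones described above:

# elasticsearch.yml
index.store.type: mmapfs
indices.memory.index_buffer_size: 30%

# environment of the Elasticsearch process
ES_HEAP_SIZE=30g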
And of course, in our case we had to enable the setting that controls shard placement based on free disk space:
curl -XPUT localhost:9200/_cluster/settings -d '{ "transient" : { "cluster.routing.allocation.disk.threshold_enabled" : true } }'
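The same transient call can also carry the low/high disk watermarks used by this allocator, if you want to tune when shards stop being assigned to a node; the percentages below are illustrative, not values we run in production:

curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "cluster.routing.allocation.disk.threshold_enabled" : true,
    "cluster.routing.allocation.disk.watermark.low" : "85%",
    "cluster.routing.allocation.disk.watermark.high" : "90%"
  }
}'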
After a couple of days of moving shards around and restarting the old servers with the new settings, we ran tests: uncached queries (Term queries, not filters) now completed in no more than 500 ms. The situation is still not ideal, but we can see that adding data nodes and matching the number of cores to the number of shards fixes things.
What else to consider when scaling a cluster

During a rolling restart of the cluster, be sure to disable shard reallocation:
cluster.routing.allocation.enable = none
(in older versions the setting has a slightly different name).
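In practice this is a pair of transient cluster-settings calls around the restart of each node, roughly like this:

# before taking a node down
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : { "cluster.routing.allocation.enable" : "none" }
}'

# after the node has rejoined the cluster
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : { "cluster.routing.allocation.enable" : "all" }
}'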
If any questions come up while reading, I will be glad to discuss them.