
MongoDB: Range Query Performance

If you have spent any time with MongoDB indexes, you may have heard this principle: if your queries contain a sort, add the sorted field to the end of the index used by those queries.

In many cases, when queries contain equality conditions like {"name": "Charlie"}, this principle works very well. But what does it say about the following example?

Query: db.drivers.find({"country": {"$in": ["A", "G"]}}).sort({"carsOwned": 1})
Index: {"country": 1, "carsOwned": 1}

This combination is inefficient even though the principle is followed, because there is a trap that the principle can lead you into.
Below we look at where this trap comes from, and by the end of the article you will have a new rule to help you with indexing.

Let's recall the basics from the MongoDB documentation:
* "Index early"
Indexes deserve consideration from the very start of the design. Historically, efficiency at the data access level was handed off to database administrators, which created an optimization layer after the design was done.
With document-oriented databases, this can be avoided.
* "Index often"
Indexed queries perform better by several orders of magnitude, even on small data sets. A query that takes 10 seconds without an index can take close to 0 milliseconds with a suitable one.
* "Index fully"
Queries use indexes from left to right. An index can be used only if the query uses its fields starting from the left, without gaps.
* "Index sorts"
If your query contains a sort, add the sorted field to your index.
* "Commands"
.explain() shows which index is used for a query.
.ensureIndex() creates indexes (in current MongoDB versions it is called .createIndex()).
.getIndexes() and .getIndexKeys() show which indexes you have.
Now back to our question. Based on the basics of indexing, for the following query:
 db.collection.find({"country": "A"}).sort({"carsOwned": 1}) 

We need to create this index:
 db.collection.ensureIndex({"country": 1, "carsOwned": 1}) 
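
To see why this index serves the query without an extra sort, here is a minimal sketch in plain JavaScript. It models the index as an array ordered by country and then by carsOwned. The sample drivers are made up for illustration, and a real MongoDB index is a B-tree rather than an array; only the ordering is the same.

```javascript
// Hypothetical sample data: one object per document in the collection.
const drivers = [
  { name: "Ann", country: "A", carsOwned: 3 },
  { name: "Bob", country: "G", carsOwned: 1 },
  { name: "Cid", country: "A", carsOwned: 1 },
  { name: "Dee", country: "C", carsOwned: 2 },
  { name: "Eve", country: "G", carsOwned: 4 },
  { name: "Fay", country: "A", carsOwned: 2 },
];

// Model of the index {"country": 1, "carsOwned": 1}:
// entries are ordered by country first, then by carsOwned.
const index = [...drivers].sort(
  (a, b) => a.country.localeCompare(b.country) || a.carsOwned - b.carsOwned
);

// An equality condition on country selects one contiguous run of the index...
const result = index.filter((d) => d.country === "A");

// ...and that run is already ordered by carsOwned, so no extra sort is needed.
console.log(result.map((d) => d.carsOwned)); // [ 1, 2, 3 ]
```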

But what if the condition in the query selects a range of values instead of an exact match? Like this:
 db.collection.find({"country": {"$in": ["A", "G"]}}).sort({"carsOwned": 1}) 

Here we used the $in operator, but the same applies to other range operators such as $gt, $lt, and so on.
If you run such a query, you will find that it is inefficient. Remembering the basics, you run .explain() to see which index is used and how.
In the output of .explain() you will see {scanAndOrder: true}, which means that MongoDB sorts the documents itself. This is an expensive operation, because MongoDB sorts the documents in memory. You should therefore avoid it on large data sets, where it is slow and resource intensive.
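
As a rough illustration of that check, the helper below inspects an explain result for signs of an in-memory sort. The function name sortsInMemory is hypothetical. The scanAndOrder field is the one described in this article; the second branch assumes the newer explain format, where the plan is a tree of stages under queryPlanner.winningPlan and an in-memory sort shows up as a SORT stage. The explain objects passed in are hand-written stand-ins, not real server output.

```javascript
// Hypothetical helper: returns true if an explain() result indicates
// that the server sorted the documents in memory.
function sortsInMemory(explainResult) {
  // Older output format, as described in this article.
  if (explainResult.scanAndOrder === true) return true;

  // Assumption: newer output format with a tree of plan stages.
  const hasSortStage = (stage) => {
    if (!stage) return false;
    if (stage.stage === "SORT") return true;
    const children =
      stage.inputStages || (stage.inputStage ? [stage.inputStage] : []);
    return children.some(hasSortStage);
  };
  return hasSortStage(
    explainResult.queryPlanner && explainResult.queryPlanner.winningPlan
  );
}

// Hand-written stand-ins for explain() output.
console.log(sortsInMemory({ scanAndOrder: true })); // true
console.log(sortsInMemory({ scanAndOrder: false })); // false
```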

We know why scanAndOrder is slow, but why does MongoDB sort the result at all when we already have an index that includes the sort field? The answer is simple: we do not have a suitable index.

Why? The reason lies in the structure of the index we created. In the example above, the documents with {"country": "A"} and the documents with {"country": "G"} are each sorted in the index by {"carsOwned": 1}, but they are sorted independently of each other. They are not sorted together! Consider the diagram below:



The left diagram shows the order in which documents are traversed using the index we created. Once all the documents are found, they still have to be sorted.
The right diagram shows the alternative index {"carsOwned": 1, "country": 1}. In this case, the documents found are already in sorted order.
This subtle point about efficiency leads to the following rule for indexing:
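
The difference between the two indexes can be reproduced with a small sketch in plain JavaScript. Both indexes are modeled as sorted arrays over made-up sample drivers; this only imitates the ordering of index entries, not MongoDB's actual B-tree traversal.

```javascript
// Hypothetical sample data.
const drivers = [
  { name: "Ann", country: "A", carsOwned: 3 },
  { name: "Bob", country: "G", carsOwned: 1 },
  { name: "Cid", country: "A", carsOwned: 1 },
  { name: "Dee", country: "C", carsOwned: 2 },
  { name: "Eve", country: "G", carsOwned: 4 },
  { name: "Fay", country: "A", carsOwned: 2 },
];
const wanted = ["A", "G"]; // {"country": {"$in": ["A", "G"]}}
const matches = (d) => wanted.includes(d.country);

// Left diagram: index {"country": 1, "carsOwned": 1}.
const byCountryThenCars = [...drivers].sort(
  (a, b) => a.country.localeCompare(b.country) || a.carsOwned - b.carsOwned
);

// Right diagram: index {"carsOwned": 1, "country": 1}.
const byCarsThenCountry = [...drivers].sort(
  (a, b) => a.carsOwned - b.carsOwned || a.country.localeCompare(b.country)
);

// carsOwned values in the order each index yields the matching documents.
const left = byCountryThenCars.filter(matches).map((d) => d.carsOwned);
const right = byCarsThenCountry.filter(matches).map((d) => d.carsOwned);

console.log(left); // [ 1, 2, 3, 1, 4 ]  two sorted runs, still needs a sort
console.log(right); // [ 1, 1, 2, 3, 4 ]  already sorted
```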

The fields in the index should appear in this order:
1. First, the fields that are matched by exact values.
2. Next, the fields used for sorting.
3. Last, the fields used in range filters.
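
The ordering rule above can be sketched as a small helper that assembles an index key document. The function buildIndexSpec and its parameter names are hypothetical and not part of MongoDB; the sketch relies on JavaScript objects keeping insertion order for non-numeric string keys.

```javascript
// Hypothetical helper: builds an index key document in the order
// equality fields -> sort fields -> range fields.
function buildIndexSpec({ equality = [], sort = {}, range = [] }) {
  const spec = {};
  // 1. Fields matched by exact values.
  for (const field of equality) spec[field] = 1;
  // 2. Fields used for sorting, keeping the requested direction.
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in spec)) spec[field] = direction;
  }
  // 3. Fields used in range filters ($in, $gt, $lt, ...).
  for (const field of range) {
    if (!(field in spec)) spec[field] = 1;
  }
  return spec;
}

// The query from this article: range filter on country, sort by carsOwned.
const spec = buildIndexSpec({ sort: { carsOwned: 1 }, range: ["country"] });
console.log(JSON.stringify(spec)); // {"carsOwned":1,"country":1}
```

The resulting object has the shape that would be passed to .ensureIndex().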

Conclusion
Is there a trade-off? Yes. The query will visit more index nodes than is technically necessary, because the sorted part of the index is traversed before the range filter is applied.
So the new rule is a good rule of thumb for many queries, but do not forget that the complexity of your data can lead to different results.
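
The trade-off can be illustrated by counting how many index entries each index has to touch. This is a deliberately simplified model over generated data: it assumes the sort-first index is walked from start to end, which is a pessimistic picture, while a real server may be able to skip parts of the index.

```javascript
// Hypothetical data: 10 countries ("A".."J") with 3 drivers each.
const docs = [];
for (const country of "ABCDEFGHIJ") {
  for (let carsOwned = 1; carsOwned <= 3; carsOwned++) {
    docs.push({ country, carsOwned });
  }
}
const wanted = ["A", "G"];

// Index {"country": 1, "carsOwned": 1}: the scan can go straight to each
// wanted country, so it touches only matching entries (but must sort after).
const scannedCountryFirst = docs.filter((d) => wanted.includes(d.country)).length;

// Index {"carsOwned": 1, "country": 1}: simplified model that walks every
// entry in carsOwned order and filters by country along the way.
const carsFirstIndex = [...docs].sort(
  (a, b) => a.carsOwned - b.carsOwned || a.country.localeCompare(b.country)
);
let scannedCarsFirst = 0;
const results = [];
for (const entry of carsFirstIndex) {
  scannedCarsFirst++;
  if (wanted.includes(entry.country)) results.push(entry);
}

console.log(scannedCountryFirst, scannedCarsFirst, results.length); // 6 30 6
```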

I hope this guide will help you. Good luck.

Source: https://habr.com/ru/post/147053/

