
Peculiarities of using MongoDB



A little over a year ago I was invited to help develop a social network that is well known in narrow circles (though not always in a good way). I was already a fan of the Haxe language by then, so the choice of programming language was not in question. The choice of database was. My experience with MS SQL Server and MySQL suggested that large volumes of data sometimes cause trouble: changing the database structure becomes almost impossible, and queries that were once fast become critically slow. After consulting colleagues who already had experience with Mongo, we decided to use that DBMS. Below I describe the peculiarities that surfaced over the year.

Stability


After a few months of use, an interesting problem appeared. I started building an index on one of the collections and then, realizing the index was wrong, took the liberty of killing the operation with db.killOp(). The index then formally existed, but any attempt to use ANY index on that collection (other than $natural) made the DBMS return an internal error (an assertion fired somewhere in its code). Since the collection held almost 100 million records, queries against it became impossible, and we had to stop the service for several days while we moved the data to a new collection. The old collection still lives in our database: it cannot be dropped by regular means.
As a result, all collections belonging to one service now live in a separate DBMS instance, which in a critical situation can be stopped and repaired at the cost of only that one service's availability.

Backups


We spent a lot of time deciding how to back up the data. The first method that comes to mind, a replica set, did not seem suitable, because sometimes you need not the latest copy of the database at the moment of failure but an earlier one. It would also be nice if taking a backup put no extra load on production. We ended up with a master-slave configuration in which the slave is backed up by regular means from time to time.
Hint: to avoid setting up certificate-based security, you can simply create a user “repl” in the local database of both the master and the slave instances:

 use local
 db.addUser("repl", "mypassword")


Unique indexes


If you are used to the relative safety of SQL, you can get into trouble. We had this incident: we started building a unique index on two fields of a collection, one of which was absent from a large number of records while the other repeated regularly. Because of our mistake, the DBMS began deleting data from the collection. We noticed in time, killed the operation with db.killOp() within a few seconds, and later restored the data. But it left a bad aftertaste. So, as the manual says: use dropDups: true with caution!
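
One way to avoid such a surprise is to look for duplicate key combinations before building the index. Below is a minimal sketch, not the procedure we actually used: the helper name duplicateKeyPipeline and the collection and field names in the usage comment are hypothetical. Keep in mind that a unique index treats a missing field as null, so records lacking a field collide with each other, and that on a very large collection this aggregation is itself expensive and may run into memory limits.

```javascript
// Build an aggregation pipeline that lists key combinations occurring
// more than once. `fields` is the list of field names that the planned
// unique index would cover. (Helper name is hypothetical.)
function duplicateKeyPipeline(fields, maxGroups) {
  var key = {};
  fields.forEach(function (f) { key[f] = "$" + f; });
  return [
    { $group: { _id: key, count: { $sum: 1 } } }, // one group per key combination
    { $match: { count: { $gt: 1 } } },            // keep only repeated combinations
    { $limit: maxGroups || 20 }                   // a small sample is enough to spot trouble
  ];
}

// Usage in the mongo shell (collection and field names are assumptions):
//   db.likes.aggregate(duplicateKeyPipeline(["userId", "pictureId"]))
// Only if nothing comes back is it reasonable to build the index
// without dropDups:
//   db.likes.ensureIndex({ userId: 1, pictureId: 1 }, { unique: true })
```

If the check does return groups, the duplicates can be reviewed and cleaned up deliberately instead of letting dropDups pick what to delete.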

Counters


Counting the elements that match a given filter can take a very long time on large data sets compared to SQL. The reason is that Mongo does not store subtree sizes in its index trees. As a result, even if your query is fully covered by an index, the DBMS has to walk every entry that matches the filter (and there are often a lot of them). So be prepared to store counters separately and update them when needed, or to run background services that recalculate the counter values at a set interval. We use both methods depending on the situation: the first in simple cases, such as the number of comments on a picture, and the second in complex ones, when you need the number of records in a collection that match various sets of conditions.
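
Both approaches can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the collections comments and pictures, the field commentCount, and both function names are hypothetical, and db is the usual mongo shell database handle passed in explicitly.

```javascript
// Simple case: bump a stored counter every time a comment is added.
// (Collection, field, and function names are hypothetical.)
function addComment(db, pictureId, comment) {
  comment.pictureId = pictureId;
  db.comments.insert(comment);
  // $inc modifies the counter atomically within the picture document
  db.pictures.update({ _id: pictureId }, { $inc: { commentCount: 1 } });
}

// Complex case: a background job recomputes the counter from scratch
// and overwrites the stored value.
function recountComments(db, pictureId) {
  var n = db.comments.count({ pictureId: pictureId });
  db.pictures.update({ _id: pictureId }, { $set: { commentCount: n } });
  return n;
}
```

The insert and the counter update are two separate operations, so a failure between them can leave the counter slightly off. An occasional recount like the second function corrects such drift.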

Log rotation


Do not forget about such a “trifle” as the size of the log file. By default, Mongo writes all database operations to it, so on a production system it can quickly eat up all the space on the partition. On non-critical instances (such as a slave intended for backups), I think it is better to disable logging altogether. On the others, set up rotation.
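
Rotation can be triggered from the shell with the logRotate admin command (on Linux, sending SIGUSR1 to the mongod process does the same). A minimal sketch follows; the wrapper name rotateLog is hypothetical, and db is the mongo shell database handle.

```javascript
// Ask the running mongod to close its current log file and open a new one.
// (Wrapper name is hypothetical; logRotate is a standard admin command.)
function rotateLog(db) {
  return db.adminCommand({ logRotate: 1 });
}

// Usage in the mongo shell:
//   rotateLog(db)
// or from a scheduled job, without the wrapper:
//   mongo --eval 'db.adminCommand({ logRotate: 1 })'
```

Rotated files are not deleted automatically, so the scheduled job should also archive or remove old ones.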

Security


Without special configuration, Mongo can easily end up exposed to the Internet without a password. After the initial setup, do not forget to create an administrator user:

 use admin
 db.addUser("root", "mypassword")


Also remember that this DBMS may expose an HTTP page with statistics to the network (see the nohttpinterface option).

Full text search


In general, it works. Just keep in mind that:

  1. languages (Russian in particular) are not supported entirely correctly: Mongo uses a set of heuristic rules to detect word endings, for example, but apparently does not keep exception lists or the like, so do not expect anything particularly clever from this search;
  2. if many records match a search query (say, you searched for a common word), things go badly: queries take a long time (in our case about 30 seconds or more), and it is hard to influence this without making concessions (we apply a limit before the search; some data will not be found, and we accept that).

Therefore, we use Mongo full-text search only because it comes out of the box. If search quality is critical, use something like Sphinx.


Conclusions


Overall, I am pleased with our choice: the horizontal scalability and flexibility of the document-oriented model outweigh the drawbacks.
Our social network now runs on several dozen servers, while the services built by our team are comfortably served by one machine for the server code (written in Haxe with the HaQuery web framework and compiled to Neko), two machines for the production DB, and one for the Mongo slave instances.
The MongoDB version used is 2.4.5.

I hope that if you are considering MongoDB for your project, this article helps you make the right decision.

Source: https://habr.com/ru/post/229129/

