Not so long ago, there was a need to independently evaluate the performance and resource intensity of the increasingly popular noSQL
MongoDB DBMS. For clarity, I decided at the same time to compare it with the performance of PostgreSQL, which is also well-known and actively used.
Habitat
Tests were performed on a virtual server with the following characteristics:
cpu: 4 cores at 2GHz
RAM: 2GB
OS: Centos 6.4
mongoDB
Version: 2.4.3
')
Configuration:logpath=/xxx/mongod.log logappend=true fork = true dbpath=/xxx/mongo pidfilepath = /xxx/mongod.pid
Parameters when starting mongod: OPTIONS=" -f $CONFIGFILE"
PostgreSQL:
Version: 8.4.13
Configuration: shared_buffers = 1GB max_prepared_transactions = 0 work_mem = 64MB maintenance_work_mem = 512MB effective_cache_size = 512MB
Experimental
It was decided to create and add exactly
1,000,000 records to the test collection (and for PostreSQL a table). The algorithm for generating records was the same for MongoDB as for PostreSQL. For a closer approach to "reality", an element of chance was introduced. As a result, the test object had the following structure:
- id is a random integer in the range from 1,000 to 10,000,000
- title - random string (Cyrillic) with a length of 10 to 50 characters
- text - randomly generated text with a length from 1,000 to 2,000 characters
- floatvalue is a random fractional number ranging from 1,000 to 10,000,000
To average the insert time, a million records were split into 4 parts of 250,000 records. At the first stage of the experiment, no indices were used.
The time measurements for the insert gave the following results:
MongoDB(The native JS was used for the generation. The generation time is included in the total insertion time) Insert 250 000 records complete! Total time: 62.788 sec Insert 250 000 records complete! Total time: 62.481 sec Insert 250 000 records complete! Total time: 62.916 sec Insert 250 000 records complete! Total time: 61.565 sec Average time: 62.4375 sec
PostgreSQL(PHP was used for generation. Generation time is not included in the total insert time) psql -d prefTest -f 250k.p1.sql (Total time: 326.377 sec) psql -d prefTest -f 250k.p2.sql (Total time: 326.646 sec) psql -d prefTest -f 250k.p3.sql (Total time: 327.726 sec) psql -d prefTest -f 250k.p4.sql (Total time: 327.039 sec) Average time: 326.947 sec
Experiment I: No index
For the experiment the following requests were taken:
MongoDBdb.tmp.find({id:{$gt:10000}}) /* */
db.tmp.find({floatvalue: {$lt:300000}}) /* */
db.tmp.find({title:/^/}).explain() /* «» */
db.tmp.find({text://}).explain() /* «» */
PostgreSQL (all the same, but in SQL syntax)
select * from tmp where id>10000
select * from tmp where floatvalue<300000
select * from tmp where title like '%'
select * from tmp where text like '%%'
In the course of the experiment, for each request, not only the execution time of requests was measured, but also the average utilization of resources (processor load and RAM).
For a more visual comparison of the speed of operations, the
speed attribute (the number of processed objects / records per second) was entered.
The result of comparing the speed of operations in a table / collection is shown in the diagram:

Here you can see that MongoDB achieves the maximum performance difference due to the insertion speed (
4004 objects per second
versus 765 for PostgreSQL). And the PostgreSQL did not cope with the sampling operation by the
id parameter, all attempts led to
process terminated
(raising the parameters of
shared_buffers , etc. did not lead to anything, I didn’t dig deeper, because it doesn’t matter).
Naturally, this detachment of MongoDB is also due to more active use of CPU time. Average numbers for all
sample queries :
MongoDB- 30% - 40% of CPU load
- 704 - 706 MB of used RAM
While
PostgreSQL- 5% - 25% load on the processor
- 550 - 1000 MB of used RAM
If you average, the use of RAM between these databases is not much difference when sampling. However, when inserting MongoDB, again, it uses it much more actively (
1407 MB , against
745 MB for PostgreSQL).
But, as we remember, these are all results for queries that do not use an index. In the real world, it is difficult to find developers who do not optimize their requests with one or another index. Therefore, the second stage of our experiment was the addition of several indexes to the MongoDB collection and the PostgreSQL table. His results were very ... mm ... unpredictable. I will definitely tell about them in the next part:
MongoDB vs PostgreSQL performance comparison. Part II: Index , as well as summarize the overall results of the test with the resulting plates and graphs.