MonCaché - implementation of the MongoDB API based on InterSystems Caché


IDEA


The idea of the project is to implement the basic functions of the MongoDB API for searching, saving, updating and deleting documents, so that InterSystems Caché can be used instead of MongoDB without changing the client code.

MOTIVATION


Perhaps, if you take the MongoDB interface and use InterSystems Caché as the data store, you can get some performance gain.

Well, why not?! ¯\_(ツ)_/¯

LIMITATIONS


Since this is a research project, several simplifications were made:
- only basic data types are used:
  - null, boolean, number, string, array, object, ObjectId;
- client code works with MongoDB through the MongoDB driver;
- client code uses the MongoDB Node.js driver;
- client code uses only the basic MongoDB API functions:
  - find, findOne - search for documents;
  - save, insert - save documents;
  - update - update documents;
  - remove - delete documents;
  - count - count documents.

IMPLEMENTATION


As a result, the task was divided into the following subtasks:
- reproduce the MongoDB Node.js driver interface for the selected basic functions;
- implement this interface using InterSystems Caché as the data store:
  - develop a scheme for representing databases in Caché;
  - develop a scheme for representing collections in Caché;
  - develop a scheme for representing documents in Caché;
  - develop a scheme for interacting with Caché from Node.js;
  - implement the developed schemes and test them a little. :)

IMPLEMENTATION DETAILS


The first subtask posed no particular difficulties, so I will go straight to the implementation of the interface.

MongoDB defines a database as a physical container for collections, a collection as a set of documents, and a document as a set of data. A document is similar to a JSON document, but with a larger set of allowed types - BSON.

In InterSystems Caché, all data is stored in globals. Put simply, you can think of globals as hierarchical data structures.

In this project, all data is stored in a single global - ^MonCache.

Thus, we need to develop a scheme for representing databases, collections and documents with hierarchical data structures.

Database representation scheme in Caché

In MongoDB, one instance can host several databases, so the representation scheme has to store several databases isolated from each other. It is also important to note that MongoDB supports databases without collections (hereinafter referred to as "empty" databases).

I chose the simplest and most obvious way to solve the problem. Databases are represented by first-level nodes of the global ^MonCache. In addition, such a node is assigned the value "" in order to support "empty" databases. If you do not do this and only add child nodes, then as soon as all child nodes are deleted, the parent node disappears as well (this is how globals work).

In summary, each database is represented in Caché as follows:

^MonCache(<db>) = "" 

For example, the database "my_database" would be represented as:

 ^MonCache("my_database") = "" 

Collection representation scheme in Caché

MongoDB defines a collection as an element of a database. All collections in one database have unique names, so the name can be used to identify a collection. This fact gave me an easy way to represent collections in the global, namely as second-level nodes. Two small problems remain. The first is that, like databases, collections can be empty. The second is that a collection is a set of documents, and all documents must be isolated from each other. Honestly, I could not think of anything better than storing a counter, something like an auto-increment value, as the value of the collection node. Every document gets its own unique number. When a new document is inserted into the collection, a node is created whose name equals the current counter value, and then the counter is increased by 1.

In summary, each collection is represented in Caché as follows:

 ^MonCache(<db>) = ""
 ^MonCache(<db>, <collection>) = 0

For example, the representation of the collection “my_collection” in the database “my_database” would be:

 ^MonCache("my_database") = ""
 ^MonCache("my_database", "my_collection") = 0
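
The database and collection schemes above can be sketched in plain JavaScript. This is only a model under stated assumptions: the `globalNodes` map, the `key` helper and the three functions are hypothetical names that stand in for the global, not the Caché API or the MonCaché code. The counter follows the prose literally (use the current value, then add 1); whether the real implementation numbers the first document 0 or 1 is not settled by this sketch.

```javascript
// In-memory stand-in for the ^MonCache global: a Map from a JSON-encoded
// subscript list to the node value. This is a model, not the Caché API.
const globalNodes = new Map();
const key = (...subscripts) => JSON.stringify(subscripts);

// ^MonCache(<db>) = ""  keeps the database alive even without collections.
function createDatabase(db) {
  if (!globalNodes.has(key(db))) globalNodes.set(key(db), '');
}

// ^MonCache(<db>, <collection>) = 0  the value doubles as a document counter.
function createCollection(db, collection) {
  createDatabase(db);
  if (!globalNodes.has(key(db, collection))) {
    globalNodes.set(key(db, collection), 0);
  }
}

// Reserve a node number for a new document: take the current counter value,
// then increase the counter by 1 (assumption: taken literally from the text).
function allocateDocumentNumber(db, collection) {
  createCollection(db, collection);
  const current = globalNodes.get(key(db, collection));
  globalNodes.set(key(db, collection), current + 1);
  return current;
}

createCollection('my_database', 'my_collection');
// Both nodes now exist: the database node holds the empty string,
// the collection node holds the counter 0.
console.log(globalNodes.has(key('my_database')));
console.log(globalNodes.get(key('my_database', 'my_collection')));
```

Because the database node carries its own value, deleting every collection in this model would still leave the database entry in place, which is the point of assigning "" to it.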

Document representation scheme in Caché

A document in this project is a JSON document extended with one additional type - ObjectId. I had to develop a scheme for representing documents with hierarchical data structures, and here I ran into a few surprises. First, there is no way to use a native null in Caché, since Caché does not support null. Second, boolean values are implemented as the constants 0 and 1; roughly speaking, true is 1 and false is 0. The most expected difficulty was working out how to store an ObjectId. All these problems were solved in what seemed to me the simplest possible way. Below I go through each data type and its representation.

Representation schemes
For a more concise notation, I will use a special symbol - @.
Instead of ^MonCache(<db>, <collection>, <document id>, ...) I will simply write
@(...).

Suppose there is a field f of type "null".

 f: null

We define the following representation for it:

 @("f", "t") = "null"

Suppose there is a field f of type "boolean" (value true).

 f: true

We define the following representation for it:

 @("f", "t") = "boolean"
 @("f", "v") = 1

Suppose there is a field f of type "boolean" (value false).

 f: false

We define the following representation for it:

 @("f", "t") = "boolean"
 @("f", "v") = 0

Suppose there is a field f of type "number".

 f: 3.14

We define the following representation for it:

 @("f", "t") = "number"
 @("f", "v") = 3.14

Suppose there is a field f of type "string".

 f: 'Habrahabr.ru'

We define the following representation for it:

 @("f", "t") = "string"
 @("f", "v") = "Habrahabr.ru"

Suppose there is a field f of type "ObjectId".

 f: ObjectId('56b43c20af9c4f3fe2cc2908')

We define the following representation for it:

 @("f", "t") = "objectid"
 @("f", "v") = "56b43c20af9c4f3fe2cc2908"

Two types are left: "object" and "array". By their nature, these types are "containers" for values of simpler types. Therefore, you can recursively apply the rules already described to get the representations of the elements of these containers. The only subtle point is preserving the order of the elements in an array. This is solved trivially: the elements are numbered in traversal order, and the representation is built in the same order.

Suppose there is a field f of type "object" (empty).

 f: {}

We define the following representation for it:

 @("f", "t") = "object"

Suppose there is a field f of type "object".

 f: { site: 'Habrahabr.ru', topic: 276391 }

We define the following representation for it:

 @("f", "t") = "object"
 @("f", "v", "site", "t") = "string"
 @("f", "v", "site", "v") = "Habrahabr.ru"
 @("f", "v", "topic", "t") = "number"
 @("f", "v", "topic", "v") = 276391

Suppose there is a field f of type "array" (empty).

 f: []

We define the following representation for it:

 @("f", "t") = "array"

Suppose there is a field f of type "array".

 f: [ 'Habrahabr.ru', 276391 ]

We define the following representation for it:

 @("f", "t") = "array"
 @("f", "v", 0, "t") = "string"
 @("f", "v", 0, "v") = "Habrahabr.ru"
 @("f", "v", 1, "t") = "number"
 @("f", "v", 1, "v") = 276391
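
The rules above can be summarized as one recursive function. The following is a minimal sketch, not the MonCaché source: `encodeField` and the small `ObjectId` class are hypothetical names introduced for illustration, and the function only produces a list of subscript/value pairs relative to @(...) instead of writing to Caché.

```javascript
// Hypothetical stand-in for the driver's ObjectId type (an assumption).
class ObjectId {
  constructor(hex) { this.hex = hex; }
}

// Returns [subscripts, value] pairs for one field, relative to @(...).
function encodeField(name, value, prefix = []) {
  const path = [...prefix, name];
  const nodes = [];
  const tag = (t) => nodes.push([[...path, 't'], t]); // the "t" (type) node
  const val = (v) => nodes.push([[...path, 'v'], v]); // the "v" (value) node

  if (value === null) {
    tag('null'); // null has a type node only
  } else if (typeof value === 'boolean') {
    tag('boolean'); val(value ? 1 : 0); // true is 1, false is 0
  } else if (typeof value === 'number') {
    tag('number'); val(value);
  } else if (typeof value === 'string') {
    tag('string'); val(value);
  } else if (value instanceof ObjectId) {
    tag('objectid'); val(value.hex);
  } else if (Array.isArray(value)) {
    tag('array');
    // Elements are numbered in traversal order to preserve their order.
    value.forEach((item, index) => {
      nodes.push(...encodeField(index, item, [...path, 'v']));
    });
  } else if (typeof value === 'object') {
    tag('object');
    for (const [childName, child] of Object.entries(value)) {
      nodes.push(...encodeField(childName, child, [...path, 'v']));
    }
  } else {
    throw new TypeError(`Unsupported value type for field ${String(name)}`);
  }
  return nodes;
}

// Print the nodes for the array example in the @(...) notation.
const nodes = encodeField('f', ['Habrahabr.ru', 276391]);
for (const [subscripts, value] of nodes) {
  const path = subscripts.map((s) => JSON.stringify(s)).join(', ');
  console.log(`@(${path}) = ${JSON.stringify(value)}`);
}
```

Empty containers need no special case here: an empty object or array yields only its "t" node, which matches the empty examples above.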


Caché interaction scheme

The logical and simple choice of driver for working with InterSystems Caché was the Node.js driver (the documentation website lists other drivers for interacting with Caché). However, the driver's capabilities turned out to be insufficient: I wanted to perform several inserts, all within one transaction. Therefore, I decided to develop a set of Caché ObjectScript classes that emulate the MongoDB API on the Caché side.

The Caché Node.js driver could not access classes in Caché, but it could call routines in Caché. This led to writing a small routine - a kind of bridge between the driver and the classes in Caché.

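As a result, the scheme looked as follows: the client code calls the MonCaché Node.js driver, which uses the Caché Node.js driver to call the bridge routine, which in turn calls the Caché ObjectScript classes that work with the global ^MonCache.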


As part of the project, a special format, NSNJSON (Not So Normal JSON), was developed, which made it possible to pass ObjectId, null, true and false through the driver into Caché. The format is described on its GitHub page, NSNJSON. I posted three articles about this format on Habrahabr:

- Complicated simplified JSON;
- JSON for lovers of braces;
- NSNJSON. 道 (Final article).

MONCACHÉ CAPABILITIES


When executing a document search operation, the following criteria are supported:

- $eq - equal to;
- $ne - not equal to;
- $not - negation of a criterion;
- $lt - less than;
- $gt - greater than;
- $exists - existence of a field.
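
To make the meaning of these criteria concrete, here is a minimal sketch of a matcher over plain JavaScript objects. It is an illustration under simplifying assumptions, not the MonCaché implementation (which evaluates queries on the Caché side): `matches` and `matchesCriterion` are hypothetical names, only top-level fields with scalar values are handled, and equality is strict JavaScript equality.

```javascript
// Checks one field against a criterion such as { $gt: 5 } or a plain value.
// `present` tells whether the field exists in the document at all.
function matchesCriterion(present, value, criterion) {
  const isOperatorObject = criterion !== null && typeof criterion === 'object' &&
    Object.keys(criterion).some((name) => name.startsWith('$'));
  // A plain value is treated as an implicit $eq.
  if (!isOperatorObject) return present && value === criterion;

  // Every operator in the criterion object has to hold.
  return Object.entries(criterion).every(([operator, argument]) => {
    switch (operator) {
      case '$eq': return present && value === argument;
      case '$ne': return !(present && value === argument);
      case '$lt': return present && value < argument;
      case '$gt': return present && value > argument;
      case '$exists': return present === Boolean(argument);
      case '$not': return !matchesCriterion(present, value, argument);
      default: throw new Error(`Unsupported operator: ${operator}`);
    }
  });
}

// A document matches when every field criterion in the query holds.
function matches(doc, query) {
  return Object.entries(query).every(([field, criterion]) => {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    return matchesCriterion(present, doc[field], criterion);
  });
}

const docs = [{ site: 'Habrahabr.ru', topic: 276391 }, { site: 'example.org' }];
// Only the first document has a topic greater than 1000.
console.log(docs.filter((doc) => matches(doc, { topic: { $gt: 1000 } })));
// Only the second document has no topic field.
console.log(docs.filter((doc) => matches(doc, { topic: { $exists: false } })));
```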

The following operators are supported by the document update operation:

- $set - set a value;
- $inc - increase a value by the specified amount;
- $mul - multiply a value by the specified amount;
- $unset - delete a value;
- $rename - rename a field.
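
The effect of these operators can be illustrated on a plain JavaScript object. This is a sketch with a hypothetical `applyUpdate` function, not the MonCaché code. It handles top-level fields only, and for missing fields it follows the documented MongoDB behavior ($inc starts from 0, $mul produces 0); whether MonCaché treats missing fields the same way is an assumption.

```javascript
const hasField = (obj, name) => Object.prototype.hasOwnProperty.call(obj, name);

// Returns an updated shallow copy of `doc`; the original is left untouched.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [operator, fields] of Object.entries(update)) {
    for (const [field, argument] of Object.entries(fields)) {
      switch (operator) {
        case '$set':
          result[field] = argument;
          break;
        case '$inc': // a missing field is treated as 0
          result[field] = (hasField(result, field) ? result[field] : 0) + argument;
          break;
        case '$mul': // a missing field is treated as 0, so the result is 0
          result[field] = (hasField(result, field) ? result[field] : 0) * argument;
          break;
        case '$unset':
          delete result[field];
          break;
        case '$rename': // `argument` is the new field name
          if (hasField(result, field)) {
            result[argument] = result[field];
            delete result[field];
          }
          break;
        default:
          throw new Error(`Unsupported operator: ${operator}`);
      }
    }
  }
  return result;
}

const updated = applyUpdate(
  { site: 'Habrahabr.ru', topic: 276390 },
  { $inc: { topic: 1 }, $rename: { site: 'host' } }
);
// The topic is increased by 1 and the site field is now called host.
console.log(updated);
```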

EXAMPLE


I took this code from the official driver's page and modified it slightly.

 var insertDocuments = function(db, callback) {
   // Get the "documents" collection and insert a single document
   var collection = db.collection('documents');
   collection.insertOne({ site: 'Habrahabr.ru', topic: 276391 }, function(err, result) {
     assert.equal(err, null);
     console.log("Inserted 1 document into the document collection");
     callback(result);
   });
 }

 var MongoClient = require('mongodb').MongoClient
   , assert = require('assert');

 var url = 'mongodb://localhost:27017/myproject';

 MongoClient.connect(url, function(err, db) {
   assert.equal(null, err);
   console.log("Connected correctly to server");
   // Call the function under the name it was defined with above
   insertDocuments(db, function() {
     db.close();
   });
 });

This code can be easily redone so that it works with MonCaché!
You just need to change the driver!

 // var MongoClient = require('mongodb').MongoClient
 var MongoClient = require('moncache-driver').MongoClient

After executing this code, the global ^MonCache will look like this:

 ^MonCache("myproject","documents")=1
 ^MonCache("myproject","documents",1,"_id","t")="objectid"
 ^MonCache("myproject","documents",1,"_id","v")="b18cd934860c8b26be50ba34"
 ^MonCache("myproject","documents",1,"site","t")="string"
 ^MonCache("myproject","documents",1,"site","v")="Habrahabr.ru"
 ^MonCache("myproject","documents",1,"topic","t")="number"
 ^MonCache("myproject","documents",1,"topic","v")=276391

DEMO


Among other things, a small demo application (source) was launched. It is also implemented in Node.js and demonstrates switching the driver from MongoDB Node.js to MonCaché Node.js without restarting the server or changing the source code. The application is a tiny demo platform for performing CRUD operations on products and offices, plus an interface for changing the configuration (switching drivers).

The server allows you to create products and offices that are saved to the storage selected in the configuration (Caché or MongoDB).

The Orders tab displays a list of orders. I created the records but did not finish the form; you are welcome to help the project (source code).

You can change the configuration by going to the "Configuration" page. On the page there are two buttons "MongoDB" and "MonCache". By clicking on the appropriate button you choose the configuration you need. When changing the configuration, the client application reconnects to the data source (an abstraction that separates the application from the driver actually used).
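
The data source abstraction mentioned above can be sketched as follows. This is not the demo application's code: `DataSource` and `makeFakeDriver` are hypothetical names, and the two "drivers" are synchronous in-memory stand-ins so that the example runs on its own. The real drivers are asynchronous and would be connected and closed when the configuration changes.

```javascript
// Hypothetical in-memory stand-in for a driver (an assumption, for illustration).
function makeFakeDriver(label) {
  const docs = [];
  return {
    label,
    insert(doc) { docs.push(doc); },
    count() { return docs.length; },
  };
}

// The application talks only to the data source; the driver behind it
// can be swapped at runtime without touching the calling code.
class DataSource {
  constructor(drivers, initialName) {
    this.drivers = drivers;
    this.use(initialName);
  }

  // Switch to another configuration, e.g. after a button click.
  use(name) {
    if (!Object.prototype.hasOwnProperty.call(this.drivers, name)) {
      throw new Error(`Unknown configuration: ${name}`);
    }
    this.current = this.drivers[name];
  }

  insert(doc) { this.current.insert(doc); }
  count() { return this.current.count(); }
}

const dataSource = new DataSource(
  { MongoDB: makeFakeDriver('MongoDB'), MonCache: makeFakeDriver('MonCache') },
  'MongoDB'
);
dataSource.insert({ name: 'product A' }); // goes to the MongoDB stand-in
dataSource.use('MonCache');               // switch the configuration
dataSource.insert({ name: 'office B' });  // goes to the MonCache stand-in
console.log(dataSource.current.label, dataSource.count());
```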

CONCLUSION


In conclusion, I will answer the main question. Yes! It was indeed possible to get some performance increase for the basic operations.

The MonCaché project is published on GitHub and is available under the MIT license.

BRIEF INSTRUCTIONS


  1. Install Caché
  2. Load all the necessary MonCaché components into Caché
  3. Create a MONCACHE namespace in Caché
  4. Create a moncache user in Caché with the password ehcacnom
  5. Create an environment variable MONCACHE_USERNAME = moncache
  6. Create an environment variable MONCACHE_PASSWORD = ehcacnom
  7. Create an environment variable MONCACHE_NAMESPACE = MONCACHE
  8. Replace the mongodb dependency in your project with moncache-driver
  9. Run your project! :-)
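
Before running the project, it can help to check that the three environment variables from steps 5-7 are actually set. The helper below is a hypothetical sketch (`readMonCacheConfig` is not part of moncache-driver, and how the driver itself reads these variables is not shown here); it only validates the variables and returns them as an object.

```javascript
// Reads the MonCaché settings from an environment object (process.env by default)
// and fails early with a readable message if any of them is missing.
function readMonCacheConfig(env = process.env) {
  const names = ['MONCACHE_USERNAME', 'MONCACHE_PASSWORD', 'MONCACHE_NAMESPACE'];
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    username: env.MONCACHE_USERNAME,
    password: env.MONCACHE_PASSWORD,
    namespace: env.MONCACHE_NAMESPACE,
  };
}

// Example with the values from the instructions, passed explicitly.
const config = readMonCacheConfig({
  MONCACHE_USERNAME: 'moncache',
  MONCACHE_PASSWORD: 'ehcacnom',
  MONCACHE_NAMESPACE: 'MONCACHE',
});
console.log(config.namespace);
```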

INTERSYSTEMS ACADEMIC PROGRAM


If you are interested in carrying out your own research project on InterSystems technologies, you can visit the website dedicated to InterSystems academic programs.

Source: https://habr.com/ru/post/276391/

