
GlobalsDB is a universal NoSQL database. Part 2

Part 1.

We simulate 4 types of NoSQL databases using GlobalsDB

Before we start modeling different types of NoSQL databases, let's take a look at the globals in a little more detail and define some terms that we will use later.

When saving data in the global element, 3 components are used:
  • Global name
  • Indices, also called subscripts (zero, one or more). They can be text or numeric.
  • Value (which is what is actually stored in the global element). It may be text or numeric.

These three components are often written as an N-ary relational variable as follows:

globalName[subscript1, subscript2, ..subscriptN] = value 

This combination of name, indices, and value is known as a Global Node and is the unit of storage. A global consists of many such nodes, and a database consists of many globals.
An important property of globals is that a single global can contain nodes with different numbers of indices, for example:

 myGlobal["a"] = 123
 myGlobal["b", "c1"] = "foo"
 myGlobal["b", "c2"] = "foo2"
 myGlobal["d", "e1", "f1"] = "bar1"
 myGlobal["d", "e1", "f2"] = "bar2"
 myGlobal["d", "e2", "f1"] = "bar1"
 myGlobal["d", "e2", "f2"] = "bar2"
 myGlobal["d", "e2", "f3"] = "bar3"

Ultimately, a single global is a sparse hierarchical tree of elements. For example, the global above describes the following hierarchical tree:
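
To make the tree shape concrete, here is a minimal in-memory sketch in plain JavaScript. It does not use the GlobalsDB API; the `toTree` helper and the `[subscripts, value]` pair format are assumptions made for illustration only. It folds the flat nodes listed above into a nested object:

```javascript
// In-memory illustration only; this is not the GlobalsDB API.
// Each entry is one global node: [subscripts, value].
const nodes = [
  [["a"], 123],
  [["b", "c1"], "foo"],
  [["b", "c2"], "foo2"],
  [["d", "e1", "f1"], "bar1"],
  [["d", "e1", "f2"], "bar2"],
  [["d", "e2", "f1"], "bar1"],
  [["d", "e2", "f2"], "bar2"],
  [["d", "e2", "f3"], "bar3"],
];

// Hypothetical helper: fold flat nodes into a nested object (the tree).
function toTree(flatNodes) {
  const root = {};
  for (const [subscripts, value] of flatNodes) {
    let branch = root;
    // Walk (and create) intermediate levels for all but the last subscript.
    for (const s of subscripts.slice(0, -1)) {
      if (!(s in branch)) branch[s] = {};
      branch = branch[s];
    }
    branch[subscripts[subscripts.length - 1]] = value;
  }
  return root;
}

const tree = toTree(nodes);
console.log(JSON.stringify(tree, null, 2));
```

This sketch assumes that a node holds either a value or children, but not both; it is a simplification chosen to keep the example short.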



You can create as many globals with different names as you like. In other words, a GlobalsDB database consists of one or more globals, each of which represents a hierarchy of elements.

In such a database, there are no explicit links between different globals, but there may be hidden links that are defined and maintained at the application level.

Also within the database there is no explicit schema or data structure that is stored in globals. The way data is stored in globals is determined at the application level.

Items in globals are created with the set command. The exact syntax of this command depends on the API used.

So, for the Node.JS API, we can create the same global element in two ways.

Asynchronously:

 var globalNode = {
   global: "myGlobal",
   subscripts: ["b", "c3"],
   data: "Rob"
 };
 db.set(globalNode, function(error, results) {
   // etc
 });

or synchronously:

 var globalNode = {
   global: "myGlobal",
   subscripts: ["b", "c3"],
   data: "Rob"
 };
 db.set(globalNode);

One of the interesting properties of GlobalsDB is that you can widely use synchronous programming without compromising performance, which simplifies working with it and allows you to use OO syntax in full inside JavaScript.

This is possible because of the way GlobalsDB is deployed: it runs in-process with Node.js, in memory (unlike many other NoSQL databases, which are reached through network sockets). Combined with deep optimization, this is claimed to give performance comparable to in-memory databases. Most popular NoSQL databases do not offer this combination of properties.

Translator's note: if you do need a network-accessible NoSQL database, it is easy to build one on the same Node.js + GlobalsDB stack, for example by writing a JSON-based API.

Calling the above command will insert the element into our hierarchy and the tree will take on a new look:



Now, keeping the basic properties of globals in mind, let's see how we can use them to represent the typical data structures that NoSQL databases are used to store.

1) Key / Value Storage

The implementation of the Key / Value storage on globals is elementary. We create it using the following structure:

 keyValueStore[key] = value 

For example:

 telephone["211-555-9012"] = "James, George"
 telephone["617-555-1414"] = "Tweed, Rob"

In the form of a hierarchical tree, this structure looks like this:


That is all: the Key / Value store is implemented. However, with globals we can go further and store several attributes for one key. For example:

 telephone[phoneNumber, "name"] = value
 telephone[phoneNumber, "address"] = value

Example with specific data:

 telephone["211-555-9012", "name"] = "James, George"
 telephone["211-555-9012", "address"] = "5308, 12th Avenue, Brooklyn"
 telephone["617-555-1414", "name"] = "Tweed, Rob"
 telephone["617-555-1414", "address"] = "112 Beacon Street, Boston"

We have created a hierarchical tree that looks like this:



Here is the code on the Node.JS API for creating the first entry in this enhanced key / value store:

 var gnode = {
   global: "telephone",
   subscripts: ["617-555-1414", "name"],
   data: "Tweed, Rob"
 };
 db.set(gnode);

NoSQL databases typically do not provide automatic methods for indexing data. However, in a database built on globals, if you need to access data by an alternative key, it is enough to create a second global in which any field can act as the key.

For example, if we need to access the data by the name field, we must create an index global and update it together with the telephone global. Designing and maintaining indexes rests entirely with you as the developer, but it is very simple.

To support the index in the name field, every time we add records to the global telephone, we will create an element in the global nameIndex:

 nameIndex[name, phoneNumber] = "" 

 nameIndex["James, George", "211-555-9012"] = ""
 nameIndex["Tweed, Rob", "617-555-1414"] = ""

Index elements do not need to store a value, so we use an empty string as the value.

The diagram shows a global with phone data and an index global. The dotted line shows the implicit relationship between the index and the main global:



The index global allows us to access data by name, while the main global provides access by phone number.
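
The bookkeeping involved can be sketched in plain JavaScript. This is an in-memory stand-in, not the GlobalsDB API: the two `Map` objects play the role of the telephone and nameIndex globals, and the `addEntry` and `findByName` helpers are hypothetical names introduced for illustration.

```javascript
// In-memory stand-ins for the two globals; not the GlobalsDB API.
const telephone = new Map(); // phoneNumber -> { name, address }
const nameIndex = new Map(); // name -> Set of phoneNumbers

// Hypothetical helper: write the record and its index entry together.
function addEntry(phoneNumber, name, address) {
  telephone.set(phoneNumber, { name, address });
  if (!nameIndex.has(name)) nameIndex.set(name, new Set());
  nameIndex.get(name).add(phoneNumber);
}

// Look up by the alternative key, then read the main store.
function findByName(name) {
  const numbers = nameIndex.get(name) || new Set();
  return [...numbers].map((n) => ({ phoneNumber: n, ...telephone.get(n) }));
}

addEntry("211-555-9012", "James, George", "5308, 12th Avenue, Brooklyn");
addEntry("617-555-1414", "Tweed, Rob", "112 Beacon Street, Boston");

console.log(findByName("Tweed, Rob"));
```

The important design point is that the record and its index entry are always written by the same function, so the two structures cannot drift apart.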

A very important and powerful property of globals is that their elements are stored in sorted order. All the work needed for this happens automatically when an element is saved.

For sequential access to the value of each element, a special iterator method is provided for traversing the global.

If we need to build a telephone directory from our data, we can traverse the elements of the nameIndex global sequentially and fetch the addresses from the telephone global using the get() method.

The iterator method used for traversal is the order function. Here is an example using the Node.js API:

 gnode = {
   global: "nameIndex",
   subscripts: ["James, George"]
 };
 var nextName = db.order(gnode).result;

This code should return the subscript of the element that follows the element with the subscript “James, George” in this global. That is:

 nextName = "Tweed, Rob" 

GlobalsDB is extremely well optimized for traversing indexes in this way, so if you design your indexes well, then data retrieval in the global will be extremely fast.
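
The traversal pattern can be illustrated with a small in-memory sketch. It does not call the real `db.order`; the local `order` function is a hypothetical stand-in, and two conventions in it are assumptions: passing an empty string starts from the first key, and an empty string is returned when there are no more keys. Ordering here is plain JavaScript string comparison, which may differ from the collation GlobalsDB applies.

```javascript
// In-memory sketch; the real db.order call is not used here.
const nameIndexKeys = ["Tweed, Rob", "James, George"];

// Hypothetical stand-in for order: return the key that sorts immediately
// after `current`, or "" when there is none. Passing "" starts from the top.
function order(keys, current) {
  const sorted = [...keys].sort();
  const next = sorted.find((k) => current === "" || k > current);
  return next === undefined ? "" : next;
}

// Walk every name in sorted order.
const visited = [];
let current = order(nameIndexKeys, "");
while (current !== "") {
  visited.push(current);
  current = order(nameIndexKeys, current);
}
console.log(visited);
```

Note that the keys were supplied out of order and still come back sorted, which mirrors the way globals keep their subscripts ordered without any work by the caller.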

That covers the basic ways of using globals as simple Key / Value stores. By the way, our store can be redesigned to use just one global for both data and indexes. To do this, add another first-level subscript, for example:

 telephone["data", phoneNumber, "address"] = address
 telephone["data", phoneNumber, "name"] = name
 telephone["nameIndex", name, phoneNumber] = ""

Since the physical implementation of globals is hidden from us behind an abstraction layer, we can create structures on globals that exactly match our needs. However, if key / value stores grow to an enormous size, we have to consider how a given structure helps or hinders database administration (backup and recovery, maximum database size, distribution across shard servers). These factors may influence the decision whether to store all data in one global or to create several.

Other Key / Value Storage Types

If we look at a Key / Value store such as Redis, we will see that it offers several other ways to store data. Each of them can be implemented very simply on globals.

Lists

Redis lists are linked lists. You can push values onto a list, pop values from a list, get a sublist, and so on.

Modeling such a list on globals is very simple. For example, you can use the following structure:

 list[listName, "firstNode"] = nodeNo
 list[listName, "lastNode"] = nodeNo
 list[listName, "node", nodeNo, "value"] = value
 list[listName, "node", nodeNo, "nextNode"] = nextNodeNo
 list[listName, "node", nodeNo, "previousNode"] = prevNodeNo

For example, a linked list called myList, containing a sequence of values:
  • Rob
  • George
  • John

can be represented as:

 list["myList", "firstNode"] = 5
 list["myList", "lastNode"] = 2
 list["myList", "nodeCounter"] = 5
 list["myList", "node", 2, "previousNode"] = 4
 list["myList", "node", 2, "value"] = "John"
 list["myList", "node", 4, "nextNode"] = 2
 list["myList", "node", 4, "previousNode"] = 5
 list["myList", "node", 4, "value"] = "George"
 list["myList", "node", 5, "nextNode"] = 4
 list["myList", "node", 5, "value"] = "Rob"

or graphically:


This picture shows the sparse nature of globals. The list node number is a sequentially allocated integer. Node 5 is currently the first item in the list, so it has a nextNode attribute holding the number of the next item, but no previousNode attribute.

The middle item, node 4, has attributes holding the numbers of both the previous and the next items.

Each operation that changes a linked list (inserts, extracts, deletes, shortens, etc.) must change several elements within this list, for example:
  • reset pointer to first or last node
  • add or remove item value
  • set the correct values of the next and previous elements to insert or remove an element from the list

For example, to insert the new name “Chris” at the head of the list, we need to change the global in which the list is stored as follows:

 list["myList", "firstNode"] = 6
 list["myList", "lastNode"] = 2
 list["myList", "nodeCounter"] = 6
 list["myList", "node", 2, "previousNode"] = 4
 list["myList", "node", 2, "value"] = "John"
 list["myList", "node", 4, "nextNode"] = 2
 list["myList", "node", 4, "previousNode"] = 5
 list["myList", "node", 4, "value"] = "George"
 list["myList", "node", 5, "nextNode"] = 4
 list["myList", "node", 5, "previousNode"] = 6
 list["myList", "node", 5, "value"] = "Rob"
 list["myList", "node", 6, "nextNode"] = 5
 list["myList", "node", 6, "value"] = "Chris"

Graphic diagram of changes (what has changed is highlighted):



To traverse the list, we start from the first element and follow the number in the nextNode field from element to element, until we reach an element that has no nextNode field:



To find the number of elements in the list, we can traverse it or, for maximum performance, store this number in a separate global element and update it as the list changes:

 list["myList", "count"] = noOfNodes

It is clear that we have to write these operations as methods in order to easily and correctly manipulate the elements in the list, but this is a very simple task.
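
As a rough idea of what such methods might look like, here is a minimal in-memory sketch in plain JavaScript. It is not the GlobalsDB API: a nested object stands in for one list inside the list global, and `createList`, `pushHead`, and `toArray` are hypothetical helper names. Node numbers are allocated from a counter, so they will not match the numbers used in the example above.

```javascript
// In-memory sketch of one list in the list global; not the GlobalsDB API.
// Node numbers are allocated from nodeCounter.
function createList() {
  return { nodeCounter: 0, count: 0, node: {} };
}

// Insert a value at the head of the list.
function pushHead(list, value) {
  const nodeNo = ++list.nodeCounter;
  list.node[nodeNo] = { value };
  if (list.firstNode === undefined) {
    list.lastNode = nodeNo; // the first element is also the last
  } else {
    list.node[nodeNo].nextNode = list.firstNode;
    list.node[list.firstNode].previousNode = nodeNo;
  }
  list.firstNode = nodeNo;
  list.count++;
  return nodeNo;
}

// Follow nextNode pointers from the head until a node has none.
function toArray(list) {
  const values = [];
  let nodeNo = list.firstNode;
  while (nodeNo !== undefined) {
    values.push(list.node[nodeNo].value);
    nodeNo = list.node[nodeNo].nextNode;
  }
  return values;
}

const myList = createList();
["John", "George", "Rob", "Chris"].forEach((v) => pushHead(myList, v));
console.log(toArray(myList), myList.count);
```

Each insert touches the new node, the old head, the firstNode pointer, and the stored count, which is exactly the kind of multi-element update described above.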

Sets

Redis sets are unordered collections of strings. We can easily model them on globals:

 theSet[setName, elementValue] = "" 

You may notice that this is exactly the same structure as the index described earlier. This is how you add an element to the set:

 Set: theSet["mySet", "Rob"] = "" 

Deleting an element from the set:

 Kill: theSet["mySet", "Rob"] 

To determine whether an element is in a set, we can use the data command. It returns 1 if the element is in the set, and 0 if not.

 Data: theSet["mySet", "Rob"] → 1
 Data: theSet["mySet", "Robxxx"] → 0

In the Node.js API, we can use the data method:

 gnode = {
   global: "theSet",
   subscripts: ["mySet", "Rob"]
 };
 var exists = db.data(gnode).defined;

In this example, the exists variable will get the value 1.

We can use the ordering of the global elements to display the members of the set in alphabetical order.

When using globals, there is no significant difference when modeling set and zset Redis sets.
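
The set operations described in this section can be sketched in plain JavaScript. This is an in-memory illustration, not the GlobalsDB API: nested `Map` objects stand in for the theSet global, and `setAdd`, `setKill`, `setData`, and `setMembers` are hypothetical helper names mirroring the Set, Kill, and Data commands.

```javascript
// In-memory sketch of theSet[setName, elementValue] = ""; not the GlobalsDB API.
const theSet = new Map(); // setName -> Map(elementValue -> "")

function setAdd(setName, element) {
  if (!theSet.has(setName)) theSet.set(setName, new Map());
  theSet.get(setName).set(element, ""); // the stored value is irrelevant
}

function setKill(setName, element) {
  const members = theSet.get(setName);
  if (members) members.delete(element);
}

// Mirrors the data command: 1 if the element exists, 0 otherwise.
function setData(setName, element) {
  const members = theSet.get(setName);
  return members && members.has(element) ? 1 : 0;
}

// Globals keep subscripts sorted; a plain Map does not, so sort on read.
function setMembers(setName) {
  const members = theSet.get(setName);
  return members ? [...members.keys()].sort() : [];
}

setAdd("mySet", "Rob");
setAdd("mySet", "George");
setAdd("mySet", "Rob"); // adding the same element twice has no further effect
console.log(setData("mySet", "Rob"), setData("mySet", "Robxxx"));
console.log(setMembers("mySet"));
```

Because the element value is used as a key, uniqueness comes for free: writing the same element again simply overwrites the existing entry.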

Hashes

You've probably already noticed that hashes can be implemented in just the same way as sets. In essence, globals are persistent hash tables.

 Hash[hashName, value] = "" 

2) Tabular (or column) storage

Tabular or columnar NoSQL databases such as BigTable, Cassandra, and Amazon SimpleDB allow you to store data in sparse tables, meaning that each row can contain values in some, but not necessarily all, columns.

SimpleDB in addition allows each cell in the column to contain more than one value.

Again, such stores can be modeled on globals. The following structure provides the basic features of such a store:

 columnStore[columnName, rowId] = value 

Example with specific data:

 user["dateOfBirth", 3] = "1987-01-23"
 user["email", 1] = "rob@foo.com"
 user["email", 2] = "george@foo.com"
 user["name", 1] = "Rob"
 user["name", 2] = "George"
 user["name", 3] = "John"
 user["state", 1] = "MA"
 user["state", 2] = "NY"
 user["telephone", 2] = "211-555-4121"

In the form of a diagram:



Again we rely on the sparse nature of globals. The above global represents the following table (a dash marks an empty cell):

 rowId  name    telephone     email           dateOfBirth  state
 1      Rob     -             rob@foo.com     -            MA
 2      George  211-555-4121  george@foo.com  -            NY
 3      John    -             -               1987-01-23   -

Of course, you can also add indices to this model, for example, by row and by cell value, which must be maintained simultaneously with the main global to store columns, for example:

 userIndex["byRow", rowId, columnName] = ""
 userIndex["byValue", value, columnName, rowId] = ""
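
To show how the column-first layout relates to rows, here is a small in-memory sketch in plain JavaScript. It does not use the GlobalsDB API: the user global is copied into an array of `[columnName, rowId, value]` triples, and `toRows` is a hypothetical helper that pivots them into row objects.

```javascript
// In-memory copy of the user global as [columnName, rowId, value] triples.
const user = [
  ["dateOfBirth", 3, "1987-01-23"],
  ["email", 1, "rob@foo.com"],
  ["email", 2, "george@foo.com"],
  ["name", 1, "Rob"],
  ["name", 2, "George"],
  ["name", 3, "John"],
  ["state", 1, "MA"],
  ["state", 2, "NY"],
  ["telephone", 2, "211-555-4121"],
];

// Hypothetical helper: pivot the column-first nodes into row objects.
function toRows(columnNodes) {
  const rows = {};
  for (const [columnName, rowId, value] of columnNodes) {
    if (!rows[rowId]) rows[rowId] = {};
    rows[rowId][columnName] = value;
  }
  return rows;
}

const rows = toRows(user);
console.log(rows[3]);
```

Missing cells simply never appear as properties, which is the same sparseness the global itself has. Scanning every node to rebuild one row is wasteful on large data, which is the motivation for a by-row index like the one above.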

3) Document-oriented storage

Document-oriented NoSQL databases such as CouchDB and MongoDB store many key / value pairs and nested sets of various attributes.

JSON or JSON-like structures are commonly used to represent the “documents” stored in these databases. GlobalsDB automatically maps JSON documents or objects to globals.

For example, consider the JSON document:

 {key:"value"} 

It can be modeled on globals like:

 Document["key"] = "value" 

In GlobalsDB we can create this document like this:

 var json = {
   node: {
     global: "Document",
     subscripts: []
   },
   object: {
     key: "value"
   }
 };
 db.update(json, "object");

And get it from the database like this:

 var json = db.retrieve({global: "Document"}, "object");
 console.log("Document = " + JSON.stringify(json.object));

Let's take a more complex document:

 {this:{looks:{very:"cool"}}} 

It can be represented by the following element of the global:

 Document["this", "looks", "very"] = "cool" 

And you can create it like this:

 var json = {
   node: {
     global: "Document",
     subscripts: []
   },
   object: {
     this: {
       looks: {
         very: "cool"
       }
     }
   }
 };
 db.update(json, "object");

What about the array?

 ["this","is","cool"] 

It can be represented as:

 Document[1] = "this"
 Document[2] = "is"
 Document[3] = "cool"

GlobalsDB uses objects, not arrays, to create and retrieve data mapping to globals. Therefore, to save the array, we write this:

 var json = {
   node: {
     global: "Document",
     subscripts: []
   },
   object: {
     1: "this",
     2: "is",
     3: "cool"
   }
 };
 db.update(json, "object");

To get an array from the database:

 var json = db.retrieve({global: "Document"}, "object");
 console.log("Document = " + JSON.stringify(json.object));
 // Document = {"1":"this","2":"is","3":"cool"}

Here is a more complex JSON document:

 {
   "age": 26,
   "contact": {
     "address": {
       "city": "Boston",
       "street": "112 Beacon Street"
     },
     "cell": "617-555-1761",
     "email": "rob@foo.com",
     "telephone": "617-555-1212"
   },
   "knows": {
     "1": "George",
     "2": "John",
     "3": "Chris"
   },
   "medications": {
     "1": { "dose": "5mg", "drug": "Zolmitripan" },
     "2": { "dose": "500mg", "drug": "Paracetamol" }
   },
   "name": "Rob",
   "sex": "Male"
 }

It will be mapped onto globals like this:

 person["age"] = 26
 person["contact", "address", "city"] = "Boston"
 person["contact", "address", "street"] = "112 Beacon Street"
 person["contact", "cell"] = "617-555-1761"
 person["contact", "email"] = "rob@foo.com"
 person["contact", "telephone"] = "617-555-1212"
 person["knows", 1] = "George"
 person["knows", 2] = "John"
 person["knows", 3] = "Chris"
 person["medications", 1, "drug"] = "Zolmitripan"
 person["medications", 1, "dose"] = "5mg"
 person["medications", 2, "drug"] = "Paracetamol"
 person["medications", 2, "dose"] = "500mg"
 person["name"] = "Rob"
 person["sex"] = "Male"

Or graphically:



We can create this document in GlobalsDB like this:

 var json = {
   node: {
     global: "person",
     subscripts: []
   },
   object: {
     name: "Rob",
     age: 26,
     knows: {
       1: "George",
       2: "John",
       3: "Chris"
     },
     medications: {
       1: { drug: "Zolmitripan", dose: "5mg" },
       2: { drug: "Paracetamol", dose: "500mg" }
     },
     contact: {
       email: "rob@foo.com",
       address: {
         street: "112 Beacon Street",
         city: "Boston"
       },
       telephone: "617-555-1212",
       cell: "617-555-1761"
     },
     sex: "Male"
   }
 };
 db.update(json, "object");

And retrieve it like this:

 var json = db.retrieve({global: "person"}, "object").object; 
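
The mapping between a nested document and global nodes can be illustrated with a short sketch in plain JavaScript. This is not how GlobalsDB implements the mapping, and it does not call the GlobalsDB API; the `flatten` helper and the `[subscripts, value]` pair format are assumptions made for illustration. Every leaf of the object becomes one node whose subscripts are the path of keys leading to it.

```javascript
// Hypothetical helper; this is not how GlobalsDB implements the mapping.
// Flatten a nested object into [subscripts, value] pairs, one per leaf.
function flatten(object, prefix = []) {
  const nodes = [];
  for (const [key, value] of Object.entries(object)) {
    const subscripts = [...prefix, key];
    if (value !== null && typeof value === "object") {
      nodes.push(...flatten(value, subscripts));
    } else {
      nodes.push([subscripts, value]);
    }
  }
  return nodes;
}

const doc = {
  this: { looks: { very: "cool" } },
  knows: { 1: "George", 2: "John" },
};

// Print each node in the notation used throughout this article.
for (const [subscripts, value] of flatten(doc)) {
  const path = subscripts.map((s) => JSON.stringify(s)).join(", ");
  console.log(`Document[${path}] = ${JSON.stringify(value)}`);
}
```

In this sketch all subscripts are strings, because JavaScript object keys are strings; the numeric subscripts shown in the article's notation would need a conversion step.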

4) Graph databases

NoSQL databases such as Neo4j are used to represent complex networks of relationships in terms of nodes and the connections between them (so-called edges), with key / value pairs attached to both nodes and edges.

The classic use of a graph database is a representation of a social graph. Let's consider the following example:



In this example, the arrows indicate which users know which other users. With globals, this can be represented as follows:

 person[personId, "knows", personId] = ""
 person[personId, "knows", personId, key] = value
 person[personId, "name"] = name

The age of a “knows” relationship can be calculated from a timestamp that is saved when the connection is first created, for example:

 person[1, "knows", 2] = ""
 person[1, "knows", 2, "disclosure"] = "public"
 person[1, "knows", 2, "timestamp"] = "2008-08-16T12:23:01Z"
 person[1, "knows", 7] = ""
 person[1, "name"] = "Rob"
 person[2, "name"] = "John"
 person[7, "knows", 2] = ""
 person[7, "knows", 2, "disclosure"] = "public"
 person[7, "knows", 2, "timestamp"] = "2009-12-16T10:06:44Z"
 person[7, "name"] = "George"

or graphically (red dotted lines indicate the “knows” relationship between users in this model):



In the general case of a graph database, the model represents nodes and the edges between them, like this:

 node[nodeType, nodeId] = ""
 node[nodeType, nodeId, attribute] = attributeValue
 edge[edgeType, fromNodeId, toNodeId] = ""
 edge[edgeType, fromNodeId, toNodeId, attribute] = attributeValue
 edgeReverse[edgeType, toNodeId, fromNodeId] = ""

Thus, the sparse nature and flexibility of globals make it simple to define even very complex graph databases.
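
The purpose of keeping both an edge and an edgeReverse structure can be shown with a minimal in-memory sketch in plain JavaScript. It does not use the GlobalsDB API: two `Map` objects stand in for the globals, only a single edge type is modeled, and `addEdge` is a hypothetical helper name.

```javascript
// In-memory sketch of the edge and edgeReverse globals; not the GlobalsDB API.
const edge = new Map();        // fromNodeId -> Set of toNodeId
const edgeReverse = new Map(); // toNodeId -> Set of fromNodeId

// Hypothetical helper: record an edge in both directions at once.
function addEdge(fromNodeId, toNodeId) {
  if (!edge.has(fromNodeId)) edge.set(fromNodeId, new Set());
  edge.get(fromNodeId).add(toNodeId);
  if (!edgeReverse.has(toNodeId)) edgeReverse.set(toNodeId, new Set());
  edgeReverse.get(toNodeId).add(fromNodeId);
}

// The "knows" edges from the social graph example: 1 = Rob, 2 = John, 7 = George.
addEdge(1, 2);
addEdge(1, 7);
addEdge(7, 2);

const knownByRob = [...(edge.get(1) || [])];          // whom does 1 know?
const whoKnowsJohn = [...(edgeReverse.get(2) || [])]; // who knows 2?
console.log(knownByRob, whoKnowsJohn);
```

Without the reverse structure, answering "who knows John?" would require scanning every outgoing edge in the graph; with it, both directions are a direct lookup.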

5) Models of other databases

Modeling on globals is not limited to NoSQL data models. They can also be used to model:

  • XML DOM / Native XML databases. GlobalsDB is well suited to storing persistent XML DOMs. An XML document is essentially a graph that represents nodes of various types and the relationships between them (for example, firstChild, lastChild, nextSibling, parent, etc.). In essence, this allows GlobalsDB to act as a native XML database. The ewdDOM module for Node.js is one lightweight implementation of such a database.
  • Relational tables. Caché models relational tables on globals, so that standard SQL queries can be used. That is, GlobalsDB can be considered the basis of a NoSQL engine, while Caché adds NoSQL features in the "Not Only SQL" sense on top of it.
  • Object DB. Caché models objects on globals and also provides a direct mapping between objects and relational tables. By now you probably understand how this is implemented.

Unlike the well-known NoSQL databases, GlobalsDB is not a narrowly specialized database. Its properties are general enough that it can support any of the database types above, and even several at the same time if required.

It is as if you had Redis, CouchDB, SimpleDB, Neo4j and a native XML DB running in the same database at the same time!

If you are interested in a NoSQL database working with Node.JS (as well as .NET, Java), you need to take a look at GlobalsDB. This is truly a Universal NoSQL database!

Conclusion

Of course, there are many more ways to use globals than are given in this article. However, I hope this review has demonstrated that they are a convenient, flexible abstraction and can model various NoSQL databases quite simply.

The secret sauce is, of course, the implementation. If it is done correctly and wise design decisions are applied, the performance achieved is astounding.

Thanks
This article is an adaptation of the article by Rob Tweed and George James “Universal NoSql database based on proven and tested technology” (2010).

Source: https://habr.com/ru/post/185472/

