From the translator
In my opinion, there is little information on Habr about the free GlobalsDB NoSQL database. The original article was written in August 2011 and is still relevant. This is the first part. The second part describes how to use GlobalsDB to model various types of NoSQL databases.
Introduction
GlobalsDB is a free database that stores data using the mechanism of global persistent variables: variables whose values are automatically saved to disk. They are an abstraction over B-trees and can be used to store large amounts of data represented as multi-dimensional sparse arrays.
With global persistent variables (usually called “globals”, which is why the database is called GlobalsDB), you can expressively and efficiently solve all the usual tasks for which various NoSQL databases are customarily used today.
The heart of GlobalsDB is the same engine that is used in the Caché database.
The key advantage of developing on GlobalsDB is that it is based on a mature, proven technology with repeatedly demonstrated scalability and extremely high performance in such important and demanding areas of business as healthcare and financial services.
Unlike Caché, GlobalsDB is free to use, develop with and distribute without any restrictions, which makes the platform very attractive (although it is not an open source product).
This article explains how GlobalsDB can be used to model and store data of various types of NoSQL databases:
- Key / Value
- Tabular / Column Storage
- Document Oriented Storage
- Graph database.
It also shows how, as a result, to get the maximum benefit: NoSQL capabilities coupled with the performance, reliability and maturity that critical business applications need.
GlobalsDB is accessed through a high-performance in-process API that is available for Node.JS (Javascript), .NET, and Java.
Commands are not sent to GlobalsDB over ports and sockets, which would slow down work with the database; instead, the application process calls the GlobalsDB code directly. For maximum performance, GlobalsDB is attached to the application process.
Access from other languages will apparently be implemented in the future.
In this article, we will use the Javascript interface for Node.JS to access GlobalsDB.
Note: The globals in this article are not variables with a global scope. In terms of GlobalsDB, globals are structures for storing data.
A brief introduction to NoSQL
The term NoSQL is just a few years old. It was coined to denote storage technologies that meet the stringent requirements of web or Internet scale.
Simply put, web scale has three features:
- A lot of data, or Big Data. The largest web applications (for example, Twitter, Facebook, Google) operate on volumes of data several orders of magnitude larger than was previously considered feasible for databases.
- A huge number of users, counted in millions, accessing the system constantly and simultaneously.
- Complicated data structures: web applications usually work with something more complicated than ordinary tables.
Relational databases, which have dominated since the 1980s, began to show their weaknesses in these three areas when applied at web scale. Developers therefore began to look for alternatives.
Demand creates supply: NoSQL databases started to appear. Despite the variety of NoSQL database types, they all share the following features:
- To process huge amounts of data, the data is distributed across servers. This is called sharding.
- To serve a huge number of users, the load is distributed across servers, i.e. parallel processing is used.
- A simpler database design without a predefined schema is used.
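As a minimal illustration of sharding, the sketch below shows one common way to choose a server for a record: hash the key and take the remainder modulo the number of servers. The function name `shardFor` and the hash itself are assumptions made for this example; they are not taken from any particular NoSQL product.

```javascript
// Illustrative only: map a key to one of `shardCount` servers.
// `shardFor` is a hypothetical helper, not part of any database API.
function shardFor(key, shardCount) {
  let hash = 0;
  for (const ch of String(key)) {
    // Simple polynomial string hash, kept as an unsigned 32-bit integer.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash % shardCount;
}

// Usage: route a few keys to one of 4 hypothetical servers.
const routes = ['user:1', 'user:2', 'order:77'].map((k) => [k, shardFor(k, 4)]);
console.log(routes);
```

Plain modulo is used here only for brevity. Real systems often prefer schemes such as consistent hashing, so that adding a server moves fewer keys.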
All the most successful and best-known NoSQL databases have been developed from scratch over the past few years. Strangely, it seems that no one looked for existing, well-implemented database technologies that could provide a solid foundation for working at web scale.
This article aims to demonstrate that GlobalsDB can be an excellent basis for creating NoSQL databases using existing, proven technology.
Combined with the extremely high performance of Node.JS, GlobalsDB provides an ideal, powerful, production-grade platform for implementing NoSQL databases.
Overview of globals
Let me remind you again that by globals we mean structures for data storage, and by GlobalsDB a database built on globals.
Globals:
- have no predetermined structure (they are schema-free). Unlike tables, globals need no definition (CREATE TABLE) before you can start working with them.
- are hierarchically structured (this refers to the internal structure of each individual global)
- are sparse
- are dynamic
You can think of them as automatically persisted associative arrays, and you will not be far from the truth.
Some examples of globals:
myTable["101-22-2238", "Chicago", 2] = "Some information"
account["New York", "026002561", 35120218433001] = 123456.45
Each global has a name (just as arrays have names). The name is followed by several indices, whose values can be numeric or string. You can have any number of indices; the only restriction is on their total length, and that limit is quite large.
Each element of a global (identified by the name of the global and a combination of indices) stores a text string. Empty strings are allowed.
You can create or delete elements of a global whenever you want. Everything happens dynamically and requires no declarations or data schema definitions.
All high-level abstractions and data schemas exist only in the developer’s head; physically, the data is stored in globals.
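To make this description more concrete, here is a minimal in-memory sketch of how a global behaves: a named, sparse structure addressed by a list of numeric or string indices, where each element holds a text string. It is not the GlobalsDB API. The class `GlobalModel` and its methods are hypothetical names, nothing is written to disk, and removing an element together with everything below it is a modelling assumption based on the hierarchical structure mentioned above.

```javascript
// In-memory model of a global. NOT the GlobalsDB API: all names are hypothetical.
class GlobalModel {
  constructor(name) {
    this.name = name;
    this.nodes = new Map(); // encoded index list -> string value
  }

  // Encode the index list so that 2 and "2" stay distinct.
  static encode(indices) {
    return JSON.stringify(indices);
  }

  // Create or overwrite an element; values are stored as text strings.
  set(indices, value) {
    this.nodes.set(GlobalModel.encode(indices), String(value));
  }

  // Returns undefined for elements that were never set (sparseness).
  get(indices) {
    return this.nodes.get(GlobalModel.encode(indices));
  }

  has(indices) {
    return this.nodes.has(GlobalModel.encode(indices));
  }

  // Assumption of this model: removing an element also removes its descendants.
  remove(indices) {
    if (indices.length === 0) {
      this.nodes.clear();
      return;
    }
    const prefix = GlobalModel.encode(indices).slice(0, -1); // drop closing "]"
    for (const key of [...this.nodes.keys()]) {
      if (key === prefix + ']' || key.startsWith(prefix + ',')) {
        this.nodes.delete(key);
      }
    }
  }
}

// Usage: the "account" example from the text.
const account = new GlobalModel('account');
account.set(['New York', '026002561', 35120218433001], 123456.45);
console.log(account.get(['New York', '026002561', 35120218433001]));
```

No schema is declared anywhere: an element exists only after it has been set, and any combination of indices can be used at any time.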
Globals do not provide secondary indexes. To ensure fast searches and queries, the developer must therefore create and maintain additional elements in globals that play the role of secondary indexes.
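The idea of a hand-maintained secondary index can be sketched as follows. Two plain `Map` objects stand in for two globals: `person`, holding the records, and `personByCity`, holding the index. All names here are hypothetical and nothing below is the GlobalsDB API; the point is only that every write to the data must also update the index.

```javascript
// Hypothetical in-memory stand-ins for two globals; not the GlobalsDB API.
const person = new Map();       // person[id] = JSON text of the record
const personByCity = new Map(); // personByCity[city, id] = "" (the index)

// Encode a list of indices as a single map key.
const key = (...indices) => JSON.stringify(indices);

function savePerson(id, name, city) {
  const old = person.get(key(id));
  if (old !== undefined) {
    // Drop the stale index entry before writing the new one.
    personByCity.delete(key(JSON.parse(old).city, id));
  }
  person.set(key(id), JSON.stringify({ name, city }));
  // Empty value: the indices themselves carry the information.
  personByCity.set(key(city, id), '');
}

function findIdsByCity(city) {
  // This sketch scans all index keys; it only models the lookup result.
  const prefix = JSON.stringify([city]).slice(0, -1) + ',';
  const ids = [];
  for (const k of personByCity.keys()) {
    if (k.startsWith(prefix)) ids.push(JSON.parse(k)[1]);
  }
  return ids;
}

// Usage
savePerson(1, 'Ann', 'Chicago');
savePerson(2, 'Bob', 'Boston');
savePerson(3, 'Cy', 'Chicago');
console.log(findIdsByCity('Chicago'));
```

Because the index is ordinary data, forgetting to update it on a change or delete leaves it out of step with the records, which is why `savePerson` removes the old entry first.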
Globals are extremely versatile and can easily be used to model all 4 types of NoSQL databases, with comparable or, in many cases, significantly better performance:
- key/value storage (Redis, memcached)
- table/column storage (BigTable, Cassandra, SimpleDB)
- document-oriented databases (CouchDB, MongoDB)
- graph databases (Neo4j)
The second part will show detailed examples of how to do this.
Performance like in-memory databases, integrity like on-disk databases
One of the main reasons for the extremely high performance of GlobalsDB is its intelligent caching mechanism, which has been improved and optimized over the years in Caché, from which GlobalsDB inherited its database engine.
As a result, in most cases the elements of globals that need to be accessed are already in RAM.
Most NoSQL databases still have an immature balance between working with data held in memory (for speed) and data placed on disk (for persistent storage and integrity), and they are vulnerable when a node or shard server suddenly shuts down.
By comparison, Caché and, consequently, GlobalsDB behave much more reliably in such situations.
For decades, the Caché core has been developed and operated in demanding areas such as healthcare, banking and finance. In these areas, performance, integrity and non-stop availability are fundamental.
The result of this long-term development, debugging and testing is speed comparable to that of in-memory databases together with the data integrity inherent in disk-based databases.
The goals that defined and drove the development of the database engine inside GlobalsDB almost completely coincide with the goals of the NoSQL movement.
NoSQL database features
In 2010, TechRepublic published the post “10 Things You Should Know about NoSQL Databases”.
It lists 5 advantages and 5 weak points characteristic of NoSQL databases.
Viewed from the standpoint of that post, GlobalsDB has all 5 advantages and, more interestingly, 4 of the 5 drawbacks do not apply to it, which other NoSQL databases cannot boast.
Let's go over these criteria.
5 advantages:
- elastic scaling
- a lot of data
- goodbye, database administrators!
- economics
- flexible data models
5 weak points:
- immaturity of the technology
- support
- analytics and business intelligence
- administration
- number of experts
Consider each advantage in turn:
- Elastic scaling. For many years, Caché has supported scaling across multiple servers, each of which can be an ordinary, inexpensive machine. For Caché, the ECP network technology was created, which lets you work transparently with globals that are physically distributed across multiple servers. GlobalsDB does not currently include ECP, but if elastic scaling is required, programs written for GlobalsDB can be moved to Caché: Caché runs any GlobalsDB application without modification.
- Lots of data. GlobalsDB is designed to work with data volumes far beyond the reach of conventional relational databases, and to do so with extremely high performance.
- Goodbye, database administrators! Interestingly, InterSystems (the vendor of Caché) has used this argument in its marketing for many years. There are systems built on globals that have run without any supervision for decades.
- Economics. GlobalsDB runs well on ordinary, inexpensive hardware and squeezes the maximum performance out of it. In the Massachusetts healthcare system in the 1980s and 1990s, tens of thousands of users worked simultaneously on a network cluster of hundreds of ordinary PCs running Caché's predecessor database under MS-DOS.
- Flexible data models. This is the very essence of storage in globals, as will be shown later.
As for the weak points:
- Immaturity of the technology. Unlike the new NoSQL databases, Caché has a long pedigree and an outstanding track record. Caché runs huge, complex databases in demanding business sectors. GlobalsDB, which has the same database engine, is a fast, extremely reliable and stable technology that can be used with confidence in critical areas of business.
- Support. Caché dominates the US healthcare industry and is heavily used in the financial sector. One of the reasons for this trust is the quality of commercial support from InterSystems. GlobalsDB is provided free of charge and without any support, but if support is required, GlobalsDB applications can be moved without any changes to the fully supported Caché.
- Analytics and business intelligence. Interestingly, InterSystems now widely advertises a product called DeepSee, which was created specifically for analytics and works on top of Caché. In addition, it has long been possible to connect SQL-based business intelligence tools to Caché. So if this functionality is needed, GlobalsDB applications can be moved to Caché without any changes.
- Administration. GlobalsDB is easy to install and maintain and, like Caché, can be used where there are few IT staff, if any.
- The number of experts. This is the only area where GlobalsDB is in a weak position. The number of Caché users is constantly growing. However, the number of professionals experienced with globals-based databases is very small compared to the number of people experienced with relational databases. In fact, one of the goals of GlobalsDB is to expand the community of developers and users who learn about Caché through GlobalsDB.
Taken together, these characteristics make the free GlobalsDB database an ideal candidate for businesses that need NoSQL technology but require a mature, fast and reliable product.
A simple upgrade path to the commercial Caché product gives access to additional features and high-end NoSQL functionality.
Continued in Part 2: Simulation of various types of NoSQL databases.