Translator's note.
There is an interesting technology in the world of DB - MUMPS . This programming language and data access has been known for several decades, debugged and is an adult proven technology.
Let me give you an analogy: if SQL can be compared with BASIC, then MUMPS is more similar to C - it gives the highest performance, flexibility and versatility, allowing you to create the most complex data structures.
')
Here is the translation of the first part of the article “Extreme Database programming with MUMPS Globals”. If it seems interesting to the community, then the second part will be translated.
Chapter 1. Globals: An Alternative to the Relational Approach
The true heart of MUMPS technology is the data storage mechanism. It is based on what is called a global variable, or (more popular name) globals (Globals). Globala is an incredibly simple concept, however, it is also incredibly powerful.
Many people do not accept globals. These are primitive structures. They do not have controls, support systems or advanced functionality provided by “decent” databases. As a result, MUMPS is often written off as useless, insufficient, or, in general, the wrong system. If you do it too, you ignore the simple, convenient and very flexible data storage mechanism. In skillful hands, the lack of excess baggage and undemanding of resources give an incredible feeling of freedom. In unskillful hands, all the same can lead to disaster. This is a bit like an extreme sport, such as free climbing. Few "real" athletes will recommend it - because it is the fastest, most effective and fascinating way to climb that mountain if you can do climbing without insurance. However, companies-manufacturers of mountaineering equipment would not recommend this sport for anything, as this means reducing the demand for their own products!
So if you accept the call and are ready to admit that there are other ways to organize a data warehouse apart from the scowling relational and SQL approach, let's dig into the depths.
This is the first chapter in which the basics of globals are revealed. The second chapter focuses on using them in terms that the RDBMS / SQL programmer understands. If you want, you can go directly to the second chapter and return to this text later to familiarize yourself with the basics.
All MUMPS systems and their modern modifications, for example, Caché, use globals as the basis of the storage mechanism. Many modern MUMPS systems and / or extensions offer a more “familiar” look at key global designs. For example, on globals, you can create views (views) that will work as an object, relational or XML database. Moreover, in theory, the same physical database based on globals can be logically represented and used in one of these forms (and even in all at once). As a result, there are many developers who do not know that the tools are based on globals. They do not know what it is, how it works and how it is used. This text will help you learn this secret and, we hope that you will agree, a very fascinating world.
So what are globals?In short, a global is a constant, sparse, dynamic, multidimensional array containing text values. MUMPS permits the use of both persistent and multidimensional arrays contained only in RAM, known as “local arrays”.
Unlike many systems, MUMPS allows you to use both alphabetic and numeric indices. So in MUMPS you can work with a similar array:
Employee(company,country,office,employeeNumber) = employeeDetails
and a separate entry:
Employee(“MGW”,”UK”,”London”,1) = “Rob Tweed`Director`020 8404 3054”
In this example, the data elements that make up information about the employee (name, title, telephone number) are attached to each other in combination with the reverse apostrophe symbol as a separator. MUMPS does not control or control the way you organize your structures: there is no schema or dictionary of data describing your records. This provides incredible flexibility and speeds up development. You can optionally assign one or more data delimiters that will break the text value of the array record into any number of “fields”. The maximum total length of a single array entry depends on the implementation of MUMPS, however in Caché it is up to 32k. The length of the string is variable, and, as you understand, based on the presence of a delimiter character, the length of the individual fields is also variable. This makes MUMPS globals a very efficient data storage mechanism (as opposed to tables): there is practically no space on the MUMPS files that can store empty, unused spaces.
In the example above, the employee record is contained in what is called a “local array”. If you left the MUMPS session, the array would disappear, just like a PHP array after the page or session disappeared.
And now let's start the fun! To keep the employee's record on disk all the time, that is, in the form of a global, simply add "^" in front of the array name:
^Employee(company,country,office,employeeNumber) = employeeDetails
example:
^Employee(“MGW”,”UK”,”London”,1) = “Rob Tweed`Director`020 8404 3054” !
To create such an element in the global, you need to use the MUMPS "set" command:
set ^Employee(“MGW”,”UK”,”London”,1) = “Rob Tweed`Director`020 8404 3054”
Most of all people in MUMPS are confused and frightened that the main part of commands can be reduced to one letter (lower case or upper case), so often you can see the following instead of the command written above:
s ^Employee(“MGW”,”UK”,”London”,1) = “Rob Tweed`Director`020 8404 3054”
Now, when you exit MUMPS, the recording will not disappear, but will remain on the disk. Whenever you return, you can read a record from the disk through the global indexed:
^Employee(“MGW”,”UK”,”London”,1)
To extract a value from the global and assign it to a variable, usually use the “set” command, for example:
Set companyRecord= ^Employee(“MGW”,”UK”,”London”,1)
The variable companyRecord now contains the string value: “Rob Tweed`Director`020 8404 3054”
Globals can contain as many indices as you want, and these indices can be any combination of text and numbers (real or integer). String indices should be surrounded by double quotes, and numeric - not.
Most MUMPS implementations actually have a limit on the total length of the index, so you won't be able to go completely beyond the boundaries. However, you make sure that there is enough space for any of your requirements.
To remove an entry from the global, you can use the “kill” command:
Kill ^Employee(“MGW”,”UK”,”London”,1)
Actually, this is all about globala. The real difficulty is to make such a primitive data structure work for you. We will describe this throughout the rest of the document. We will try to do this in such a way that the SQL programmer of relational databases can understand the similar techniques and views that the MUMPS programmer would use.
Remember that nothing in MUMPS supports any database design methodology. You can add monitoring and verification tools that will follow the logic, consistency and absence of errors in your database. This means that you will do the work that the usual DBMS would do for you. However, you will quickly realize that you can automate most repetitive tasks and easily manage your MUMPS database.
Creating a simple layered structureYou can use several dimensions in the same global at the same time, for example:
^Employee(“MGW”,”UK”,”London”) = 2 ^Employee(“MGW”,”UK”,”London”,1) = “Rob Tweed`Director`020 8404 3054” ^Employee(“MGW”,”UK”,”London”,2) = “Chris Munt`Director`01737 371457”
Using 3 indices, we indicate the number of employees in a particular office, and four indicate the record of a particular employee.
Links between different globals should be indicated by the programmer. MUMPS, out of the box, does not provide automatic indexing and does not cross-reference.
In order to make a quick sample by phone number, we will create a global ^ EmployeeTelephone, where the indices will be phone numbers and the index values of the global ^ Employee, pointing to the employee's record:
^EmployeeTelephone(“020 8404 3054) = “MGW`UK`London`1” ^EmployeeTelephone(“01737 371457”) = “MGW`UK`London`2”
In this example, the value stored in each telephone number also contains a glue index pointing to the corresponding employee record. If you know the phone number, then all you need is to separate the data using the reverse apostrophe (separator) and extract the employee's master record.
Example:
S telNo=”020 8404 3054” S indexData=^EmployeeTelephone(telNo) S company=$piece(indexData,”`”,1) S country=$piece(indexData,”`”,2) S office=$piece(indexData,”`”,3) S employeeNo=$piece(indexData,”`”,4) S record=^Employee(company,country,office,employeeNo) S employee=$piece(record,”`”,1)
Note the use of the $ piece function of the MUMPS to split values into fields.
One of the great features of MUMPS is that you don’t need to declare anything in advance. In the SQL world, changing the data structure requires changing the structure of the tables (ALTER TABLE, etc.). In MUMPS, you decide when and how to create, edit and delete entries in the global - all this is automatically and dynamically controlled only by you. You can add additional indexes and “fields” (pieces) to your globals at any time without any announcements or the like. If you want to start using another global one, just start it and it will be created dynamically for you.
Creating, reading and deleting entries in the globalIn short, in MUMPS, entries in globals are created and retrieved with the “Set” command, and deleted with the “Kill” command.
- Creating an entry in the global:
Set ^Employee(“MGW”,”UK”,”London”,1)=“Rob Tweed`Director`020 8404 3054”
This code creates an index in the global and saves writing to disk.
- Reading a record from the global
Set data=^Employee(“MGW”,”UK”,”London”,1)
This code retrieves the element from the global and places the value in a local variable called “data”.
- Deleting a record from the global:
kill ^Employee(“MGW”,”UK”,”London”,1)
This code deletes the specified global record from disk immediately and permanently. Be careful with the Kill team — it's incredibly easy and incredibly dangerous to use. If you specify fewer indexes than there are in the global - you will remove all items indexed by a large number of indexes. If you do not specify an index at all, you will delete the entire global, for example:
Kill ^Employee(“MGW”,”UK”)
This line will delete all records of all offices in England.
Kill ^Employee
And this command will delete the entire Global ^ Employee - immediately, forever and irrevocably (if you have not made its backup).
Global traversalOne of the most commonly used functions is the crawling of several or all globals. Suppose you need to take all employee records to display a list from which the user can select one record or count them. To do this, you need to use the MUMPS $ order function. The $ order function is one of the MUMPS pearls that allows you to manipulate any data stored in globals with incredible simplicity.
This functionality will not be intuitive to the programmer of "traditional" databases, but it is worth understanding, since it is both very powerful and simple.
The $ order function works at the same level of indexing within the global and returns the value of the next index in the sample that exists at this level of the global. You can specify an initial value, and the $ order function will find the next value that is in the selected sequence. To find the first index at the specified level, use an empty initial value (“”). To find the first index at the specified level, which starts at “C”, use an initial value that goes in order immediately before “C”, for example, “B ~”.
So, to find the first employee in the global:
S company=”” S company=$order(^Employee(company))
The variable company now contains the value of the first company index in the first record of the global ^ Employee.
When the last value is found, the next time the $ order function is run, it returns an empty value. So, if there was only one company in our global and we repeated $ order:
S company=$order(^Employee(company))
Then the company variable would contain an empty value ("")
To get and process all the companies in the global Employee, you need to create a cycle:
S company=”” For s company=$order(^Employee(company)) quit:company=”” do . ; - company
This code demonstrates the interesting features of the concise MUMPS coding:
- The For command with two subsequent spaces sets an infinite loop.
- Quit: company = ”” indicates a loop exit condition and uses a construct known as “post-condition”. This construct says "if the company value is empty, exit the For loop." Quit should be followed by two spaces, if any other command follows the command.
- A “do” at the end of a line means to execute code that goes to the next level of the “point”. “Do” will be executed on each iteration of the For loop until the value of the company is not empty.
- The code to be executed in each cycle is written after a single point. In principle, any line beginning with a point forms a subroutine that is executed by the “do” command at the end of the second line in the example.
So, we set $ order to null to ensure that the command starts by searching for the first index stored in the global. We go through each value until the stored data runs out and we get an empty value returned by $ order, after which we end the cycle.
You will find that this type of cycle is one of the most common things in the MUMPS program.
We can expand this query and process the entire global at once. To do this, we run a loop at each index level, starting with the first and moving on to each subsequent one:
s company=”” for s company=$order(^Employee(company)) quit:company=”” do . s country=”” . for s country=$order(^Employee(company,country)) quit:country=”” do . . s office=”” . . for s office=$order(^Employee(company,country,office)) quit:office=”” do . . . s employeeNo=”” . . . for s employeeNo=$order(^Employee(company,country,office,employeeNo)) quit:employeeNo=”” do . . . . s record=^Employee(company,country,office,employeeNo) . . . . ; -
Notice how we designated the “point” levels to create a hierarchy of nested subroutines. Also, pay attention to how $ order is used to bypass the values of all 4 global indices.
Note translator : in Caché braces can be used for nested structures.
If we needed to find and process only the records of employees of companies starting with “C”, then we use the following popular construct:
s company=”B~” fs company=$o(^Employee(company)) quit:$e(company,1) '= ”C” do . ; -
Notice the $ extract (or $ e) function. It retrieves the character from the specified position of the string value: in this case, the first letter of the company name. Also note the method of specifying "not equal to C": it uses the MUMPS NOT operator, which is written with a single quote (').
This cycle can be read as follows:
- The initial value of $ order is the value before the “C” in the sequential sample
- Run an infinite loop to search all the records of companies in the sample.
- If the first letter of the company name is not “C”, leave the loop
- Otherwise, continue processing company records.
This ability to start and stop global crawling from any particular index and at any level of index nesting is a unique function of MUMPS globals.
Check the existence of a record in the globalYou often need to know if a particular entry exists in the global. For this you can use the MUMPS $ data function:
if $data(^Employee(company)) do xxx
The line reads “if ^ Employee (company) exists, then execute subroutine xxx”. The $ data function can be abbreviated to $ d.
$ data returns several different values.
- If data exists at the specified index level and there are no subnodes, the value 1 will return.
- If data exists at the specified index level and there are subnodes, the value 11 will return
- If data does not exist at the specified index level, but there are sub-nodes, the value 10 will return
- If the data does not exist at the specified index level, the value 0 is returned.
In MUMPS, any non-zero value when using the logical if operator is evaluated as true. Thus, the first three values returned by $ data (1, 10, and 11) will be considered true. The latter situation (no data and subnodes) is evaluated as false.
As an example, consider the following global:
^test=3 ^test(“a”)=1 ^test(“a”,”b”,”c”)=2 ^test(“a”,”b”,”d”)=3 $data(^test) = 11 $data(^test(“a”,”b”)=10 $data(^test(“a”,”b”,”c”)=1 $data(^test(“b”)=0
Preventing "data undefined" errorsIf you try to extract a non-existing record from the global, MUMPS will generate a run-time error, for example, <UNDEF>. To avoid this, you can either use the $ data function to check the existence of a node, or use the more convenient $ get function. This will return the value of the global node if it exists, or an empty value if not. The $ get function can be reduced to $ g
So, based on the example we used in the previous section with $ data:
$get(^test) = 3 $get(^test(“a”,”b”)=”” ; $get(^test(“a”,”b”,”c”)=1 $get(^test(“b”)=”” $get(^nonExistentGlobal)=””
View globalsCaché and all other MUMPS systems include ways to view globals. The easiest way is to use the ^% G command line command, which you will find in Caché and some other MUMPS implementations. Run the following in the terminal session:
USER> D ^%G Global ^ Employee
Introducing the global name will give the command to display the contents of the entire global. However, you can output separate indexes, for example:
Employee()
List all values of the first index only.
Employee(“MGW”
List all employee records with the first value of the MGW index
Employee(“MGW”,)
List all second employee record indexes with the first “MGW” index
Caché provides a browser application with a graphical interface called Caché Explorer that allows you to view and edit globals.
SummaryWe looked at the basics of globals, as well as their creation and editing. The next chapter will look at globals from the point of view of a person familiar with relational databases.
Probably, you have already understood that MUMPS globals practically do not control and do not limit your actions. This is both wonderful (you can quickly and flexibly design, create, and change the structure of your databases) and dangerous (in unskilled hands, the database turns into an uncontrollable mess).
MUMPS relies on you programmers. There is no insurance, but there are no limits to what can be achieved or how you will achieve it. You will understand MUMPS - a great and exciting working environment, thanks to the effectiveness of programming and command execution.
When you learn MUMPS globals, you will probably be surprised: “Why all the databases cannot work as well!” This is very simple, intuitive, flexible and exceeds any relational database in performance. In addition, MUMPS is available for almost any platform and scales to the size of huge systems - some of the largest interactive systems in the world are based on MUMPS, sometimes with tens of thousands of concurrent users.
However, if you think that you need the tools and insurance that, in the opinion of the relational world, are necessary, MUMPS is definitely not for you. And if you still intend to do free climbing, go to the next chapter.
Continued. Part 2.