Not long ago, the InterSystems blog on Habr published two articles about globals in Caché, covering what they are and how they are used (part 1 and part 2). All of that is interesting, of course, as is the convenience of working with whatever data model the developer chooses. But what makes access to these globals so fast?

Theory
A Caché database is a directory named after the database that contains the file CACHE.DAT. On *nix systems, a disk partition can also act as a database.
Data in Caché is stored in blocks, which in turn are organized as a balanced B*-tree. If we recall that globals are, in simplified terms, stored as a tree, then the subscripts of a global are the branches and the values are the leaves. A balanced B*-tree differs from an ordinary B-tree in that its branches also have right links, which (in our case, with globals) let the $Order and $Query functions iterate over subscripts quickly without going back up to the trunk of the tree.
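The effect of right links can be illustrated with a small sketch. It is a toy model, not Caché's on-disk format: `LeafBlock`, `link_leaves`, `next_key` and `walk` are hypothetical names, and plain Python ordering stands in for the collation of a global.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class LeafBlock:
    """Toy stand-in for a data block: sorted subscripts plus a right link."""
    keys: list
    right: Optional["LeafBlock"] = None  # None plays the role of the zero link


def link_leaves(chunks: list) -> LeafBlock:
    """Chain leaf blocks left to right and return the leftmost one."""
    blocks = [LeafBlock(sorted(chunk)) for chunk in chunks]
    for left, right in zip(blocks, blocks[1:]):
        left.right = right
    return blocks[0]


def next_key(block: Optional[LeafBlock], key) -> Optional[tuple]:
    """Return (block, next subscript) after `key`, using right links only."""
    while block is not None:
        i = bisect_right(block.keys, key)
        if i < len(block.keys):
            return block, block.keys[i]
        block = block.right  # step sideways instead of climbing to the root
    return None


def walk(first: LeafBlock) -> Iterator:
    """Yield every subscript in order, the way a $Order loop would visit them."""
    block = first
    while block is not None:
        yield from block.keys
        block = block.right
```

When the current block is exhausted, the next subscript is found by one sideways hop to the neighbouring block rather than a fresh descent from the top of the tree.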
The block size in the database file is fixed. By default it is 8192 bytes, but you can also enable the creation of databases with block sizes of 16 KB, 32 KB and 64 KB. The system developer can choose the block size according to the nature of the data they plan to store. It must always be kept in mind, however, that data is read block by block: even if a single 1-byte value is requested, several blocks will be read, and only the last block in the chain will contain the data. Caché also keeps separate global buffers for different block sizes. You cannot mount or create a database unless a global buffer is configured for its block size; the attempt ends in an error.
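The cost of block-by-block reading is simple arithmetic, sketched below. The function names are hypothetical, the chain length is an input rather than something Caché reports, and a cold cache is assumed so that every block in the chain comes from disk.

```python
SUPPORTED_BLOCK_SIZES = (8192, 16384, 32768, 65536)  # bytes, as listed in the text


def bytes_read(block_size: int, chain_length: int) -> int:
    """Bytes fetched when one lookup walks `chain_length` blocks (cold cache assumed)."""
    if block_size not in SUPPORTED_BLOCK_SIZES:
        raise ValueError(f"unsupported block size: {block_size}")
    if chain_length < 1:
        raise ValueError("a lookup reads at least one block")
    return block_size * chain_length


def can_mount(db_block_size: int, configured_buffer_sizes: set) -> bool:
    """A database is usable only if a global buffer exists for its block size."""
    return db_block_size in configured_buffer_sizes
```

For example, with 8 KB blocks and an assumed chain of three blocks, `bytes_read(8192, 3)` gives 24576 bytes read to return a single byte, which is one reason larger blocks are not automatically better.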


In the picture, memory is allocated only for the global buffer for databases with 8 KB blocks, so only databases with 8 KB blocks will work on this system. Blocks in the database are grouped into maps. With 8 KB blocks, one map describes 62464 blocks and is stored in the map block, which is the first block of the map.
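The map figure translates into sizes as follows. This is a minimal sketch using only the two numbers from the text; the helper names are hypothetical, and whether the map block itself counts among the 62464 blocks is not addressed here.

```python
import math

BLOCK_SIZE = 8192        # bytes, the default block size
BLOCKS_PER_MAP = 62464   # blocks described by one map when blocks are 8 KB


def map_span_bytes() -> int:
    """Amount of database space described by a single map."""
    return BLOCK_SIZE * BLOCKS_PER_MAP


def maps_needed(total_blocks: int) -> int:
    """Number of maps required to describe `total_blocks` blocks."""
    return math.ceil(total_blocks / BLOCKS_PER_MAP)
```

One map therefore covers 511705088 bytes, which is exactly 488 MiB, and a database one block larger than that needs a second map.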
Types of blocks
There are several types of blocks. At each level, a block's right link must point to a block of the same type or be zero, which means that there is no further data.
- type 9: global directory block. Describes all existing globals and their parameters. For each global, this block defines the collation of its subscripts, a very important parameter that determines how the data is sorted. It cannot be changed after the global has been created;
- type 66: top-level pointer block. Only a global directory block can point to it from above;
- type 6: bottom-level pointer block. Above it there must be upper-level pointer blocks, and below it only data blocks;
- type 70: top-level and bottom-level pointer block at once. Used when the global holds few values and there is no need for several levels of blocks. It points to data blocks, and global directory blocks point to it;
- type 2: pointer block, used for storing fairly large globals. More levels of pointer blocks may be needed to distribute values evenly across the data blocks. Such a block sits between other pointer blocks;
- type 8: data block. It stores the values of several global nodes at once, not just one;
- type 24: big string block. If the value of a single node does not fit in one block, it is placed in blocks of this type, and the node in the data block stores references to the list of big string blocks together with the total length of the value;
- type 16: map block. Stores information about which blocks are currently free.
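The type codes and the parent/child rules from the list can be collected into a small lookup table. The sketch below is one reading of that list, not a specification: the member names are hypothetical labels, and the `ALLOWED_LOWER` table (in particular, letting a type 2 block point to another type 2 block) is an assumption drawn from the descriptions.

```python
from enum import IntEnum


class BlockType(IntEnum):
    POINTER = 2              # intermediate pointer level of a large global
    BOTTOM_POINTER = 6       # points only to data blocks
    DATA = 8                 # values of several nodes
    GLOBAL_DIRECTORY = 9     # list of globals and their collations
    MAP = 16                 # free/used block bookkeeping
    BIG_STRING = 24          # values too long for a data block
    TOP_POINTER = 66         # pointed to only by the global directory
    TOP_BOTTOM_POINTER = 70  # single pointer level of a small global


# Assumed reading of the list: which block types a lower link may point to.
ALLOWED_LOWER = {
    BlockType.GLOBAL_DIRECTORY: {BlockType.TOP_POINTER, BlockType.TOP_BOTTOM_POINTER},
    BlockType.TOP_POINTER: {BlockType.POINTER, BlockType.BOTTOM_POINTER},
    BlockType.POINTER: {BlockType.POINTER, BlockType.BOTTOM_POINTER},
    BlockType.BOTTOM_POINTER: {BlockType.DATA},
    BlockType.TOP_BOTTOM_POINTER: {BlockType.DATA},
    BlockType.DATA: {BlockType.BIG_STRING},
}


def lower_link_ok(parent_type: int, child_type: int) -> bool:
    """True if a block of `parent_type` may point down to one of `child_type`."""
    try:
        parent, child = BlockType(parent_type), BlockType(child_type)
    except ValueError:  # unknown type code
        return False
    return child in ALLOWED_LOWER.get(parent, set())
```

A small global thus takes the path 9 → 70 → 8, while a large one goes 9 → 66 → (2 …) → 6 → 8.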
So, the first block of a Caché database holds service information about the database file, and the second holds a block map. The first global directory block comes third (block number 3), and a database can have several directory blocks. After that come pointer blocks (branches), data blocks (leaves) and big string blocks. As I wrote above, the global directory block(s) store information about all globals that exist in the database. They can also store the settings of a global that holds no data; in that case the node describing the global has a null lower link. You can view the list of globals from the global directory in the Management Portal. You can also choose to keep a global in the directory after it is deleted, for example to preserve its collation.

In the same place you can create a new global. When doing so, you can immediately pick any available collation, including one that differs from the database default.

In general, the tree of blocks can be represented as in the picture below. Blocks are marked in red.
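The descent through such a tree can also be sketched in code. Everything here is a hypothetical toy: the block numbers other than 3, the global name `^Demo`, the in-memory layout, and the use of plain Python string comparison in place of a real collation.

```python
from bisect import bisect_right

# Toy database: block number -> (type, sorted list of (key, payload)).
# In directory and pointer blocks the payload is a lower block number;
# in data blocks it is the stored value.
BLOCKS = {
    3:  (9,  [("^Demo", 44)]),                   # global directory block
    44: (70, [("", 45), ("m", 46)]),             # top/bottom pointer block
    45: (8,  [("a", "apple"), ("k", "kiwi")]),   # data block
    46: (8,  [("m", "mango"), ("z", "zest")]),   # data block
}


def lookup(global_name: str, key: str):
    """Return (value, path), where path lists every block read on the way down."""
    path = [3]
    _, directory_nodes = BLOCKS[3]
    block = dict(directory_nodes).get(global_name)
    if block is None:
        return None, path
    while True:
        path.append(block)
        block_type, nodes = BLOCKS[block]
        if block_type == 8:  # data block: the chain ends here
            return dict(nodes).get(key), path
        # Pointer block: follow the last node whose key is <= the search key.
        keys = [k for k, _ in nodes]
        i = max(bisect_right(keys, key) - 1, 0)
        block = nodes[i][1]
```

Calling `lookup("^Demo", "k")` reads blocks 3, 44 and 45 before the value is found, which is the chain of several blocks mentioned earlier.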

Database integrity
At the current stage of development of the Caché DBMS, the situations and errors that could lead to database degradation have been minimized, and databases need repair less and less often. Even so, it is recommended to check integrity regularly and automatically. The ^Integrity utility exists for this purpose. It can be launched from the terminal in the %SYS namespace, from the Management Portal on the Databases page, or through the Task Manager. Incidentally, an automatic integrity check task is already configured by default but is disabled, so you only need to activate it:


During the integrity check, the lower links, the block types and the right links are verified, and the keys of each global are checked against its collation order. If the check finds errors, you can use the ^REPAIR utility, which is run from the %SYS namespace. This utility lets you view any block and, if necessary, edit it, that is, repair the database.
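The kinds of checks just listed can be sketched on a toy model. This is not how ^Integrity is implemented; `check_integrity`, the dictionary layout and the block numbers are hypothetical, and Python ordering again stands in for a real collation.

```python
def check_integrity(blocks: dict) -> list:
    """Check a toy block table and return a list of error messages.

    `blocks` maps a block number to a dict with the keys
    "type" (int), "right" (int, 0 means no link),
    "keys" (list) and, for pointer blocks, "lower" (list of block numbers).
    """
    errors = []
    for num, blk in sorted(blocks.items()):
        # Right link: must be zero or point to a block of the same type.
        right = blk.get("right", 0)
        if right != 0:
            if right not in blocks:
                errors.append(f"block {num}: right link {right} points to a missing block")
            elif blocks[right]["type"] != blk["type"]:
                errors.append(f"block {num}: right link {right} has a different block type")
        # Lower links: every referenced block must exist.
        for low in blk.get("lower", []):
            if low not in blocks:
                errors.append(f"block {num}: lower link {low} points to a missing block")
        # Keys must be strictly ascending in the (stand-in) collation order.
        keys = blk.get("keys", [])
        if any(a >= b for a, b in zip(keys, keys[1:])):
            errors.append(f"block {num}: keys are out of collation order")
    return errors
```

An empty list means the toy structure is consistent. Each message names the block to inspect, much as a real report would point you at the block to open in ^REPAIR.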
Practice
But all of this is theory. In practice it is quite hard to tell what a global and its blocks actually look like. The only available way to view blocks is the ^REPAIR utility mentioned above. Its output looks like this:

I recently started working on a project that lets you walk through the block tree without the risk of damaging the database and view it conveniently in a browser, with the option of saving the visualization as SVG or PNG. The project is called CacheBlocksExplorer, and its source code is available on GitHub.

Features implemented so far:
- Viewing any database that is configured or simply mounted in the system;
- Viewing information about a block: its type, right pointer, and list of nodes with their links;
- Clicking a node that has a link to a lower block loads and displays that block, together with the link to it;
- Some blocks can be removed from the display simply by removing the link; this does not change the data.
What else needs to be done:
- Display of the right link: at the moment it is shown as part of the block information, and I would like to draw it as an arrow;
- Support for displaying big string blocks: at the moment they are simply not shown;
- Display of all global directory blocks, not just block 3.
I also wanted to display the whole tree at once, but I have not yet found a library that can quickly render several hundred thousand blocks together with their links. With the current library it is very slow: rendering in the browser took longer than reading the structure from Caché.
In the next article I will explain in more detail, with examples, how everything works and what you can learn about your globals and blocks with a tool like CacheBlocksExplorer.