
Hello. In this article, I would like to write about one very interesting way of storing hierarchical structures in databases that is not related to the generally accepted and well-known three (Adjacency List, Nested Set, Materialized Path). I did not find any mention of it on the Internet, which I was very surprised about, because, in my opinion, this is the best and only way to store hierarchical structures. When developing the
console-like forum, I used this way, I don’t regret a gram. This is an author's article and not a single sentence was inserted by the copy-paste method.
It is possible that this method is known and it is called otherwise. If so, I am pleased to find out what it really is called.
IMHO, this method is most similar to the Materialized Path, with a slight shade of the Adjacency List. Compared to the Adjacency List, it slightly denormalizes the database, but with the advantages it provides, it does not play a significant role.
')
The essence of the method
For each hierarchical structure we create two tables - a table with data and a table with hierarchy. In the data table, we store what we need, the only cell that interests us here is PRIMARY (`ID`)
In the hierarchy table, we store a list of all items and their parents with a level for each parent. That is, if the element has 5 parents - we store 5 format records (ElemID, ParentID, Level). Thus, to describe the deepest element, you need a number of elements equal to the level of nesting.
You can be horrified: "Oh God, this is so a base swell!". But this is not so critical - there are only three int fields in the hierarchy table, compared to a dozen or so text fields in the data table. That is, despite the number of rows, the table with the hierarchy will be light enough.
Examples
So, what tools for the example I use:
- php 5.3. It is important that the objects are passed by reference, because I recommend not using version 4, or significantly reforming the code.
- mysql. The basic features of MySQL are used, because the version is not so important. The only thing is that if you choose InnoDB, I think it will be better
Initially, I will post the code, which will be explained later in the article:
I use a handwritten class for working with the database (LGPL):
pastebin.com/f2642074f . It is in it that we change the connection parameters.
Now, regarding the code itself:
We determine which item we will store in the database. This is an object of class Elem with properties $ id, $ data, $ children, $ parent:
pastebin.com/f78f943d8Ready functions with which we work with base, and further they will be described:
pastebin.com/f314afb10We will also create two tables in our database:
CREATE TABLE `Elements` (
`ID` int (11) NOT NULL AUTO_INCREMENT,
` Data ` varchar (64) NOT NULL ,
PRIMARY KEY (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `ElementsHierarchy` (
`ElemID` int (11) NOT NULL ,
`ParentID` int (11) DEFAULT NULL ,
` Level ` int (11) NOT NULL ,
KEY `ElemID` (`ElemID`),
KEY `ParentID` (`ParentID`),
KEY ` Level ` (` Level `)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
* This source code was highlighted with Source Code Highlighter .
Explanation of examples

Immediately after connecting the base files, in order to get the test data, we will call the function generateData, where 4 is the number of elements per level, 3 is the depth of nesting.
$tree = generateData(4, 3, null);
On the right you can see one of the 4 elements that we now have in the $ tree array
If we make print_r ($ tree), then we can see that we have an array of four trees in (1 + 4 + 4 * 4 + 4 * 4 * 4) = 85 elements, that is, 340 elements together. As $ elem-> data will be its level. Something like this:
#.1
#.1.1
#.1.1.1
#.1.1.2
#.1.2
#.2
#.2.1
#.2.2
Add data to the table
In order to insert them, I wrote the function insertTreeRecursive. For each element we need two or three requests. One is for inserting data into the `Elements` table, one is for taking data from the` ElementsHierarchy` table about parents, and one is for inserting data about parents into the `ElementsHierarchy` table. Now more on the example function
I think up to this point nothing heavy:
$elem->setId(Db::execute($query));
We insert a row into the `Elements` table, get last_insert_id and assign it to the current element. Suppose the ID of the inserted record is 37
If this element does not have parents, then a query is sent to the database a la:
INSERT INTO `ElementsHierarchy`(`ElemID`,`ParentID`,`Level`) VALUES ('ID', NULL, 1)
That is, it is clear that the element at the first level is from the missing element (NULL), that is, it has no parents.
But if the element has a parent, then we get a list of all parents of the parent (for example, the ID of the nearest parent is 15). It turns out something like this:
| ElemID | ParentID | Level |
| 15 | NULL | 4 |
| 15 | 1 | 3 |
| 15 | 5 | 2 |
| 15 | 10 | 1 |
At each line, we replace the ElemID with the ID of the current element, increment the Level by one and append the ID = 15 and Level = 1 to the end of the array.
We have the following:
| ElemID | ParentID | Level |
| 37 | NULL | 5 |
| 37 | 1 | 4 |
| 37 | 5 | 3 |
| 37 | 10 | 2 |
| 37 | 15 | 1 |
This is the structure that we add to the end of the table `ElementsHierarchy`. For our $ tree, the size of 430 elements, it turned out 1252 rows in the hierarchy table. The lower the average level of nesting - the smaller the table and vice versa. Please note that, despite the depth of nesting, 3 simple queries are enough to insert one element.
Get data from the table
We use the getRowsByParent function to get data from the table:
$tree = getRowsByParent(2);
print_r($tree);
We see that we have derived element # 1.1 and the whole tree in the subset # 1.1. *
But `ParentID` we have not the nearest parent, and the parent for which we carried out the search. Therefore, we will slightly improve our query and also make a cycle that will format the results of the query into an object. Call
$tree = getTreeByParent(2);
print_r($tree);
and we get exactly what we need! One SELECT'om!
Good, but let's look further. But how to produce UPDATE tables? Let's add another field to the `Elements` table as a test:
ALTER TABLE `Elements` ADD `Marked` BOOL NOT NULL
And mark all the cells that are in the depth of tree 2:
getTreeByParent (2); We look at the base and see that everything turned out. Removal is the same.
Comparison of this method and Materialized Path
At the suggestion of the community, a test was conducted to compare the sampling rate from the base of this method and the Materialized Path method. And I have very impressive results. I generated ten 11110 trees with 11111 elements, 5 * 10 in size, that is 111110 elements in the database and made tests on them: I generated 20 random IDs and searched for the complete tree under them in the database. As a result, in all cases,
Full Hierarchy shows itself 5-8 times better than the Materialized Path (you can check it yourself, all the source codes of the tests are laid out):
pastebin.com/f65b2fb7d - functions that were used during testing.
pastebin.com/f793f7c74 - self testing and its result.
Structure of the `ElementsMP` base:
CREATE TABLE `ElementsMP` (
`ID` int (11) NOT NULL AUTO_INCREMENT,
` Data ` varchar (64) NOT NULL ,
` Path ` varchar (1000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL ,
PRIMARY KEY (`ID`),
KEY ` Path ` (` Path `)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
* This source code was highlighted with Source Code Highlighter .
Moreover, the base does not allow the `Path` field to be longer than 1000 characters (# 1071 - Specified key was too long; max key length is 1000 bytes), which means that if we have an average ID length of 4 characters we will not be able to trees deeper than 1000 / (4 + 1), that is, the deepest possible tree in this case is 200 elements. and 166 with an average key length of 5 digits (if the site has more than 50,000 comments on average)
Total
Perhaps the code above has some flaws, but I am sure that all flaws will be easily corrected with time and experience. If many people like the way, then I think it will be possible to get together and write a library for more convenient work with it.
Acknowledgments
Thank you DkZ from FreeCR konfa for drawing the logo for the theme.
Thanks to Kostyan for pushing me on this idea.