Definitions
In this article I describe ways to create and use rubricators that are based on a graph structure.
Rubricator, categorizer, category catalog, subject index, index: for convenience, we will assume that all these terms describe roughly the same thing. Where the differences are significant, we will point them out explicitly.
An information element is most often a file, but in general it is any piece of information that is treated as a single whole.
Introduction
Rubricators are used to solve a variety of tasks:
- To speed up the search and facilitate navigation through large amounts of information.
- To tag information so that selections can be made for specific categories.
- To sort information by:
  - field of knowledge (physics, mathematics, biology)
  - manner of use (books to read, music to listen to, movies to watch)
  - ownership (my folders and shared documents)
  - importance (inbox and spam folders), etc.
As a rule, classical rubricators are based on a tree structure. The main advantages of organizing a rubricator this way are simplicity and familiarity. Every book has a table of contents, which is an example of a classical tree structure: the book has sections, which in turn are divided into parts, chapters, and so on. The depth of the rubricator reflects the complexity of the book's structure well. But a book, in the classical sense, is a stream of information with the “forward only” property. That is, the table of contents makes it easy to find a specific place in the book, after which we open the book and read it sequentially, page by page.
Difficulties begin when the book is a reference book and the rubricator is used in an attempt to organize selective access to its content, along the lines of the query “SELECT * FROM BOOK WHERE TOPIC = 'Something Interesting'”.

Fig. 1. Subject index
The result of such attempts is the subject index. This is a very convenient type of index: with it we can easily find the sections of the book that cover a topic of interest. But here lies one of the inconveniences of this type of rubricator: it is impossible to immediately group the results scattered throughout the book.
Example: "Imitation of the material surface" is found on 4 pages. These pages are not consecutive, so it can be assumed that they belong to different rubrics. But to find the name of the relevant rubric you have to do separate work: open the book at the desired page and read the rubric name in the running footer, if there is one.
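The extra lookup described above can be sketched in a few lines of Python. This is only an illustration: the page numbers and rubric names are invented, and `page_rubric` stands in for what the running footer of each page would tell us.

```python
from collections import defaultdict

# Hypothetical subject index: topic -> pages where it occurs.
subject_index = {
    "Imitation of the material surface": [12, 47, 103, 150],
}

# Hypothetical footer data: page -> rubric printed on that page.
page_rubric = {12: "Materials", 47: "Lighting", 103: "Rendering", 150: "Rendering"}


def group_by_rubric(topic):
    """Group the pages listed for a topic by the rubric each page belongs to."""
    groups = defaultdict(list)
    for page in subject_index[topic]:
        # The subject index alone does not know the rubric,
        # so every page needs a separate lookup.
        groups[page_rubric.get(page, "(no footer)")].append(page)
    return dict(groups)


print(group_by_rubric("Imitation of the material surface"))
```

The subject index by itself only gives the flat list of pages; the grouping appears only after the second table is consulted, which is exactly the manual work the reader has to do with a printed book.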
Building a rubricator in the form of a graph (not a tree)
Let's start with a small theoretical digression: “A tree is a connected graph that does not contain cycles.”
It follows from this definition that a classical rubricator built as a tree is a “truncated” version of a full-fledged rubricator based on an undirected graph.
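The definition can be checked mechanically. Below is a minimal sketch in Python, assuming the rubricator is given as a set of vertex names and a list of undirected edges; the function name `is_tree` and the sample rubrics are chosen only for illustration. It relies on a standard fact: a connected graph with n vertices has no cycles exactly when it has n - 1 edges.

```python
def is_tree(vertices, edges):
    """Return True if the undirected graph is connected and contains no cycles."""
    vertices = set(vertices)
    if not vertices:
        return False
    # A connected graph on n vertices is acyclic exactly when it has n - 1 edges.
    if len(edges) != len(vertices) - 1:
        return False
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first traversal to verify that every vertex is reachable.
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == vertices


rubrics = {"All", "Glue", "Parquet", "Parquet glue"}
tree_edges = [("All", "Glue"), ("All", "Parquet"), ("Glue", "Parquet glue")]
graph_edges = tree_edges + [("Parquet", "Parquet glue")]

print(is_tree(rubrics, tree_edges))   # True
print(is_tree(rubrics, graph_edges))  # False: the extra edge closes a cycle
```

Adding a single edge is enough to leave the class of trees, which is why a tree rubricator can be seen as a restricted special case.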
An example of building a rubricator in the form of a graph
To build an example of a graph-based rubricator, let's take an area familiar to many new homeowners: construction and renovation.
Root node
Although a graph has no distinguished “root”, to create a graph-based rubricator we will designate one of the nodes as the root. In the example, this is the “All” node.
“All” is a vertex of the graph with a special purpose: it is the root node of the rubricator tree. (Since any tree can be represented as a graph, this special-purpose interpretation is quite acceptable.)

The need for a root node in the rubricator comes from the "habit" of having one. This node makes the rubricator more convenient for a person to use: any conversation about the rubricator, and any description of its structure, always begins by singling out the main sections. The presence of this node also makes it possible to implement such a convenient feature as "breadcrumbs".
Connections
Connections are the most valuable part of a rubricator built on the graph principle. In contrast to a classical tree-type rubricator, a graph makes it easy to specify connections that are needed for a complete description of the subject area but cannot be expressed within a tree structure.
Let us look in more detail at how connections are organized in a graph rubricator, using an example.

Fig. 2. Cycles in a rubricator
Fig. 2 (above) shows a subset of the rubricator taken from the stroika.ru building portal.
The rubric “Parquet glue” is highlighted in the example. If you trace the paths that lead to this section, you will notice that the “Parquet glue” node is reachable from the “All” node through two different branches of the rubricator. The “Parquet glue” node belongs equally to the “Glue” section and to the “Parquet” section. Moreover, specifying such a relationship is natural for a graph-based rubricator.
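To make the two routes concrete, here is a small Python sketch that lists every path from the root to a given rubric. The `children` mapping is a hypothetical fragment written for illustration, not the actual stroika.ru data, and `paths_from_root` is a name invented here. Each path found can also serve as one possible breadcrumb trail.

```python
# Hypothetical fragment of the rubricator: rubric -> list of subrubrics.
children = {
    "All": ["Glue", "Parquet"],
    "Glue": ["Parquet glue", "Wallpaper glue"],
    "Parquet": ["Parquet glue"],
}


def paths_from_root(target, root="All"):
    """Return every path from the root to the target rubric."""
    found = []

    def walk(node, trail):
        trail = trail + [node]
        if node == target:
            found.append(trail)
            return
        for child in children.get(node, []):
            if child not in trail:  # do not walk around a cycle twice
                walk(child, trail)

    walk(root, [])
    return found


for path in paths_from_root("Parquet glue"):
    print(" > ".join(path))
# All > Glue > Parquet glue
# All > Parquet > Parquet glue
```

In a tree this function would always return exactly one path; in a graph the user interface has to decide which trail to show, for example the one the visitor actually followed.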
If desired, this scheme can be extended by assigning a priority (weight) to each arc of the graph. It then becomes possible to state that “Parquet glue” is more glue than parquet. For example:

Fig. 3. Priority links
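A weighted variant might be stored as follows. This is a sketch under assumptions: the weights 0.7 and 0.3 are invented for illustration and are not taken from the figure, and the helper names are hypothetical.

```python
# Hypothetical arc weights: (parent rubric, child rubric) -> priority.
weights = {
    ("Glue", "Parquet glue"): 0.7,
    ("Parquet", "Parquet glue"): 0.3,
}


def parents_by_priority(rubric):
    """Return (parent, weight) pairs for a rubric, highest priority first."""
    links = [(parent, w) for (parent, child), w in weights.items() if child == rubric]
    return sorted(links, key=lambda link: link[1], reverse=True)


def primary_parent(rubric):
    """Return the parent with the highest weight, or None if there is no parent."""
    ranked = parents_by_priority(rubric)
    return ranked[0][0] if ranked else None


print(parents_by_priority("Parquet glue"))  # [('Glue', 0.7), ('Parquet', 0.3)]
print(primary_parent("Parquet glue"))       # Glue
```

With weights in place, the primary parent can be used wherever a single answer is needed, such as a default breadcrumb trail, while the remaining connections stay available for navigation.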
The ability to create cycles in the rubricator is very important when working with categories that:
- cannot be attributed with 100% certainty to any one main rubric.
- have a special meaning only when located in a border area. Parquet glue is just such a case: without parquet, this type of glue has no value. The value of parquet glue lies precisely in its applicability to parquet.
- are orthogonal to the existing rubricator structure, for example the division into sale of goods and rental services. A truck crane can be both sold and rented.
- fall under several classes at once. For example, a specific computer virus can be assigned to Email-Worm, to P2P-Worm and to Trojan Mailfinder if it is a worm that spreads via email and also collects email addresses.
Here are some examples where multiple connections greatly simplify rubrication:
- Parquet glue (both glue and for parquet)
- Macro virus blocker (both a macro virus and a blocker)
- Truck crane rental (both vehicle rental and cranes)
- Charity concert (both concerts and charity)
- Light green metallic color (both shades of green and metallic)
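Examples like these can be stored by letting every rubric list more than one parent. The sketch below is an assumption about one possible representation; the intermediate rubric names and the `ancestors` helper are made up for illustration.

```python
# Hypothetical rubricator stored as rubric -> list of parent rubrics.
parents = {
    "Parquet glue": ["Glue", "Parquet"],
    "Truck crane rental": ["Vehicle rental", "Cranes"],
    "Charity concert": ["Concerts", "Charity"],
    "Glue": ["All"],
    "Parquet": ["All"],
    "Vehicle rental": ["All"],
    "Cranes": ["All"],
    "Concerts": ["All"],
    "Charity": ["All"],
}


def ancestors(rubric):
    """Return every rubric that the given rubric belongs to, directly or indirectly."""
    result = set()
    stack = list(parents.get(rubric, []))
    while stack:
        parent = stack.pop()
        if parent not in result:
            result.add(parent)
            stack.extend(parents.get(parent, []))
    return result


print(sorted(ancestors("Charity concert")))  # ['All', 'Charity', 'Concerts']
```

A visitor browsing "Charity" and a visitor browsing "Concerts" will both reach the same rubric, with no need to duplicate it in two branches.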
Vertices of the graph. Intermediate rubrics
The graph rubricator consists of the root node “All”, edges indicating the subordination of one rubric to another, vertices (intermediate rubrics), and leaves (final rubrics).
To create a rigorously defined rubricator, we must answer the question of what the vertices of the graph physically mean, that is, how the assignment of a piece of information to an intermediate vertex should be interpreted. In some cases it may be easier to give up using the intermediate vertices (not the leaves!) as rubrics at all than to define what it means to assign information to such a vertex.
Let us consider this question in more detail using an example:

Fig. 4. Assigning information to a leaf and to an intermediate vertex of the graph
The information element "letter" is assigned to the rubric "Parquet glue". This is an assignment of information to a leaf of the graph, and it is an unambiguous correspondence that tells us directly that the letter is about parquet glue. Direct, unambiguous correspondence to a rubric is the simplest and most common case.
A more complex case: the element “Information article” is assigned to the “Glue” rubric. Here different interpretations are possible.
For example, assignment to the “Glue” rubric may mean that the article is purely informational and describes glue as a substance in general, perhaps without even mentioning such specifics as "Wallpaper glue", "Rubber glue" and "Parquet glue".
Another possibility is that the article describes not only “Parquet glue” but also “Rubber glue”. In this case, assigning the element “Information article” to any single rubric (leaf of the graph) would not be entirely correct.
So, when using any tree rubricator, we must decide whether the vertices of the graph (intermediate rubrics) will be used to mark information, or only to make the rubricator easier to compile and navigate, independently of the information elements being categorized.
If the decision is made to use not only the leaves but also the intermediate vertices of the graph as rubrics, it is worth thinking about multiple rubrication of information.
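The choice discussed above can be illustrated with a short Python sketch. Everything in it is hypothetical: the rubric fragment, the three information elements, and the `include_subrubrics` switch, which models the decision whether an intermediate rubric also "contains" what is assigned to its subrubrics. Multiple rubrication is modelled simply by giving an element a set of rubrics.

```python
# Hypothetical fragment of the rubricator: rubric -> list of subrubrics.
children = {
    "All": ["Glue", "Parquet"],
    "Glue": ["Parquet glue", "Rubber glue", "Wallpaper glue"],
    "Parquet": ["Parquet glue"],
}

# Hypothetical information elements and the rubrics they are assigned to.
assignments = {
    "Letter": {"Parquet glue"},                               # one leaf
    "Information article": {"Parquet glue", "Rubber glue"},   # multiple rubrication
    "Glue overview": {"Glue"},                                # intermediate rubric
}


def descendants(rubric):
    """Return all subrubrics below the given rubric."""
    seen = set()
    stack = [rubric]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def items_in(rubric, include_subrubrics=True):
    """Return the elements assigned to a rubric, optionally including its subrubrics."""
    wanted = {rubric}
    if include_subrubrics:
        wanted |= descendants(rubric)
    return sorted(item for item, rubrics in assignments.items() if rubrics & wanted)


print(items_in("Glue"))                            # all three elements
print(items_in("Glue", include_subrubrics=False))  # ['Glue overview']
```

The two calls give different answers for the same rubric, which is the ambiguity that has to be resolved before the rubricator is put to use.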
Leaves. Final rubrics
The leaves of the graph are the vertices that are connected to the rest of the graph by only one edge. In terms of the rubricator, these are the final rubrics, that is, rubrics that are not divided into subrubrics.
The leaves of the rubricator may carry additional information that helps in choosing this particular section of the rubricator. A set of keywords can serve as such information.
One interesting way to use the vertices and leaves of the rubricator is the case where both the vertices and the leaves are essentially keywords themselves. Here the vertices of the graph can be used as rubrics, while the leaves play the role of keywords. This way of building the rubricator, and an algorithm for automatic categorization, will be discussed in the next article.
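Both readings of "leaf" can be computed directly. The sketch below uses a hypothetical fragment in which each edge is written as (rubric, subrubric); the function names are invented here. It also shows a subtlety worth keeping in mind: in a graph rubricator a final rubric with two parents, such as "Parquet glue", has two edges, so the "only one edge" test and the "no subrubrics" test can give different results.

```python
# Hypothetical fragment of the rubricator; each pair is (rubric, subrubric).
edges = [
    ("All", "Glue"),
    ("All", "Parquet"),
    ("Glue", "Parquet glue"),
    ("Parquet", "Parquet glue"),
    ("Glue", "Wallpaper glue"),
]


def degree_one_vertices(edges):
    """Vertices connected to the rest of the graph by exactly one edge."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sorted(v for v, d in degree.items() if d == 1)


def final_rubrics(edges):
    """Rubrics that are not divided into subrubrics."""
    with_subrubrics = {parent for parent, _ in edges}
    everything = {v for edge in edges for v in edge}
    return sorted(everything - with_subrubrics)


print(degree_one_vertices(edges))  # ['Wallpaper glue']
print(final_rubrics(edges))        # ['Parquet glue', 'Wallpaper glue']
```

For a pure tree the two functions agree on every non-root vertex, so the distinction only matters once multiple parents are allowed.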
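As an illustration of how keywords attached to leaves could help in choosing a rubric, here is a deliberately naive Python sketch. It is not the categorization algorithm of this article series: the keyword sets, the scoring by simple word overlap, and the name `suggest_rubrics` are all assumptions made for the example.

```python
import re

# Hypothetical keyword sets attached to final rubrics.
leaf_keywords = {
    "Parquet glue": {"parquet", "glue", "adhesive", "floor"},
    "Wallpaper glue": {"wallpaper", "glue", "paste", "wall"},
}


def suggest_rubrics(text):
    """Rank final rubrics by how many of their keywords occur in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {rubric: len(words & keywords) for rubric, keywords in leaf_keywords.items()}
    matching = [rubric for rubric, score in scores.items() if score > 0]
    # Highest score first; ties are broken alphabetically.
    return sorted(matching, key=lambda rubric: (-scores[rubric], rubric))


print(suggest_rubrics("Which adhesive holds a parquet floor best?"))  # ['Parquet glue']
```

Even this crude overlap count shows why keywords are useful as a hint: they let a person or a program narrow the choice to a few candidate rubrics before the final decision is made.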