📜 ⬆️ ⬇️

Data structure 2-3-4 tree

When I first encountered the topic of binary trees in programming, I immediately found answers to almost all the questions I had on Habré, but as time went on, there were more questions, and quite recently I found a topic that was not covered in this resource yet - this is 2 -3-4 trees. There is an excellent article on the topic of 2-3 trees, in which you can find answers to the questions “What is a heap?”, “What are 2-3 trees”, as well as information about the main operations with the structure, so I will not repeat and go straight to to the main topic.

So, the main difference between 2-3-4 trees from 2-3 is that they can contain more than three children nodes, which makes it possible to create four-place nodes (nodes that have four child nodes and three data elements). You can see the differences visually on the gif under these text. On the first slide there are 2-3 trees, on the second - 2-3-4.



Some requirements are put forward to the data placed in nodes of 2-3-4 trees (as well as to data placed in 2-3 trees) - you can see a short image of this information in the picture below the text:
')
  1. If a node contains 2 elements and has 2 child nodes, then the node must contain one element, the value of which must be greater than the values ​​of the left child node and less than the values ​​of the right child node
  2. If a node contains 2 elements and has 3 child nodes, then the node must satisfy the following relations: the value of X is greater than the values ​​of the left child node and less than the values ​​of the middle child node; the value of Z is greater than the values ​​of the middle child node and less than the values ​​of the right child node.
  3. If a node contains 3 elements and has 4 child nodes, then the node must satisfy the following relations: the value of X is greater than the values ​​of the left child node and less than the values ​​of the left middle child node; the value of Y is greater than the values ​​of the left middle child of the node and less than the values ​​of the right middle of the child node; Z value is greater than the values ​​of the right middle child node and less than the values ​​of the right child node.
  4. The sheet can contain one, two or three elements.



The main advantages of 2-3-4 trees in comparison with 2-3 trees are that the standard operations of insertion and removal of elements are carried out in fewer steps. The main disadvantage is the amount of required memory, because since 2-3-4 trees can contain more elements that need to be stored somewhere, it will be necessary to consume more memory. In order to eliminate this problem, you can use a red-black binary tree (red-black tree) of a special type, but we'll talk about them later.

Before describing the principles of basic operations of 2-3-4 trees, I will give an example of code from the book Data Abstraction and Problem Solving in C ++, how to describe a 2-3-4 tree as a class.

lass TreeNode { private: TreeltemType smallltem, middleltem, largeltem; TreeNode *leftChildPtr, *lMidChildPtr, *rMidChildPtr, *rightChildPtr; friend class TwoThreeFourTree; }; 

Search 2-3-4 tree


The search algorithm in the 2-3-4 tree elements is similar to the search algorithm in the 2-3 tree elements (the effectiveness of the search algorithm
in a 2-3 tree it has the order O (log n)), it can be said that the algorithm is the same, but it is extended. The bottom line is that, for example, to search an element containing the value 31 in the tree, it would be good to examine the left subtree of the root, since the number 31 is less than 37. Then the average subtree of the node <30 35> is examined, because the number 31 lies between 30 and 35. The search ends with a pointer to the left child node of the node <32 33 34>, since the number 31 is less than 32. As a result, we conclude that there is no element in the tree containing the value 31.



Insert in 2-3-4 tree


The algorithm for inserting an element in a 2-3-4 tree differs from the algorithm for inserting an element in a 2-3 tree only those nodes that contain 4 elements are separated immediately after detection. In tree 2-3, the search algorithm ran along the path from root to leaf, and then came back, dividing the nodes. To avoid this return, the algorithm for inserting an element into a 2-3-4 tree divides the four-place nodes immediately upon detection from the root to the leaf. The search for a position to insert starts from the root, which is a four-place node <10 30 60>. Moving the number 30 up, we divide it into three parts. Since this node is a root, you need to create a new root, put the number 30 in it and attach two child nodes to it. We continue the search for the number 20, checking the left subtree of the root, since the number 20 is less than 30.



Remove node from 2-3-4 tree


The deletion algorithm consists in the fact that it first searches for the node containing the specified element. Then we are looking for a symmetric successor to this node. Then we swap them with the element so that the deletion is always performed from the sheet. If the sheet contains 3 or 4 elements, we simply remove the desired element from it. The main difference is that in the 2-3-4 tree we do not need to go back to the root and rebuild the tree, as we do in the 2-3 tree.



I would like to finish my story with the words of my predecessor : “When writing a course project on this topic, I understood that there was practically no information in Runet, so I decided to deposit my couple of kopecks in this case, telling the interested community about this data structure.”

PS: while writing, I scooped knowledge from the book Data Abstraction and Problem Solving in C ++. Walls and Mirrors »3rd edition, authors: Frank M. Carrano, Janet J. Pritchard, I advise everyone to read it and thank you very much, readers, for the attention paid to the material.

Source: https://habr.com/ru/post/273687/


All Articles