Finger Trees (Part 1. Introduction)

An article on Habré has recently been released on how you can create such structures in a functional language as the Queue (first went in, first came out) and Dec (reminiscent of a two-sided stack - first went in, the first came from both ends). I looked at this code and realized that it was terribly inefficient - the complexity of the order of O(n) . Quickly figure out how to create structures with O(1) did not work for me, so I discovered the code of the library implementation. But there was not an easy and understandable implementation, but < > . It was a description of finger trees, the need and elegance of which for this data structure is well revealed by the current article.

Finger Trees

In this article we will look at finger trees. These are functional immutable general-purpose data structures developed by Hinze and Patterson. Finger trees provide a functional data structure Sequence ( sequence ), which provides a depreciated access constant in time for adding both to the beginning and to the end of the sequence, as well as logarithmic time for concatenation and for random access. In addition to good asymptotic execution times, the data structure is incredibly flexible: in combination with monoidal tags on elements, finger trees can be used to implement efficient random access sequences, ordered sequences, interval trees, and priority queues.

The article will consist of 3 parts:
')
Finger Trees (Part 1. Introduction)
Finger Trees (Part 2. Operations)
Finger Trees (Part 3. Application)

Developing a data structure

The basis and motivation of finger trees came from 2-3 trees. 2-3 trees are trees that can have two or three branches at each inner vertex and that have all their leaves at the same level. While a binary tree of the same depth d must be 2 ^d leaves, 2-3 trees are much more flexible and can be used to store any number of elements (the number should not be a power of two).
Consider the following 2-3 tree:

This tree stores fourteen items. Access to any of them requires three steps, and if we were to add more elements, the number of steps for each of them would grow logarithmically. We would like to use these trees to model the sequence. However, in many applicable sequences very often and repeatedly refer to the beginning or to the end, and much less often to the middle. To satisfy this wish, we can change this data structure so that the priority of access to the beginning and end is the highest, unlike other features.

In our case, we add two fingers. The finger is just the point at which you can access parts of the data structure, in imperative languages this would be just a pointer. In our case, however, we will restructure the entire tree and make the parents of the first and last children the two roots of our tree. Visually, considering the issue of changing the tree above, we grab the first and last nodes on the penultimate layer, and pull them up, letting the rest of the tree hang down:

This new data type is known as the finger tree. The finger tree consists of several layers (circled in blue) that are strung on the axis shown in brown

Each layer of the finger tree has a prefix (on the left) and a suffix (on the right), as well as a link to the further trip deep into. The prefix and suffix contain the values of the finger tree: on the first layer there are 2-3 trees of depth 0, on the second layer there are 2-3 trees of depth 1, on the third layer they contain 2-3 trees of depth 2 and so on. The main element of this 2-3 tree is now the element at the bottom.

Well, having the description, let's describe the data structure. To begin with, we need to define a 2-3 tree structure that will need to be used to save things strung on an axis.

 data Node a = Branch3 aaa --  (node)   3 . | Branch2 aa -- ...   2 . deriving Show

Note that the branch is parameterized by its children. This allows you to have nested branches for the representation of 2-3 trees and to ensure the same depth. For example, a 2-3 tree with a depth of 1 may be a Node Char :

 > -- 2-3   2     . > Branch3 'n' 'o' 't' Branch3 'n' 'o' 't' > Branch2 'a' 't' Branch2 'a' 't'

However, we can also create deeper 2-3 trees. For example, a 2-3 tree, depth 2 may be a Node (Node Char) :

 > Branch2 (Branch3 'n' 'o' 't') (Branch2 'a' 't') Branch2 (Branch3 'n' 'o' 't') (Branch2 'a' 't')

Note that the mapping ensures that a 2-3 tree will have the same depth, because the depth is present in the tree type. It also has its drawbacks, since it is more difficult to write functions that are parametric in the tree depth parameter. But this is not so bad for our case.
For our further convenience, let's add some methods that allow us to convert branch values from lists of length 2 or 3. For this, we will use the OverloadedLists extension for GHC , which will allow us to write fromList and toList for various data types, and then use them for comparisons with the sample, if we use lists:

 {- LANGUAGE OverloadedLists, TypeFamilies -} import GHC.Exts (IsList(..)) instance IsList (Node a) where type Item (Node a) = a toList (Branch2 xy) = [x, y] toList (Branch3 xyz) = [x, y, z] fromList [x, y] = Branch2 xy fromList [x, y, z] = Branch3 xyz fromList _ = error "Node must contain two or three elements"

Now that we have our type 2-3 tree, we also need a type to save the prefix and suffixes that are strung on the axis of the finger tree. If our finger tree is a complete analogy of 2-3 trees, then each of the very first prefixes and suffixes can have 2 or 3 elements, and the middle ones can have only 1 or 2 (because one of the links goes up one level along the axis). However, to reduce the information content, the requirement is relaxed for finger trees, and, instead, each prefix and suffix contain from 1 to 4 elements. More values can not be. We could allow the prefix and suffix to be stored as lists, but we will instead use more selective constructors, each of which is responsible for its correct number of elements:

 --    ,    --      1  4 data Affix a = One a | Two aa | Three aaa | Four aaaa deriving Show

Working with this type of data is not so convenient, so we will quickly add helpers that allow you to work with affixes as lists.

 --       instance IsList (Affix a) where type Item (Affix a) = a toList (One x) = [x] toList (Two xy) = [x, y] toList (Three xyz) = [x, y, z] toList (Four xyzw) = [x, y, z, w] fromList [x] = One x fromList [x, y] = Two xy fromList [x, y, z] = Three xyz fromList [x, y, z, w] = Four xyzw fromList _ = error "Affix must have one to four elements" --        --           affixPrepend :: a -> Affix a -> Affix a affixPrepend x = fromList . (x :) . toList affixAppend :: a -> Affix a -> Affix a affixAppend x = fromList . (++ [x]) . toList

Now that we have determined the type of data needed to store values (2-3 trees, preserving values and affixes attached to the axis), we can create an axial data structure. This axial structure is what we call the finger tree, and it is defined as follows:

 --  ,    ,       data FingerTree a = Empty --     | Single a --     ,       --    ,       | Deep { prefix :: Affix a, --   deeper :: FingerTree (Node a), --     ,  2-3  suffix :: Affix a --   } deriving Show

In the definition above, the deep field FingerTree a is of type FingerTree (Node a) . This means that the values stored on the next layer are 2-3 trees that are one level deeper. Thus, the affixes of the first layer FingerTree Char store only Char , the second layer stores FingerTree (Node Char) and has affixes that store 2-3 trees of depth 1 ( Node Char ). The third layer will be FingerTree (Node (Node Char)) and has affixes that store 2-3 trees of depth 2 Node (Node Char) ).

Now that we have defined our type of finger tree, we will spend a little more time and consider the example that was shown above in order to understand how we translate it into the FingerTree Char structure:

Translating it into a tree, we get:

 layer3 :: FingerTree a layer3 = Empty layer2 :: FingerTree (Node Char) layer2 = Deep prefix layer3 suffix where prefix = [Branch2 'i' 's', Branch2 'i' 's'] suffix = [Branch3 'n' 'o' 't', Branch2 'a' 't'] layer1 :: FingerTree Char layer1 = Deep prefix layer2 suffix where prefix = ['t', 'h'] suffix = ['r', 'e', 'e'] exampleTree :: FingerTree Char exampleTree = layer1

 > exampleTree Deep {prefix = Two 't' 'h', deeper = Deep {prefix = Two (Branch2 'i' 's') (Branch2 'i' 's'), deeper = Empty, suffix = Two (Branch3 'n' 'o' 't') (Branch2 'a' 't')}, suffix = Three 'r' 'e' 'e'}

In the next part of the article, we will learn how to easily work with finger trees as sequences.

Source: https://habr.com/ru/post/240783/

All Articles

Finger Trees (Part 1. Introduction)

Finger Trees

Developing a data structure

More articles: