Remarkable zippers, or how I learned not to worry and fell in love with tree data structures

It is known that a tree is a rather complex structure. And if reading is successfully implemented, including recursion (which is not without its problems), then things are not doing well with the change.

At the same time, there has been a highly efficient tool for working with trees — zippers — for a long time, but it has not gained widespread use and, I think, I know why.

The classic conceptual explanation of the zipper looks something like this: this is a look from the inside at the tree-like structure, as if turned inside out, like an inverted glove .
')
This is a figurative explanation, if the creak of the brain, usually, of course, is understood only in part. Next, zippers are put aside, because "this is an incomprehensible kind of functional problem, such as monads, then I will figure it out . "

The author of the "later" has already arrived. This article is an attempt to provide an alternative explanation for zippers (not to be confused with an explanation for alternatively gifted, though ...) such that it will allow you to quickly understand and immediately start using zippers in practical tasks.

Consider how the idea evolved.

Suppose we have some sequence no matter what kind of data. Let it be a vector (array).

Here is a vector with symbol elements that we need to work with. Let us need to walk on it left and right and read the characters. Naturally, the idea of some “window” with a width of one element arises, which we can shift right and left.

In fact, we have a certain component-cursor for working with vectors, which we can move left and right and read data from the current position. The natural development of the idea will be the enrichment of this component with additional features. Let him, besides shifting and reading the current one, also tell us where we are:

We can go further and create this API of this component:

ShagLeft
Step Right
Present value
Position Left
PositionRight
ExtremeLeft ?
ExtremeRight ?
ReplaceCurrentValue
PasteValueLeft
PasteValue Right
Delete Current

Note that our component-cursor is in no way tied to vectors and symbols, that is, it can be used for any collections with items of any type. Very good component.

What about trees? Why not come up with something similar for trees? Easy!

In the case of a tree, new possibilities were added in a natural way: now we can walk not only left and right, but also up and down, and also determine whether we are at the root or in a leaf of the tree.
And, of course, immediately itching to enrich the API:

ShagLeft
Step Right
Step Up
Step Down
Present value
Root Value
Child Values
Position Left
PositionRight
Position Top
ExtremeLeft ?
ExtremeRight ?
Root ?
Leaf ?
ReplaceCurrentValue
PasteValueLeft
PasteValue Right
Delete Current

Ladies and gentlemen, let me introduce ... Zipper !

Obviously, the above API is not complete, naturally you need to add two methods for depth first search: Previous and Next , which will move the window forward and backward according to the search rules. You can add a method to add a more generic value for convenience. In general, we smoothly turn to implementation details, which we were not going to do.

The main thing is that we have felt the idea itself, which now seems very banal and natural, and such is.

And where is the notorious glove turned inside out?

Yes, that's it! If we replace the methods of Position Left , Position Right , Position Top on Values Left , Values Right , Values Top , then we get a “look at the tree from the inside”: there is a current value and

ValuesLeft
ValuesRight
Values Above
Child Values

What is not turned out glove?

You can proceed to practice.

But first, let us fill in one omission. It should be mentioned that zippers are a functional concept, that is, they are most effective in the environment of persistent data structures (roughly speaking, data does not change, but only new ones are created), functions as first-class objects (roughly speaking, functions can be passed in parameters) and that's all.

If your platform provides efficiently implemented persistent structures, then zippers are automatically obtained as efficient and low cost (worthless) components. They can be created and recreated according to need without worrying about overhead.

Our platform - clojure (script) - provides persistent structures. Moreover, it provides the zippers themselves: the namespace clojure.zip , I recommend to get acquainted with the source code - a simple, clean implementation.

Let's fill the second omission. In the case of a cursor for a vector, we noted that the cursor is not tied to either a vector or symbols, and can be used with any collections.

And what about zippers?

All the same! Conceptually, zippers are not tied to either structure or data, that is, they can be used on any trees. In other words, the zipper allows you to abstract the algorithm for processing a tree from its structure.

Clojure.zip, for example, provides us with the zipper function, which creates a zipper for our needs:

(zipper branch? children make-node root)

branch ? - function, according to the node passed to it, it is determined by the branch in principle or not (can it have children);
children - function, returns a collection of child nodes by the branch passed to it;
make-node - a function, on a node passed to it and a collection of children returns a branch of a tree, that is, a node with transferred children;
root is the root node, that is, our tree itself.

Using this function, we can create a zipper that works with our particular structure. Suppose we have such a small tree:

(def sample-tree {:tag :div :content [ {:tag :span :content ["  "]} {:tag :span :content ["!"]} ]})

Create a zipper:

 (require '[clojure.zip :as z]) ;

 (def sample-z (z/zipper (fn [node] (not (string? node))) ;      (fn [node] (vec(:content node))) ;  :content       ( ,  nil ) (fn [node children] (assoc node :content (vec children))) ;   :content     sample-tree)) ;

How do we get the full text of the tree?

 (loop [z sample-z result ""] ;    z  result (if (z/end? z) ;      result ;   (recur (z/next z) (if (z/branch? z) result (str result (z/node z)))))) ;

The result of the execution: “ Good morning! ”Note that the tree traversal is done iteratively, not recursively.

We would insert a comma with a space after the call. No sooner said than done!

 (def new-z (-> sample-z z/next (z/insert-right {:tag :span :content [", "]})))

new-z is a modified zipper. If we need the actual modified tree:

 (def new-tree (z/root new-z))

Although the basic API is implemented in the form of functions of the slojure.zip space, it can be useful to look into the zipper itself, and for this you need to understand what it is. In this implementation, this is simply a vector of two components: the current tree node and the map describing its position (the same glove) with the keys:

: l - nodes on the left
: r - nodes on the right
: pnodes - nodes above (path to root)
: ppath - a copy of the parent "gloves"
: changed? - a sign that the tree has been modified by this zipper

In our example, new-z will look like this:

 [{:content ["  "], :tag :span} ;   {:changed? true, :l [], :pnodes [{:content [{:content ["  "], :tag :span} {:content ["!"], :tag :span}], :tag :div}], :ppath nil, :r ({:content [", "], :tag :span} {:content ["!"], :tag :span})}]

By the way, all the examples from the article can be easily driven to the online REPL without bothering with deployment.

And a little bit of terminology: zipper is a concept, an idea. A specific instance (as new-z in our example) is usually called a location, which is very logical.

That's all. May this article help the suffering functional tree on his difficult path! Thanks for attention.

UPD: qnikst very correctly noted that the article lacks some reference to the theory, which is extremely useful in answering the practical question " how do we organize a zipper for a particular structure? ". Wouldn't it be great to be able to mathematically precisely select a zipper for a structure?
To do this, you need to be able to describe the data structure algebraically, and differentiate the resulting expression. The second is that everyone is able from school, but with the first one there can be a plug. By itself, this question seems to me extremely interesting, and the practical presentation draws on a separate article, which does not yet exist. Therefore, I am sending interested in the best material that I found on this topic in the network in Russian:
Data Algebra and Mutation Calculus (translation)

And one more link for completeness of the picture, for a more complete and more academic description of the concept of “algebraic data type”: the journal Practice of Functional Programming

Source: https://habr.com/ru/post/279623/

All Articles

Remarkable zippers, or how I learned not to worry and fell in love with tree data structures

More articles: