📜 ⬆️ ⬇️

Pattern-matching (one more) in coffeescript

Introduction


Once I was sitting and looking sadly at the Erlang code written as part of a study. I really wanted to write something more useful than tic-tac-toe on it, but as luck would have it, no suitable tasks occurred to me. But there is JavaScript, in which there are functions of the first order, and currying, and map / filter / fold, and, most importantly, the task of inventing much easier. But pattern matching doesn't exist. A quick search gave me a few libraries, but the syntax suggested by them seemed heavy to me. Is it possible to make it more concise, closer to the native Erlang syntax?

Spoiler: you can, if you take a coffeescript, do this:

fn = Match -> [ When {command: “draw”, figure: @figure = {type: “circle”, radius: @radius}}, -> console.log(@figure, @radius) When {command: “draw”, figure: @figure = {type: “polygon”, points: [@first, @second | @rest]}}, -> console.log(@figure, @first, @second, @rest); ] fn {command: “draw”, figure: {type: “circle”, radius: 5, color: “red”}} #output: {type: “circle”, radius: 5, color: “red”} 5 

Who cares how it happened - welcome under cat.

Short description


Actually, what is happening here, if in brief:
  1. Match accepts a function that returns an array of patterns (patterns) and their corresponding actions;
  2. When you call, the context is replaced, so all these & commat; a (== this.a) point to specially prepared properties (properties) that allow the parser to figure out which values ​​to bind to;
  3. Further, when calling with a specific value, a comparison with the template is made (for the time being, we can assume that there is a recursive detour of the template and value and comparison of specific values, a little more about it will be discussed below);
  4. If the value matches the pattern, then the function-action is called. In her, we also change the context, substituting the appropriate values.

So, if we take the example above, then com commat; radius in the first argument When indicates which part of the input object should be removed (in this case, .figure.radius), and in the second argument (function) it also contains a specific value.
')

Work with arrays


Erlang has a convenient syntax for splitting a list into a head (the first element) and a tail (everything else), which is widely used for various recursive algorithms.

 case List of [Head | Tail] -> Head; [] -> {empty}; end. 

In javascript (and coffeescript) there is no possibility to redefine operators, therefore you can only do something like regular tools using regular tools:

 When [@head, @tail…], -> console.log(@head, @tail) 

In principle, not bad, but in erlang somehow nicer. Can it still somehow be possible, at least for a number of scenarios?
Here it is worth remembering how generally javascript manages to perform an operation like:

 var object1 = {x:1}, object2 = {y: 2}; console.log(object1 | object2); 

And get 0. This works as javascript tries to reduce the type to a numeric type, for which it calls the valueOf method on the object. If we change the method of our objects and return the power of two, then as a result you can find out which objects the operation was applied to.

 var object1 = {x:1}, object2 = {y: 2}, object3 = {z: 3}; object1.valueOf = function() { return 2; } object2.valueOf = function() { return 4; } object3.valueOf = function() { return 8; } console.log(object1 | object2); //6 == 2 | 4 == object1 and object2 

A bold assumption was made that very rarely would anyone use arrays of concrete numbers in patterns, so if the parser encounters a number at the end of the array, it tries to determine if this is the result of an operation or two objects. As a result, it became possible to write like this:

 When [@head | @tail], -> console.log(@head, @tail) 

It looks nice, but outside of this task I would not use this method everywhere.

Class mapping


The next thing I wanted was to make a comparison of the structures as in Erlang, with an indication of the type and content.

 #person{name = Name} 

Right so, of course, will not succeed, but something similar can be done. In the end, I settled on three solutions:

 When ObjectOf(Point1, {x: 1, y: 2}), -> … When Point2(x:1, y:2), -> … When Point3$(x:1, y:2), -> ... 

The first one works out of the box, the second one looks almost like a case class on scala, but it requires inserting such a string into the constructor.

 class Point2 constructor: (@x, @y) -> return m if m = ObjectOf(@, Point2, arguments) 

This is needed to understand whether a function was called as a constructor or not, the constructor itself and the arguments fall into the template.

The third option is a variation on the first one; we simply prepare the function in advance:

 Point3$ = ObjectOf(Point3) 

Performance


The first naive version performed a comparison of the pattern and values, passing them recursively. Basically, I expected the performance to be not up to par when compared with a simple parsing of an object. But it was worth checking out.

Manual analysis
 coffeeDestruct = (demo) -> {user} = demo return if not user.enabled {firstname, group, mailbox, settings} = user return if group.id != "admin" notifications = settings?.mail?.notify ? [] return if mailbox?.kind != 'personal' mailboxId = mailbox?.id ? null {unreadmails, readmails} = mailbox; return if unreadmails.length < 1 firstUnread = unreadmails?[0] ? [] restUnread = unreadmails?.slice(1) ? [] return if readmails?.length < 1 return if readmails?[0]?.subject != "Hello" rest = readmails?.slice(1) return {firstname, notifications, firstUnread, restUnread, rest, mailboxId} 


Template
 singlePattern = Match -> [ When {user: { firstname: @firstname, enabled: true, group: {id: "admin"}, settings: {mail: {notify: @notifications}}, mailbox: { id: @mailboxId, kind: "personal", unreadmails: [ @firstUnread | @restUnread ], readmails: [ {subject: "Hello"}, Tail(@rest) ] } }}, -> "ok" ] 



Results for 10,000 comparisons:

 Regular: 5ms
 Single Pattern: 140ms
 Multiple patterns: 429ms

Yes, not that I want to see in production. Why not convert the pattern to code close to the first example?

No sooner said than done. We go on a pattern and we make the list of conditions and intermediate variables.

An interesting feature came out here. The first version of the compiled function was almost identical to the handwritten parse, but the performance was worse than one and a half times. The difference was in the way the result object was created: it turned out that creating an object with the specified fields is cheaper than creating an empty object and filling it in later. For verification, I made this benchmark . After that, I found two articles on this topic - here and now - and also a Habré translation.

Spent the optimization, run:

 Regular: 5ms
 Single Pattern: 8ms
 Multiple patterns: 164ms

The second number looks good, but what about the third and why is it still big? The third is a match expression with several templates, where only the last one is triggered. Since templates are compiled independently, we get a linear dependence on the number of templates.

But in reality, the templates will be very similar - we will analyze objects that differ in some details, and at the same time have a similar structure. Let's say here:

 fn = Match -> [ When ["wait", "infinity"], -> console.log("wait forever") When ["wait", @time = Number], -> console.log("wait for " + this.time + "s") ] 

In both cases, the array consists of two elements and the first is “wait”, the only difference is in the second. And the parser will do two almost identical functions and will call them one by one. Let's try to combine them.

The meaning is simple:
  1. Parser instead of the code will issue a chain of "commands";
  2. Then all the commands are taken and are gathered in one chain with the branches;
  3. Now the commands are turned into code.

It is worth noting that if we went into one chain, then in case of failure we should not go out, but try the next chain. I have seen three ways to achieve this:

1. Nested conditions

 if (Array.isArray(val)) { if (val.length === 2) { if (val[0] === 'wait') { if (val[1] === 'infinity') { return {when: 0}; } if (val[1].constructor.name === 'Number') { return {when: 1}; } } } } 

It looks awful, and even when generating the code it would not be confused in brackets. Not.

2. Nested functions

 if (!Array.isArray(val)) return false; if (val.length !== 2) return false; if (val[0] !== 'wait') return false; if (res = function fork1() { if (val[1] !== 'infinity') return false; return {when: 0} }()) return res; if (res = function fork2() { if (val[1].constructor.name !== 'Number') return false; return {when: 1}; }()) return res; 

Looks better. But they are straining additional checks and returns, since there is no way to return immediately from an external function (well, except through exceptions, but this is not serious).

3. Breaking bad label

 if (!Array.isArray(val)) return false; if (val.length !== 2) return false; if (val[0] !== 'wait') return false; fork1: { if (val[1] !== 'infinity') break fork1; return {when: 0} } fork2: { if (val[1].constructor.name !== 'Number') break fork2; return {when: 1}; } 

It looks good and to me, like an old sishnik, it immediately seemed that this option would be faster. A quick check on jsperf confirmed my guess. So on this option and stop.

Let's look at the performance:

 Regular: 5ms
 Single Pattern: 8ms
 Multiple patterns: 12ms

It is quite acceptable. Leave as is.

Architecture and Plugins


After adding another functionality by adding new if-s in two different places, I decided to rework the architecture. Now, instead of the two large functions parse and render, functions parse and render smaller appear, which themselves do not really do anything, but each part of the template is sent along a chain of plug-ins. Each plugin can:

For example, the plugin for comparison with the constructor looks like this:

 function pluginConstructor() { return { //        //    ,       parse_object //  parse,       parse_function: function(part, f) { //    "constructor" //        - .   f.addCheck("constructor", part.name); }, //    ""    "constructor" //  ,     if. render_constructor: function(command, varname) { return varname + ".constructor.name === " + JSON.stringify(command.value); } }; } 

This allowed, on the one hand, to simplify the addition of new features to myself, and on the other, to enable users to add their own plugin and extend the syntax of the templates. For example, you can add support for regular expressions so that you can write like this:

 fn = Match -> [ When @res = /(\d+):(\d+)/, -> hours: @res[1], mins: @res[2] # or When RE(/(\d+):(\d+)/, @hours, @min), -> hours: @hours, mins: @mins ] 

Comparison with other solutions


As I wrote at the very beginning, I tried to look for similar solutions, and this is what was found:

Performance comparison with matches.js, pun and manual analysis can be found here .

Conclusion


Here in general, that's all. The code itself can be found here . Despite the fact that the syntax is sharpened under coffeescript, the library itself is written in javascript and can be used in other languages ​​compiled in js.

A few cons in pursuit:
  1. Splitting arrays into “head” and “tail” is useful for recursive algorithms, but without optimization of tail recursion, there can be problems with performance and stack overflow on large volumes.
    Solution: not yet invented

  2. You cannot use functions in templates — or rather, you can use them, but they will be called only once when the template is compiled.
    Solution: use guards

  3. Because of this context substitution, it will not be possible to bind the action functions to your context. On the other hand, if we are writing in a functional style, we do not seem to need calls to the methods.
    Solution: the old fashioned way, self = this

  4. For the same reason, most likely it will not be possible to use the arrow-functions from ecmascript 6 - they tightly bind the context so that even calls via call / apply do not affect them.
    Solution: not yet invented

I hope something will be useful. Thanks for attention.

Source: https://habr.com/ru/post/253761/


All Articles