Expressive JavaScript: Higher Order Functions

Content

Tsu-li and Tsu-su boasted the size of their new programs. "Two hundred thousand lines," said Tsu-li, "not counting the comments!" Tsu-su answered: "Pf-f, mine is almost a million lines." Master Yun-Ma said: "My best program takes five hundred lines." Hearing this, Tsu-li and Tsu-su experienced enlightenment.

Master Yun-Ma, Programming Book
')
There are two ways to build programs: make them so simple that there will obviously be no errors, or so complex that there will not be obvious errors.

Anthony Hoare, 1980 lecture at the Turing Award

A large program is an expensive program, and not only because of the time it was written. Large size usually means complexity, and complexity confuses programmers. Confused programmers make program errors. A large program means that bugs have a place to hide and they are harder to find.

Let us briefly return to the two examples from the introduction. The first is self-sufficient, and takes six lines.

var total = 0, count = 1; while (count <= 10) { total += count; count += 1; } console.log(total);

The second is based on two external functions and occupies one line.

 console.log(sum(range(1, 10)));

Which one of them will most likely encounter an error?

If we add the size of the sum and range definitions, the second program will also be large — more than the first. But I still say that it is likely to be correct.

This will be because the decision expression is directly related to the problem being solved. The summation of a numeric gap is not cycles and counters. These are sums and gaps.

The definitions of this dictionary (functions sum and range) will include cycles, counters, and other random details. But because they express simpler concepts than the entire program, they are easier to do correctly.

Abstractions

In the software context, these “dictionary” definitions are often called abstractions. Abstractions hide the details and give us the opportunity to talk about tasks at a higher or more abstract level.

Compare two pea soup recipes:

Add one cup of dry peas per serving to the container. Add water so that it covers the peas. Leave it that way for at least 12 hours. Remove the peas from the water and place them in the pan. Add 4 cups of water per serving. Close the pan and simmer the peas for two hours. Take half a bulb per serving. Cut into pieces with a knife, add to peas. Take one stalk of celery one by one. Cut into pieces with a knife, add to peas. Take a carrot for a portion. Cut into pieces with a knife, add to peas. Cook for another 10 minutes.

Second recipe:

Per serving: 1 cup of dry peas, half an onion, celery stalk, carrot.
Soak peas for 12 hours. Stew 2 hours in 4 cups of water per serving. Cut and add vegetables. Cook for another 10 minutes.

The second is shorter and simpler. But you need to know a little more concepts related to cooking - soaking, stewing, cutting (and vegetables).

Programming, we can not expect that all the necessary words will be in our dictionary. Because of this, you can slide down to the first-recipe template: dictating the computer all the small steps one after another, without noticing the higher-level concepts that they express.

The second kind of programmer should be the ability to notice when a concept begs for a new word for it and brought into abstraction.

Abstracting array traversal

Simple functions that we used before are good for building abstractions. But sometimes they are not enough.

In the previous chapter, we encountered the following cycle several times:

 var array = [1, 2, 3]; for (var i = 0; i < array.length; i++) { var current = array[i]; console.log(current); }

The code tries to say: “for each element in the array, output it to the console”. But it uses a workaround — with a variable to calculate i, checking the length of the array, and declaring the additional variable current. Not only is he not very handsome, he is also the ground for potential mistakes. We can accidentally reuse the variable i, instead of length write lenght, confuse the variables i and current, etc.

Let's abstract it into a function. Can you think of a way to do this?

It's pretty easy to write a function, traversing an array and calling for each console.log element.

 function logEach(array) { for (var i = 0; i < array.length; i++) console.log(array[i]); }

But what if we need to do something different than output elements to the console? Since “doing something” can be represented as a function, and functions are just variables, we can pass this action as an argument:

 function forEach(array, action) { for (var i = 0; i < array.length; i++) action(array[i]); } forEach(["", "", ""], console.log); // →  // →  // →

Often you can not transfer a predefined function in forEach, but create a function on the spot.

 var numbers = [1, 2, 3, 4, 5], sum = 0; forEach(numbers, function(number) { sum += number; }); console.log(sum); // → 15

It looks like a classic for loop, with the body of a loop written in a block. However, the body is now inside the function, and also inside the forEach call brackets. Therefore, it must be closed with both curly and parentheses.

Using this template, we can set the variable name for the current element of the array (number), without having to manually select it from the array.

In general, we do not even need to write forEach. This is the standard array method. Since the array is already passed as a variable on which we are working, forEach takes only one argument — a function that needs to be executed for each element.

To demonstrate the convenience of this approach, let us return to the function from the previous chapter. It contains two loops running through the arrays:

 function gatherCorrelations(journal) { var phis = {}; for (var entry = 0; entry < journal.length; entry++) { var events = journal[entry].events; for (var i = 0; i < events.length; i++) { var event = events[i]; if (!(event in phis)) phis[event] = phi(tableFor(event, journal)); } } return phis; }

Using forEach we record slightly shorter and much cleaner.

 function gatherCorrelations(journal) { var phis = {}; journal.forEach(function(entry) { entry.events.forEach(function(event) { if (!(event in phis)) phis[event] = phi(tableFor(event, journal)); }); }); return phis; }

Higher order functions

Functions that operate on other functions — either taking them as arguments or returning them — are called higher order functions . If you already understand that functions are just variables, there is nothing special about the existence of such functions. The term originates from mathematics, where differences between functions and other meanings are perceived more strictly.

Higher-order functions allow us to abstract actions, not just values. They are different. For example, you can make a function that creates new functions.

 function greaterThan(n) { return function(m) { return m > n; }; } var greaterThan10 = greaterThan(10); console.log(greaterThan10(11)); // → true

You can make a function that changes other functions.

 function noisy(f) { return function(arg) { console.log("calling with", arg); var val = f(arg); console.log("called with", arg, "- got", val); return val; }; } noisy(Boolean)(0); // → calling with 0 // → called with 0 - got false

You can even make functions that create new types of flow control.

 function unless(test, then) { if (!test) then(); } function repeat(times, body) { for (var i = 0; i < times; i++) body(i); } repeat(3, function(n) { unless(n % 2, function() { console.log(n, "is even"); }); }); // → 0 is even // → 2 is even

The rules of lexical scopes that we discussed in Chapter 3 work for our benefit in such cases. In the last example, the variable n is the argument of the external function. Since the inner function lives in the outer environment, it can use n. The bodies of such internal functions have access to the variables surrounding them. They can play the role of {} blocks used in ordinary cycles and conditional expressions. An important difference is that variables declared inside internal functions do not fall into the external environment. And usually this is only for the better.

Passing arguments

The noisy function, previously declared, which passes its argument to another function, is not quite convenient.

 function noisy(f) { return function(arg) { console.log("calling with", arg); var val = f(arg); console.log("called with", arg, "- got", val); return val; }; }

If f takes more than one parameter, it only gets the first one. It would be possible to add a bunch of arguments to the inner function (arg1, arg2, etc.) and pass all of them to f, but you don’t know how much is enough for us. In addition, the f function could not work correctly with arguments.length. Since we would always pass the same number of arguments, it would not be known how many arguments we were given initially.

For such cases, functions in JavaScript have a apply method. An array is passed to it (or an object in the form of an array) from the arguments, and it calls a function with these arguments.

 function transparentWrapping(f) { return function() { return f.apply(null, arguments); }; }

This function is useless, but it demonstrates the pattern of interest to us — the function it returns returns to f all its arguments, but nothing more. This happens by passing its own arguments, which are stored in the arguments object, to the apply method. The first argument of the apply method, which we assign null to in this case, can be used to emulate a method call. We will return to this issue in the next chapter.

Json

Higher-order functions that somehow apply a function to array elements are widely distributed in JavaScript. The forEach method is one of the most primitive similar functions. As array methods, we have many other functions available. To get to know them, let's play with a different set of data.

Several years ago, someone examined a lot of archives and made a whole book on the history of my last name. I opened it, hoping to find there knights, pirates and alchemists ... But it turned out that it was filled mainly with Flemish farmers. For fun, I extracted information on my immediate ancestors and designed it in a format suitable for reading by a computer.

The file looks like this:

 [ {"name": "Emma de Milliano", "sex": "f", "born": 1876, "died": 1956, "father": "Petrus de Milliano", "mother": "Sophia van Damme"}, {"name": "Carolus Haverbeke", "sex": "m", "born": 1832, "died": 1905, "father": "Carel Haverbeke", "mother": "Maria van Brussel"}, …    ]

This format is called JSON, which means JavaScript Object Notation (markup of JavaScript objects). It is widely used in data storage and network communications.

JSON is similar to JavaScript in its way of writing arrays and objects — with some limitations. All property names must be enclosed in double quotes, and only simple values are allowed - no function calls, variables, nothing that would include calculations. Also comments are not allowed.

JavaScript provides JSON.stringify and JSON.parse functions that convert data from and to this format. The first one takes a value and returns a string with JSON. The second accepts such line and returns value.

 var string = JSON.stringify({name: "X", born: 1980}); console.log(string); // → {"name":"X","born":1980} console.log(JSON.parse(string).born); // → 1980

The variable ANCESTRY_FILE is available here . It contains the JSON file as a string. Let's decode it and count the number of people mentioned.

 var ancestry = JSON.parse(ANCESTRY_FILE); console.log(ancestry.length); // → 39

We filter an array

To find people who were young in 1924, the following function may be useful. It filters out the elements of the array that are not tested.

 function filter(array, test) { var passed = []; for (var i = 0; i < array.length; i++) { if (test(array[i])) passed.push(array[i]); } return passed; } console.log(filter(ancestry, function(person) { return person.born > 1900 && person.born < 1925; })); // → [{name: "Philibert Haverbeke", …}, …]

An argument named test is used - this is a function that performs computation checks. It is called for each element, and the value returned by it determines whether the element is in the returned array.

The file turned out to be three people who were young in 1924 — a grandfather, grandmother, and great-aunt.

Please note - the filter function does not remove elements from an existing array, but builds a new one that contains only validated elements. This is a pure function, because it does not spoil the array passed to it.

Like forEach, filter is one of the standard array methods. In the example we described such a function, only to show what it does inside. From now on we will use it simply:

 onsole.log(ancestry.filter(function(person) { return person.father == "Carel Haverbeke"; })); // → [{name: "Carolus Haverbeke", …}]

Transformations with map

Suppose we have an archive of objects representing people, which was obtained by filtering an array of ancestors. But we need an array of names that would be easier to read.

The map method converts an array by applying a function to all its elements and constructing a new array of return values. The new array will have the same length as the input, but its contents will be converted to the new format.

 function map(array, transform) { var mapped = []; for (var i = 0; i < array.length; i++) mapped.push(transform(array[i])); return mapped; } var overNinety = ancestry.filter(function(person) { return person.died - person.born > 90; }); console.log(map(overNinety, function(person) { return person.name; })); // → ["Clara Aernoudts", "Emile Haverbeke", // "Maria Haverbeke"]

Interestingly, people who have lived to at least 90 years old are the very ones that we saw earlier that were young in the 1920s. This is the newest generation in my notes. Apparently, medicine has seriously improved.

Like forEach and filter, map is also a standard method for arrays.

Summation with reduce

Another popular example of working with arrays is to get a single value based on the data in the array. One example is the familiar summation of the list of numbers. Another is the search for a person born before everyone else.

A higher order operation of this type is called reduce (shrinking; or sometimes fold, folding). You can think of it as a folding array, one element at a time. When summing the numbers, we started from zero, and for each element we combined it with the current sum using addition.

The parameters of the reduce function, except for the array, are a combining function and an initial value. This function is a little less clear than filter or map, so pay close attention to it.

 function reduce(array, combine, start) { var current = start; for (var i = 0; i < array.length; i++) current = combine(current, array[i]); return current; } console.log(reduce([1, 2, 3, 4], function(a, b) { return a + b; }, 0)); // → 10

The standard method of reduce arrays, which of course works the same way, is even more convenient. If the array contains at least one element, you can omit the start argument. The method will take the first element of the array as the starting value and start working with the second one.

To find the oldest of my known ancestors with reduce, we can write something like:

 console.log(ancestry.reduce(function(min, cur) { if (cur.born < min.born) return cur; else return min; })); // → {name: "Pauwels van Haverbeke", born: 1535, …}

Composability

How could we write the previous example (searching for a person with the earliest date of birth) without higher order functions? In fact, the code is not so terrible:

 var min = ancestry[0]; for (var i = 1; i < ancestry.length; i++) { var cur = ancestry[i]; if (cur.born < min.born) min = cur; } console.log(min); // → {name: "Pauwels van Haverbeke", born: 1535, …}

Slightly more variables, two lines longer - but so far quite clear code.

Higher-order functions reveal their true potential when you have to combine functions. For example, let's write a code that finds the average age of men and women in the set.

 function average(array) { function plus(a, b) { return a + b; } return array.reduce(plus) / array.length; } function age(p) { return p.died - p.born; } function male(p) { return p.sex == "m"; } function female(p) { return p.sex == "f"; } console.log(average(ancestry.filter(male).map(age))); // → 61.67 console.log(average(ancestry.filter(female).map(age))); // → 54.56

(It’s foolish that we have to define addition as a function of plus, but operators in JavaScript are not values, so you don’t pass them as arguments).

Instead of entangling the algorithm in a large cycle, everything is distributed according to the concepts that interest us - the definition of gender, the calculation of age and the averaging of numbers. We use them in turn to get the result.

For writing clear code, this is a really fabulous opportunity. Of course, clarity does not come for free.

Price

In the happy edge of an elegant code and beautiful rainbows lives gadsky monster named Inefficiency.

The program that processes the array is most beautifully represented as a sequence of clearly separated steps, each of which does something with the array and returns a new array. But the layering of all these intermediate arrays is expensive.

Similarly, the transfer function in forEach, so that she went through the array for us, convenient and easy to understand. But calling functions in JavaScript is more expensive than cycles.

The same is true of many techniques that improve the readability of programs. Abstractions add layers between clean computer work and the concepts we work with — and as a result, the computer does more work. This is not an irony rule - there are languages that allow you to add abstractions without degrading performance, and even in JavaScript an experienced programmer can find ways to write abstract and fast code. But this problem is common.

Fortunately, most computers are insanely fast. If your data set is not too large, or the work time should be just fast enough from the point of view of a person (for example, to do something every time the user presses a button) - then it doesn't matter if you wrote a beautiful solution that works half a millisecond, or very optimized, which runs one tenth of a millisecond.

It is convenient to calculate approximately how often this piece of code will be called. If you have a loop in the loop (directly, or through a call in the loop of a function that also works with the loop inside), the code will be executed N * M times, where N is the number of repetitions of the outer loop, and M is the inner one. If there is another cycle in the inner loop, repeated P times, then we will get N * M * P - and so on. This can lead to large numbers, and when the program slows down, the problem can often be reduced to a small piece of code inside the innermost loop.

Great-great-great-great-great ...

My grandfather, Philibert Haverbeke, is mentioned in the data file. Starting with him, I can track my race in search of the oldest of ancestors, Pauvels van Haverbeke, my direct ancestor. Now I want to calculate what percentage of the DNA I have from him (in theory).

To go from the name of the ancestor to the object that represents it, we build an object that matches names and people.

 var byName = {}; ancestry.forEach(function(person) { byName[person.name] = person; }); console.log(byName["Philibert Haverbeke"]); // → {name: "Philibert Haverbeke", …}

The task is not just to find a father for each of the records and calculate how many steps it takes to Pauvels. In the history of the family, there were several marriages between cousins (well, small villages, etc.). In this regard, the branches of the family tree in some places are connected with others, so I get more genes than 1 / 2G (G is the number of generations between Pauvels and me). This formula is based on the assumption that each generation splits the genetic foundation in two.

It is reasonable to draw an analogy to reduce, where the array is reduced to a single value by sequentially combining the data from left to right. Here we also need to get a single number, but at the same time it is necessary to follow the lines of heredity. And they form not a simple list, but a tree.

We consider this value for a particular person by combining these values of his ancestors. This can be done recursively. If we need a person, we need to calculate the required value for his parents, which in turn requires calculating it for its ancestors, etc.In theory, we will have to go around an infinite number of tree nodes, but since our data set is finite, we will need to stop somewhere. We will simply assign a default value to all people who are not on our list. It would be logical to assign them a zero value - people who are not on the list do not carry the DNA of the ancestor we need.

By accepting a person, a function to combine values from two ancestors and a default value, the reduceAncestors function “condenses” the value from the family tree.

 function reduceAncestors(person, f, defaultValue) { function valueFor(person) { if (person == null) return defaultValue; else return f(person, valueFor(byName[person.mother]), valueFor(byName[person.father])); } return valueFor(person); }

The internal function valueFor works with one person. Thanks to recursive magic, she can summon herself for the treatment of the father and mother of this person. The results along with the person object are passed to f, which calculates the desired value for this person.

Now we can use this to calculate the percentage of DNA that my grandfather shared with Pauwels van Haverbeke, and divide it into four.

 function sharedDNA(person, fromMother, fromFather) { if (person.name == "Pauwels van Haverbeke") return 1; else return (fromMother + fromFather) / 2; } var ph = byName["Philibert Haverbeke"]; console.log(reduceAncestors(ph, sharedDNA, 0) / 4); // → 0.00049

A man named Pouvels van Haverbeke apparently shares 100% of the DNA with Pauvels van Haverbeke (there are no complete names in the list), so the function returns 1 for him. All others share the average percentage of their parents.

Statistically, about 0.05% of my DNA coincides with my ancestor from the 16th century. This, of course, is an approximate number. This is quite small, but since our genetic material is about 3 billion base pairs, there is something in me from my ancestor.

One could count this number without using reduceAncestors. But the separation of the general approach (bypassing the tree) and the specific case (DNA counting) allows us to write more understandable code and use parts of the code again for other tasks. For example, the following code finds out the percentage of the known ancestors of this person who lived to 70 years.

 function countAncestors(person, test) { function combine(person, fromMother, fromFather) { var thisOneCounts = test(person); return fromMother + fromFather + (thisOneCounts ? 1 : 0); } return reduceAncestors(person, combine, 0); } function longLivingPercentage(person) { var all = countAncestors(person, function(person) { return true; }); var longLiving = countAncestors(person, function(person) { return (person.died - person.born) >= 70; }); return longLiving / all; } console.log(longLivingPercentage(byName["Emile Haverbeke"])); // → 0.145

No need to treat such calculations too seriously, since our set contains an arbitrary sample of people. But the code demonstrates that reduceAncestors is a useful part of a common vocabulary for working with a family tree type data structure.

Binding

The bind method, which all functions have, creates a new function that will call the original one, but with some fixed arguments.

The following example shows how this works. In it, we define the function isInSet, which tells whether there is a name for a person in a given set. To call filter, we can either write an expression with a function that calls isInSet, passing it a rowset as the first argument, or use the isInSet function partially.

 var theSet = ["Carel Haverbeke", "Maria van Brussel", "Donald Duck"]; function isInSet(set, person) { return set.indexOf(person.name) > -1; } console.log(ancestry.filter(function(person) { return isInSet(theSet, person); })); // → [{name: "Maria van Brussel", …}, // {name: "Carel Haverbeke", …}] console.log(ancestry.filter(isInSet.bind(null, theSet))); // → … same result

A bind call returns a function that calls isInSet with the first argument of theSet, and subsequent arguments are the same as were passed to bind.

The first argument, which is now set to null, is used for method calls - just as it was in apply. We will talk about it later.

Total

The ability to transfer function calls to other functions is not just a toy, but a very useful feature of JavaScript. We can write expressions “with spaces” in them, which will then be filled with the values returned by the functions.

Arrays have several useful higher-order methods — forEach to do something with each element, filter — to build a new array, where some values are filtered, map — to build a new array, each element of which is passed through a function, reduce - for a combination all array elements in one value.

Functions have a apply method to pass arguments to them as an array. They also have a bind method for creating a copy of a function with partial arguments.

Exercises

Convolution

reduce concat , .

 var arrays = [[1, 2, 3], [4, 5], [6]]; //    // → [1, 2, 3, 4, 5, 6]

, ( ). average, .

– , , . byName, .

 function average(array) { function plus(a, b) { return a + b; } return array.reduce(plus) / array.length; } var byName = {}; ancestry.forEach(function(person) { byName[person.name] = person; }); //    // → 31.2

, 90 . . . , , 100 : Math.ceil(person.died / 100).

 function average(array) { function plus(a, b) { return a + b; } return array.reduce(plus) / array.length; } //    // → 16: 43.5 // 17: 51.2 // 18: 52.8 // 19: 54.8 // 20: 84.7 // 21: 94

groupBy, . , , , .

Every some

every some. , , , true false. , && true, true, every true, true . , some true, true . , – , some true , .

every some, , , .

 //    console.log(every([NaN, NaN, NaN], isNaN)); // → true console.log(every([NaN, NaN, 4], isNaN)); // → false console.log(some([NaN, 3, 4], isNaN)); // → true console.log(some([2, 3, 4], isNaN)); // → false

Source: https://habr.com/ru/post/241155/

All Articles