Measuring the performance of functions in JavaScript

Performance has always played a key role in software. On the web its importance is even greater, because users can easily switch to a competitor if your site is slow. Every professional web developer should keep this in mind. Many long-standing optimization techniques still apply today, such as minimizing the number of requests, using a CDN, and avoiding render-blocking code. But the more developers rely on JavaScript, the more important it becomes to optimize that code too.

You probably have some suspicions about the performance of functions you use often. Perhaps you have even worked out how to improve them. But how do you measure the gain? How can you test the performance of JavaScript functions accurately and quickly? The ideal option is to use the built-in performance.now() function and record the time before and after your function runs. Here we look at how to do that and at a number of pitfalls.

performance.now()

The High Resolution Time API offers a now() function that returns a DOMHighResTimeStamp. This is a floating-point number giving the time in milliseconds elapsed since the page's time origin, with a resolution of up to a thousandth of a millisecond. On its own this number has little value for us, but the difference between two such values tells us how much time has passed.
Besides being more precise than the built-in Date object, it is also monotonic. Put simply, it is not affected by corrections to the system time. By contrast, if you create two Date instances and subtract one from the other, the result is not always an accurate representation of how much time has passed.

In mathematical terms, a monotonic function only ever increases or only ever decreases. Another way to picture it is the switch to or from daylight saving time, when every clock in the country is moved an hour forward or back. If we compared two Date instances, one taken before and one after the clocks changed, we might get a difference of, say, "1 hour, 3 seconds and 123 milliseconds". With two performance.now() values the difference would be "3 seconds, 123 milliseconds and 456789 thousandths of a millisecond". We will not cover this API in detail here; those interested can refer to the article Discovering the High Resolution Time API.
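As a rough illustration, the two clocks can be read side by side. This is only a sketch: it assumes an environment where performance is available as a global (browsers and recent Node.js versions), and the helper name busyWait is hypothetical, used here just to burn a few milliseconds.

```javascript
// Hypothetical helper: spin for at least `ms` milliseconds.
function busyWait(ms) {
  var start = performance.now();
  while (performance.now() - start < ms) {
    // intentionally empty
  }
}

var wallStart = Date.now();         // wall-clock time, follows system clock corrections
var monoStart = performance.now();  // monotonic, relative to the time origin

busyWait(5);

var wallElapsed = Date.now() - wallStart;
var monoElapsed = performance.now() - monoStart;

console.log('Date.now() difference:', wallElapsed, 'ms');
console.log('performance.now() difference:', monoElapsed.toFixed(3), 'ms');
```

Under normal conditions both differences are close to each other; they diverge only if the system clock is adjusted between the two readings, and the Date-based value has whole-millisecond resolution at best.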

So now we know what the High Resolution Time API is and how to use it. Let's turn to some possible mistakes, but first let's write a function, makeHash(), that we will use for the rest of the article.

    function makeHash(source) {
      var hash = 0;
      if (source.length === 0) return hash;
      for (var i = 0; i < source.length; i++) {
        var char = source.charCodeAt(i);
        hash = ((hash << 5) - hash) + char;
        hash = hash & hash; // Convert to 32bit integer
      }
      return hash;
    }

The performance of such functions can be measured in the following way:

    var t0 = performance.now();
    var result = makeHash('Peter');
    var t1 = performance.now();
    console.log('Took', (t1 - t0).toFixed(4), 'milliseconds to generate:', result);

If you execute this code in a browser, the result will look like this:

 Took 0.2730 milliseconds to generate: 77005292

Demo: codepen.io/SitePoint/pen/YXmdNJ

Mistake number 1: accidentally measuring unimportant things

In the example above, you may have noticed that the only thing between the two performance.now() calls is the call to makeHash(), whose return value is assigned to the variable result. So we measure how long that function took to execute, and nothing more. You could also write the measurement like this:

    var t0 = performance.now();
    console.log(makeHash('Peter')); // Bad idea!
    var t1 = performance.now();
    console.log('Took', (t1 - t0).toFixed(4), 'milliseconds');

Demo: codepen.io/SitePoint/pen/PqMXWv

But in this case we would measure how long the call to makeHash('Peter') takes plus how long it takes to send the result to the console and print it. We do not know how long each of these operations takes, only their total duration. In addition, the speed of console output depends heavily on the browser and even on what else the browser is doing at the time. You may be thinking that console.log is simply unpredictably slow. Either way, it is a mistake to measure more than one function at a time, even when none of the functions involves any I/O. For example:

    var t0 = performance.now();
    var name = 'Peter';
    var result = makeHash(name.toLowerCase()).toString();
    var t1 = performance.now();
    console.log('Took', (t1 - t0).toFixed(4), 'milliseconds to generate:', result);

Again, we don't know which operation took the most time: assigning the variable, calling toLowerCase(), or calling toString().
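One way to find out is to time each step on its own. The sketch below assumes the makeHash() function defined earlier (repeated so the snippet runs by itself) and a hypothetical helper named timeStep, which is not part of any API.

```javascript
function makeHash(source) {
  var hash = 0;
  if (source.length === 0) return hash;
  for (var i = 0; i < source.length; i++) {
    var char = source.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // Convert to 32bit integer
  }
  return hash;
}

// Hypothetical helper: runs fn once and returns its result with the elapsed time.
function timeStep(label, fn) {
  var t0 = performance.now();
  var value = fn();
  var t1 = performance.now();
  return { label: label, value: value, ms: t1 - t0 };
}

var name = 'Peter';
var lowered = timeStep('toLowerCase', function () { return name.toLowerCase(); });
var hashed = timeStep('makeHash', function () { return makeHash(lowered.value); });
var stringified = timeStep('toString', function () { return hashed.value.toString(); });

// Logging happens only after all the measurements have been taken.
[lowered, hashed, stringified].forEach(function (step) {
  console.log(step.label, 'took', step.ms.toFixed(4), 'milliseconds');
});
```

Each figure also includes the cost of calling the small wrapper function, so for operations this cheap the numbers are indicative rather than exact.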

Mistake number 2: measuring only once

Many people take only one measurement, record the total time, and draw far-reaching conclusions. But the result can be different every time, because execution speed depends heavily on factors such as:

the time it takes to compile the code to bytecode (compiler warm-up time),
the main thread being busy with other tasks,
the computer's CPU being busy with something that slows down the whole browser.

Therefore, it is better to perform not one measurement, but several:

    var t0 = performance.now();
    for (var i = 0; i < 10; i++) {
      makeHash('Peter');
    }
    var t1 = performance.now();
    console.log('Took', ((t1 - t0) / 10).toFixed(4), 'milliseconds to generate');

Demo: codepen.io/SitePoint/pen/Qbezpj

The risk with this approach is that the browser's JavaScript engine may apply optimizations of its own: when the function is called a second time with the same input, the engine can remember the earlier result and simply reuse it. To get around this, you can use many different input strings instead of passing the same value over and over. However, with different inputs the execution time of the function may itself vary from call to call.
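A sketch of the varied-input idea, assuming the makeHash() function from above (repeated so the snippet is self-contained). The list of names is made up purely for illustration.

```javascript
function makeHash(source) {
  var hash = 0;
  if (source.length === 0) return hash;
  for (var i = 0; i < source.length; i++) {
    var char = source.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // Convert to 32bit integer
  }
  return hash;
}

// Illustrative inputs: ten different strings instead of ten copies of 'Peter'.
var inputs = ['Peter', 'Paul', 'Mary', 'John', 'George',
              'Ringo', 'Alice', 'Bob', 'Carol', 'Dave'];

var t0 = performance.now();
for (var i = 0; i < inputs.length; i++) {
  makeHash(inputs[i]);
}
var t1 = performance.now();

var averageMs = (t1 - t0) / inputs.length;
console.log('Took', averageMs.toFixed(4), 'milliseconds per call on average');
```

Because the strings differ in length, some calls do slightly more work than others, which is exactly the variation the paragraph above warns about.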

Mistake number 3: relying too much on the average

So, it is advisable to take a series of measurements to assess a function's performance more accurately. But how do we determine the performance of a function if it runs at different speeds for different inputs? Let's first experiment and measure the execution time ten times with the same input. The results will look something like this:

    Took 0.2730 milliseconds to generate: 77005292
    Took 0.0234 milliseconds to generate: 77005292
    Took 0.0200 milliseconds to generate: 77005292
    Took 0.0281 milliseconds to generate: 77005292
    Took 0.0162 milliseconds to generate: 77005292
    Took 0.0245 milliseconds to generate: 77005292
    Took 0.0677 milliseconds to generate: 77005292
    Took 0.0289 milliseconds to generate: 77005292
    Took 0.0240 milliseconds to generate: 77005292
    Took 0.0311 milliseconds to generate: 77005292

Notice how much the very first value differs from the rest. Most likely the cause is the engine optimization mentioned above and the need to "warm up" the compiler. There is little you can do to avoid this, but you can protect yourself from drawing the wrong conclusions.

For example, you can discard the first value and calculate the arithmetic mean of the remaining nine. But it is better to take all the results and calculate the median: sort the results and pick the middle one. This is where performance.now() is very useful, because it gives you a number you can do anything with.

So let's measure again, but this time we use the median value of the sample:

    var numbers = [];
    for (var i = 0; i < 10; i++) {
      var t0 = performance.now();
      makeHash('Peter');
      var t1 = performance.now();
      numbers.push(t1 - t0);
    }

    function median(sequence) {
      // Sort numerically; the default sort() compares values as strings.
      sequence.sort(function(a, b) { return a - b; });
      return sequence[Math.floor(sequence.length / 2)];
    }

    console.log('Median time', median(numbers).toFixed(4), 'milliseconds');
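To see why the median is the more robust choice, the ten timings listed earlier can be fed to both statistics. This is a worked example rather than the article's own code: the function names mean and medianOf are hypothetical, and medianOf averages the two middle values of an even-length sample, which is one common convention.

```javascript
// Timings in milliseconds, copied from the ten runs listed earlier.
var timings = [0.2730, 0.0234, 0.0200, 0.0281, 0.0162,
               0.0245, 0.0677, 0.0289, 0.0240, 0.0311];

function mean(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) sum += values[i];
  return sum / values.length;
}

function medianOf(values) {
  // Copy before sorting so the caller's array is left untouched.
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var middle = Math.floor(sorted.length / 2);
  if (sorted.length % 2 === 1) return sorted[middle];
  return (sorted[middle - 1] + sorted[middle]) / 2;
}

console.log('Mean:  ', mean(timings).toFixed(4));     // 0.0537
console.log('Median:', medianOf(timings).toFixed(4)); // 0.0263
```

The single slow first run pulls the mean up to roughly twice the median, even though nine of the ten runs took about 0.07 ms or less.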

Mistake number 4: comparing functions in a predictable order

Now we know that it is always better to take several measurements and average them. Moreover, the last example suggests that, ideally, you should take the median rather than the mean.

Measuring execution time is a good way to choose the faster of several functions. Suppose we have two functions that take the same input and produce the same results but work differently internally. Let's say we need a function that returns true or false depending on whether a particular string is found in an array, ignoring case. Because of the case-insensitivity, we cannot use Array.prototype.indexOf.

    function isIn(haystack, needle) {
      var found = false;
      haystack.forEach(function(element) {
        if (element.toLowerCase() === needle.toLowerCase()) {
          found = true;
        }
      });
      return found;
    }

    console.log(isIn(['a','b','c'], 'B')); // true
    console.log(isIn(['a','b','c'], 'd')); // false

This code can be improved, since the haystack.forEach loop iterates through all the elements even if we find a match early on. Let's use a good old for loop:

    function isIn(haystack, needle) {
      for (var i = 0, len = haystack.length; i < len; i++) {
        if (haystack[i].toLowerCase() === needle.toLowerCase()) {
          return true;
        }
      }
      return false;
    }

    console.log(isIn(['a','b','c'], 'B')); // true
    console.log(isIn(['a','b','c'], 'd')); // false

Now let's see which version is faster. We run each function repeatedly, once for each of the 26 letters of the alphabet, and take the median of the timings:

    function isIn1(haystack, needle) {
      var found = false;
      haystack.forEach(function(element) {
        if (element.toLowerCase() === needle.toLowerCase()) {
          found = true;
        }
      });
      return found;
    }

    function isIn2(haystack, needle) {
      for (var i = 0, len = haystack.length; i < len; i++) {
        if (haystack[i].toLowerCase() === needle.toLowerCase()) {
          return true;
        }
      }
      return false;
    }

    console.log(isIn1(['a','b','c'], 'B')); // true
    console.log(isIn1(['a','b','c'], 'd')); // false
    console.log(isIn2(['a','b','c'], 'B')); // true
    console.log(isIn2(['a','b','c'], 'd')); // false

    function median(sequence) {
      // Sort numerically; the default sort() compares values as strings.
      sequence.sort(function(a, b) { return a - b; });
      return sequence[Math.floor(sequence.length / 2)];
    }

    function measureFunction(func) {
      var letters = 'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z'.split(',');
      var numbers = [];
      for (var i = 0; i < letters.length; i++) {
        var t0 = performance.now();
        func(letters, letters[i]);
        var t1 = performance.now();
        numbers.push(t1 - t0);
      }
      console.log(func.name, 'took', median(numbers).toFixed(4));
    }

    measureFunction(isIn1);
    measureFunction(isIn2);

We get the following result:

    true
    false
    true
    false
    isIn1 took 0.0050
    isIn2 took 0.0150

Demo: codepen.io/SitePoint/pen/YXmdZJ

What does this mean? The first function was three times faster. That just can't be right! The explanation is simple but not obvious. The first function, which uses haystack.forEach, benefits from low-level optimizations in the browser's JavaScript engine that are not applied to code that walks the array by index. So unless you measure, you will never know!
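The section heading points at the order of the runs: in the test above, isIn1 is always measured in full before isIn2. One way to take order out of the picture, offered here as an assumption rather than as the author's method, is to interleave the two functions and pick a fresh random order in every round. The helper name shuffled is hypothetical.

```javascript
function isIn1(haystack, needle) {
  var found = false;
  haystack.forEach(function (element) {
    if (element.toLowerCase() === needle.toLowerCase()) found = true;
  });
  return found;
}

function isIn2(haystack, needle) {
  for (var i = 0, len = haystack.length; i < len; i++) {
    if (haystack[i].toLowerCase() === needle.toLowerCase()) return true;
  }
  return false;
}

function median(values) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  return sorted[Math.floor(sorted.length / 2)];
}

// Hypothetical helper: Fisher-Yates shuffle that returns a new array.
function shuffled(items) {
  var copy = items.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}

var letters = 'abcdefghijklmnopqrstuvwxyz'.split('');
var candidates = [isIn1, isIn2];
var timings = { isIn1: [], isIn2: [] };

for (var round = 0; round < letters.length; round++) {
  // A fresh random order each round, so neither function always runs first.
  shuffled(candidates).forEach(function (func) {
    var t0 = performance.now();
    func(letters, letters[round]);
    var t1 = performance.now();
    timings[func.name].push(t1 - t0);
  });
}

Object.keys(timings).forEach(function (name) {
  console.log(name, 'median', median(timings[name]).toFixed(4));
});
```

If the ranking of the two functions changes once the order is randomized, the original difference was at least partly an artifact of which function ran first.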

Conclusions

In trying to demonstrate how to measure performance accurately in JavaScript with performance.now(), we found that our intuition can fail us: the empirical data did not match our assumptions. If you want to write fast web applications, the JavaScript code must be optimized. And since computers are (almost) living beings, they can still be unpredictable and surprise us. So the most reliable way to make your code faster is to measure and compare.

Another reason why we cannot know in advance which option will be faster is that it all depends on the situation. In the last example, we searched for a match among 26 values regardless of case. But if we search among 100,000 values, the choice of the function may be different.
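A sketch of how the same comparison could be repeated at both sizes. The helpers buildHaystack and medianTime are hypothetical names, the generated strings are arbitrary, and the needle is placed last so that both functions have to scan the whole array.

```javascript
function isIn1(haystack, needle) {
  var found = false;
  haystack.forEach(function (element) {
    if (element.toLowerCase() === needle.toLowerCase()) found = true;
  });
  return found;
}

function isIn2(haystack, needle) {
  for (var i = 0, len = haystack.length; i < len; i++) {
    if (haystack[i].toLowerCase() === needle.toLowerCase()) return true;
  }
  return false;
}

// Hypothetical helper: builds an array of `size` distinct lowercase strings.
function buildHaystack(size) {
  var items = [];
  for (var i = 0; i < size; i++) items.push('item' + i);
  return items;
}

// Hypothetical helper: median of `runs` timings of func(haystack, needle).
function medianTime(func, haystack, needle, runs) {
  var samples = [];
  for (var i = 0; i < runs; i++) {
    var t0 = performance.now();
    func(haystack, needle);
    samples.push(performance.now() - t0);
  }
  samples.sort(function (a, b) { return a - b; });
  return samples[Math.floor(samples.length / 2)];
}

[26, 100000].forEach(function (size) {
  var haystack = buildHaystack(size);
  var needle = 'ITEM' + (size - 1); // matches the last element, ignoring case
  console.log(size, 'items:',
    'isIn1', medianTime(isIn1, haystack, needle, 11).toFixed(4),
    'isIn2', medianTime(isIn2, haystack, needle, 11).toFixed(4));
});
```

No expected numbers are given here on purpose: the outcome depends on the engine, the machine, and the position of the match.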

The mistakes covered here are not the only possible ones. Others include measuring unrealistic scenarios or measuring on only one JavaScript engine. But the key point to remember is this: if you want to build fast web applications, you will not find a better tool than performance.now(). That said, measuring execution time is only one aspect. Performance is also affected by memory usage and code complexity.

Source: https://habr.com/ru/post/272087/
