
Friday JS: random shuffling

An exam at the warrant officers' school.

- Look here. This is the thumb, this is the index finger, this is the middle finger, this is the ring finger, this is the little finger. Now we shuffle them, shuffle them, shuffle them (wiggles his fingers)... So, where is which now?

Hello. From an orthodox point of view, today is not a real Friday, merely a day that happens to be followed by a day off. So the article in my traditional category will also be not quite real: it has a lower degree of madness and a higher degree of usefulness. But enough prefaces, let's get to the point.



My students regularly run into the task of shuffling an array randomly. To solve it they usually turn to Google, and Google tells them the following:



var shuffledArr = arr.sort(function(){ return Math.random() - 0.5; }); 


From here on we will call this method random sorting. Today I decided to write about the advantages and disadvantages of this approach.



How does this even work?



The sort method of JavaScript arrays takes a comparator function as an argument. This function receives two elements of the array and returns a number. If the number is positive, the sorting algorithm considers the first element greater; if it is negative, the first element is smaller; if the comparator returns zero, the elements are equal for the purposes of this sort. If, in the guise of a comparator, we pass a function that returns a positive or negative number at random, the array will be sorted into a "random" order.
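
To make the comparator contract concrete, here is a minimal sketch (the variable names are mine, not from the original):

 // a consistent comparator: ascending numeric sort
 var numbers = [3, 1, 2];
 numbers.sort(function(a, b){ return a - b; }); // [1, 2, 3]

 // the "random" comparator ignores its arguments entirely,
 // so the same pair may compare differently on every call
 [1, 2, 3, 4, 5].sort(function(){ return Math.random() - 0.5; }); // some permutation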


Benefits



Such a shuffle is very quick to write. I honestly tried to come up with some other advantage, but failed.



Disadvantages



This section will be somewhat longer, so I will break it into subsections.



Specification Mismatch



The ECMAScript specification (I deliberately do not name the version, since this clause stays roughly the same in all of them) tells us the following:

If comparefn is not undefined and is not a consistent comparison function for the elements of this array, the sort order is implementation-defined.


Translated from technical English into plain language, this means that if the comparator does not satisfy certain reasonable requirements (the specification spells them out further down; in particular, it must always return the same value for the same arguments), then the sort order depends on the implementation of the particular JavaScript engine. That is, in effect, it is not defined. A browser developer has every right, for example, to detect that the comparator is not consistent and return the original array without any permutation, and that would fully comply with the specification. The fact that in all existing browsers the method above produces something resembling a random shuffle is nothing more than a happy coincidence.



Time complexity



The correct shuffling algorithm (see below) has time complexity O(n). Roughly speaking, this means that if the length of the array grows 10 times, the time needed to shuffle it also grows about 10 times.



The fastest sorting algorithms have time complexity O(n * log(n)). This means that if the length of the array grows 10 times, the time needed to sort it grows by more than 10 times.



Taken together, these two facts mean the following: for a sufficiently large array, shuffling by random sorting will be slower than the "correct" shuffle (even if it was not slower on small arrays). And the bigger the array, the bigger the difference in running time.



Why the reservation in brackets? Because Array#sort is executed by native code and because of that may well turn out to be faster on small arrays. It may have a smaller constant factor, as a person familiar with big-O notation would put it.
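
As a rough illustration (not a rigorous benchmark), one can time both approaches on a large array. The helper names and the array size below are my own choices, and the Fisher-Yates shuffle used here is the one described later in the article:

 function shuffleBySort(arr){
   return arr.sort(function(){ return Math.random() - 0.5; });
 }

 function shuffleFisherYates(arr){
   for(var i = arr.length - 1; i > 0; i--){
     var j = Math.floor(Math.random() * (i + 1));
     var temp = arr[j]; arr[j] = arr[i]; arr[i] = temp;
   }
   return arr;
 }

 // build a large test array
 var big = [];
 for(var k = 0; k < 1000000; k++){ big.push(k); }

 console.time('random sorting');
 shuffleBySort(big.slice());
 console.timeEnd('random sorting');

 console.time('Fisher-Yates');
 shuffleFisherYates(big.slice());
 console.timeEnd('Fisher-Yates');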



Unnatural randomness



Anyone who is even superficially familiar with probability theory knows that randomness comes in different kinds. A coin can land heads or tails; a die can roll a six or something other than a six. Both are random events, but in the first case the outcomes are equally probable, and in the second they are not.



A shuffle of an array is called truly random if all possible permutations of its elements are equally probable. Random sorting does not have this property, and I will show it in practice.



I threw together a small demo page. It contains two diagrams: one corresponds to the correct random shuffle, the other to random sorting. Right now, though, what you see are not diagrams but small squares divided into cells of various shades of gray. To turn them into diagrams we need a legend, an explanation of what the cells and their colors mean. It is very simple. Several times (here, several = 10000) we take an array of numbers from 0 to n (here n = 32) and shuffle it with one algorithm or the other. Then we count how often each number ends up in each position. Accordingly, the color of the cell in row i and column j shows how often the number i lands at position j. Black means it never lands there; white means it lands there twice as often as it should, or more. If a number lands at a given position with the theoretically predicted frequency of 1/n, the cell has the color hsl(0, 0%, 50%), a shade of gray exactly halfway between black and white.
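
For those who prefer code to legends, here is a sketch of how such a frequency matrix can be computed (the function names are mine; shuffleFn stands for whichever algorithm is being examined):

 // counts[i][j] - how many times the number i ended up at position j
 function frequencyMatrix(n, runs, shuffleFn){
   var counts = [];
   for(var i = 0; i < n; i++){
     counts.push([]);
     for(var j = 0; j < n; j++){ counts[i].push(0); }
   }
   for(var r = 0; r < runs; r++){
     var arr = [];
     for(var k = 0; k < n; k++){ arr.push(k); }
     shuffleFn(arr);
     for(var p = 0; p < n; p++){
       counts[arr[p]][p]++;
     }
   }
   // divide each count by runs and compare with 1/n to pick the cell color
   return counts;
 }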



If you are using the latest version of Chrome, you can see that the square on the right contains a lot of white or almost-white cells arranged in a certain pattern. This means that with random sorting, certain elements tend to end up in certain positions. Is that bad? It depends on what you are shuffling for. If it is for some cosmetic effect, there is probably nothing to worry about. If it matters that the user cannot predict which element ends up where, or if the patterns in the shuffle are somehow visible to the eye, then it is bad. And Hermes forbid you use such a shuffle for anything related to cryptography.



Surprisingly, if you use Firefox, both squares look about equally gray to you. This happens because different browsers use different sorting algorithms (if you are curious, here is my article on that topic). In that case, if you want to be surprised once more, add ?size=8 to the address bar (here is a ready-made link for the lazy). Firefox sorts large and small arrays differently, surprise!



Upd: comrade mk2 noticed that the uniformly gray square in Firefox appears only when the size is a power of two. We need more such attentive comrades, comrades!



Upd2: comrades Stalker_RED and yaZva were not too lazy (unlike me) to take screenshots in various browsers.



Finally, let me add that these fifty shades of gray in my diagrams are not a criterion but only an indicator. If the square turns out not to be gray, it follows that the shuffle is not uniform; the converse, however, is not true. A counterexample is a cyclic shift by a random offset: every element lands at every position with exactly equal frequency, yet, of course, there can be no talk of a truly random shuffle.
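
A minimal sketch of that counterexample, with a name of my own choosing: each element lands at each position with probability 1/n, yet only n of the n! possible permutations can ever be produced.

 function randomCyclicShift(arr){
   // shift the whole array by a random offset, wrapping around
   var offset = Math.floor(Math.random() * arr.length);
   return arr.slice(offset).concat(arr.slice(0, offset));
 }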



Upd3: this thread discusses why random sorting does not produce uniform randomness in Firefox either (even though the diagram makes it look as if it does).



How to do it right?



True Jedi use one of the variations of the Fisher-Yates algorithm. For my demo I implemented it like this:



 function shuffle(arr){
   var j, temp;
   for(var i = arr.length - 1; i > 0; i--){
     // pick a random index from 0 to i inclusive
     j = Math.floor(Math.random()*(i + 1));
     // swap the elements at positions i and j
     temp = arr[j];
     arr[j] = arr[i];
     arr[i] = temp;
   }
   return arr;
 }


The essence of the algorithm, translated from JS into human language, is this: we take the last element and swap it with a randomly chosen element to its left (possibly with itself). Then we repeat the same operation for the second-to-last element, then for the third-to-last, and so on. Voila! (that word is the translation of "return arr;" from JS into human language).
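
A small usage sketch, assuming the shuffle function above (note that it shuffles the array in place and also returns it):

 var deck = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
 shuffle(deck);
 console.log(deck); // e.g. [7, 2, 10, 4, 1, 9, 3, 6, 8, 5] - every permutation equally likely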



Time for madness



Some readers have been waiting the whole article for this part; everyone else can, in principle, skip this section. I asked myself: is it possible to write a compare function such that arr.sort(compare) produces a truly random permutation? The answer: it is possible, but with certain reservations. First, the function must be created anew before each sort. Second, the array must not contain equal elements. Behold:



 // put an element into a random position of the cache
 // (if it is not there already)
 function putToCache(elem, cache){
   if(cache.indexOf(elem) != -1){
     return;
   }
   var i = Math.floor(Math.random()*(cache.length + 1));
   cache.splice(i, 0, elem);
 }

 // creates a fresh comparator that builds up a random order as it goes
 function madness(){
   var cache = [];
   return function(a, b){
     putToCache(a, cache);
     putToCache(b, cache);
     return cache.indexOf(b) - cache.indexOf(a);
   }
 }

 // shuffle by "random" sorting
 function shuffle(arr){
   var compare = madness();
   return arr.sort(compare);
 }


It works as follows: through the closure, the comparator has access to the cache array. Every time a pair of arguments is passed to it, it puts them into cache at random positions (if they are not there already), and then considers that whichever element sits further to the right in cache is the greater one. In effect, the cache array gradually accumulates the random order in which the elements should stand, and the sort method gradually brings the original array into line with that order. If the array does happen to contain equal elements (equal in the sense of the === operator; if we are sorting objects, everything is fine even when their contents coincide: {a: 1} !== {a: 1}), then, unfortunately, they will end up next to each other.
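
A usage sketch under the reservations above (the comparator is recreated inside shuffle on every call, and duplicates in the === sense simply end up adjacent):

 console.log(shuffle([1, 2, 3, 4, 5]));  // a uniformly random permutation
 console.log(shuffle(['a', 'b', 'a']));  // the two equal 'a' strings come out next to each other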



That's all. Thanks for reading, I hope it was informative, and I especially hope I have convinced you to almost never use random sorting. Have a nice evening ahead.

Source: https://habr.com/ru/post/358094/


