Merge sort and AS3. Overtaking the native Vector.sort (Array.NUMERIC)

On the left is mergeSort, on the right is native sorting. PepperFlash and 75 million items.

To understand this sorting, you need to understand three principles:

You need to merge only ordered arrays.
An array of one element is considered ordered.
The result of the merge is an ordered array.

The merge procedure cuts the smallest of the first elements of the two input arrays and inserts it into the end of the resulting array until one of the input arrays is empty. After that, the remnants of a non-empty array are simply copied to the end of the resulting array.

To understand the algorithm well enough, you need to implement it.
I did this:

function mergeSort(array:Array):Array { var chunkSize:int = array.length / 2; if(chunkSize >= 1){ var left:Array = mergeSort(array.slice(0, chunkSize)); var right:Array = mergeSort(array.slice(chunkSize)); var result:Array = []; while(left.length && right.length){ if(left[0] < right[0]){ result.push(left.shift()); } else{ result.push(right.shift()); } } return result.concat(left, right); } return array; }

In fact, even such an implementation is faster than some of those found by me in all sorts of blogs.

Recursively, the first elements are sorted first. That is, 0 and 1, then 2 and 3, then 0..1 and 2..3 merge into one array of 0..3, and only after 4 and 5.
(the number before the opening bracket is a priority)

  6 {
     _3_ [
         1 (1, 2)
         2 (3, 4)
     ],
     five[
         _4_ (5, 6)
     ]
 }

But you can sort all subarrays from one element, and then merge them into subarrays of 4 elements.

  6 {
     _four_[
         1 (1, 2)
         2 (3, 4)
     ],
     five[
         _3_ (5, 6)
     ]
 }

This allows you to get rid of recursion.
')
The merge procedure can work directly with the sorted array. There is no need to transfer arrays to it when you can transfer the original array and coordinates of the subarrays to be merged. But it cannot write changes to the initial array at once (for example, if all elements of the second subarray are smaller than all elements of the first, then the procedure will spoil the first subarray), it needs a so-called a buffer to remember the changes without corrupting the input data. At the end of its work, the procedure copies the changes from the buffer to the original array.

Since the result of the merge size is equal to the sum of the sizes of the two input subarrays, you can write the result to the buffer, starting with the index of the first element of the first array. This allows you to select only one buffer, the size of the original array, and not to copy the result of the merge into the original array each time.

In short, we draw an owl:

Code

 function mergeSort(array:Vector.<int>):Vector.<int> { var length:uint = array.length; var buffer:Vector.<int> = new Vector.<int>(length, true); for (var leftChunkSize:uint = 1; leftChunkSize < length; leftChunkSize *= 2) { // low var bufferPointer:uint = 0; var leftChunkStart:uint = 0; for (; leftChunkStart < length; leftChunkStart += leftChunkSize * 2) { // mid var rightChunkStart:uint = Math.min(leftChunkStart + leftChunkSize, length); var rightChunkEnd:uint = Math.min(rightChunkStart + leftChunkSize, length); var leftChunkEnd:uint = rightChunkStart; var leftPointer:uint = leftChunkStart; var rightPointer:uint = rightChunkStart; while (leftPointer < leftChunkEnd && rightPointer < rightChunkEnd) { // hard if (array[leftPointer] < array[rightPointer]) { buffer[bufferPointer++] = array[leftPointer++]; }else { buffer[bufferPointer++] = array[rightPointer++]; } } while (leftPointer < leftChunkEnd) { // hard buffer[bufferPointer++] = array[leftPointer++]; } while (rightPointer < rightChunkEnd) { // hard buffer[bufferPointer++] = array[rightPointer++]; } } var temp:Vector.<int> = array; array = buffer; buffer = temp; } if (result !== array) { for (var i:uint = 0; i < length; i++ ) { result[i] = array[i]; } } return result; }

In the comments indicated bottlenecks.

PC configuration: DDR3-1066 2GB, P6200 (2x2.13GHz).
(“FP” read as “FlashPlayerDebugger 11.9.900.170”, release)

Work speed

	FP	Pepperflash
200,000 items
mergeSort	160ms	81ms
Native sorting	81ms	90ms
2,000,000 items
mergeSort	1800ms	921ms
Native sorting	1378ms	1425ms
20,000,000 items
mergeSort	21374ms	10267ms
Native sorting	20404ms	21175ms

Now let's optimize the bottlenecks:

Math.min is much slower than simple conditions.
The bit shift is faster than multiplication and division by 2.
You can still remember what && actually does, and get rid of it.

We get:

Code

 function mergeSort(array:Vector.<int>, buffer:Vector.<int> = null):Vector.<int> { var length:uint = array.length; var result:Vector.<int> = array; buffer = buffer ? buffer : new Vector.<int>(length, true); for (var chunkSize:uint = 1; chunkSize < length; chunkSize <<= 1) { var bufferPointer:uint = 0; var leftChunkStart:uint = 0; for (; leftChunkStart < length; leftChunkStart += chunkSize << 1) { var rightChunkStart:uint = leftChunkStart + chunkSize; var rightChunkEnd:uint = rightChunkStart + chunkSize; if (length < rightChunkEnd) { rightChunkEnd = length; if (length < rightChunkStart) { rightChunkStart = length; } } var leftChunkEnd:uint = rightChunkStart; var leftPointer:uint = leftChunkStart; var rightPointer:uint = rightChunkStart; while (leftPointer - leftChunkEnd & rightPointer - rightChunkEnd) { if (array[leftPointer] < array[rightPointer]) { buffer[bufferPointer++] = array[leftPointer++]; }else { buffer[bufferPointer++] = array[rightPointer++]; } } while (leftPointer < leftChunkEnd) { buffer[bufferPointer++] = array[leftPointer++]; } while (rightPointer < rightChunkEnd) { buffer[bufferPointer++] = array[rightPointer++]; } } var temp:Vector.<int> = array; array = buffer; buffer = temp; } if (result !== array) { for (var i:uint = 0; i < length; i++ ) { result[i] = array[i]; } } return result; }

10 ^ 8 elements such an implementation I have sorts in 47 seconds in PepperFlash.

Work speed

	FP	Pepperflash
200,000 items
mergeSort	130ms	64ms
Native sorting	81ms	90ms
2,000,000 items
mergeSort	1462ms	730ms
Native sorting	1378ms	1425ms
20,000,000 items
mergeSort	16937ms	8435ms
Native sorting	20404ms	21175ms
200 items and 1000 sorts
mergeSort	88ms	38ms
Native sorting	70ms	48ms
Array creation with random elements	36ms	10ms
2000 items and 10,000 sorts
mergeSort	10077ms	4809ms
Native sorting	6885ms	5912ms
Array creation with random elements	2178ms	760ms

in the case of fully sorted arrays, both sorts work faster (for example, with 2,000,000 items about 30% faster, both).
Native sorting requires approximately n * 1.5 additional memory, mergeSort - exactly n.

As a result, in FP there is little profit from these gestures. But in PepperFlash you can see the numbers almost two times less.

Source: https://habr.com/ru/post/207784/

All Articles

Merge sort and AS3. Overtaking the native Vector.sort (Array.NUMERIC)

More articles: