Some time ago a colleague asked me for help with a problem. I solved it for him, but it also struck me that this problem is a good vehicle for explaining several algorithms and programming techniques — and for showing how the algorithm's execution time can be brought down from 25 seconds to 40 ms.
For a personal project, my colleague needed an algorithm that, for a given video, finds the fifty most similar videos. Similarity is assessed by the number of matching tags: the more tags two videos share, the more similar they are. From this you can immediately draw several conclusions:
It turns out that it is enough to work only with groups of tags. In the first version my colleague stored tags like this: each video holds a reference to a tag-group ID, and a group itself is a sequence of boolean values indicating whether the corresponding tag is set. In C#, a tag group looks like this:
```csharp
public class TagsGroup
{
    bool[] InnerTags { get; }
}
```
My colleague assumed he would have no more than a million videos on the site and no more than 4000 distinct tags; for a round number we can take 4096 = 2^12. The TagsGroup class can then be represented as follows:
```csharp
public class TagsGroup
{
    public const int TagsGroupLength = 4096;

    bool[] InnerTags { get; }

    public TagsGroup(bool[] innerTags)
    {
        if (innerTags == null)
            throw new ArgumentException(nameof(innerTags));
        if (innerTags.Length != TagsGroupLength)
        {
            throw new ArgumentException(nameof(innerTags));
        }
        InnerTags = innerTags;
    }
}
```
Now we need to check two tag groups for similarity. Under the current conditions this boils down to counting the positions where the InnerTags arrays of both groups hold true:
```csharp
public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
{
    int result = 0;
    for (int i = 0; i < TagsGroupLength; i++)
    {
        if (a.InnerTags[i] && a.InnerTags[i] == b.InnerTags[i])
            result++;
    }
    return result;
}
```
Now it remains to compute the similarity of the query tag group against every existing group and select the fifty most similar. I imposed one more condition on myself to make the selection stable: the final fifty are the groups with the greatest MeasureSimilarity result, and among groups with equal MeasureSimilarity, the one with the smaller index in the original collection comes first. You can read more about stable sorting here, for example: https://ru.wikipedia.org/wiki/Stable_sort .
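This tie-breaking requirement can be pinned down with a tiny self-contained sketch over plain similarity scores (not from the article; LINQ's OrderBy is stable on its own, but the ThenBy on the index makes the rule explicit anyway):

```csharp
using System.Linq;

// Self-contained sketch of the requirement: take the k highest scores;
// on ties, the smaller original index wins.
static int[] TopKIndices(int[] similarities, int k)
{
    return similarities
        .Select((similarity, index) => (similarity, index))
        .OrderByDescending(x => x.similarity)  // LINQ's OrderBy is stable
        .ThenBy(x => x.index)                  // make the tie-break rule explicit
        .Take(k)
        .Select(x => x.index)
        .ToArray();
}
```

This is the specification; the rest of the article is about computing the same answer much faster.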
To solve this problem, I decided to write a SimilarTagsCalculator class; here is its code:
```csharp
class SimilarTagsCalculator
{
    TagsGroup[] Groups { get; }

    public SimilarTagsCalculator(TagsGroup[] groups)
    {
        if (groups == null)
            throw new ArgumentException(nameof(groups));
        Groups = groups;
    }

    public TagsGroup[] GetFiftyMostSimilarGroups(TagsGroup value)
    {
        const int resultLength = 50;
        // sorted list of the best candidates found so far
        var list = new List<TagsSimilarityInfo>(resultLength);
        for (int groupIndex = 0; groupIndex < Groups.Length; groupIndex++)
        {
            TagsGroup tagsGroup = Groups[groupIndex];
            // measure similarity against the query group
            int similarityValue = TagsGroup.MeasureSimilarity(value, tagsGroup);
            TagsSimilarityInfo newInfo = new TagsSimilarityInfo(groupIndex, similarityValue);
            // if the list is full and the candidate is worse than the worst kept element, skip it
            if (list.Count == resultLength && list[resultLength - 1].CompareTo(newInfo) == -1)
            {
                continue;
            }
            // find the insertion position and insert the candidate
            int index = ~list.BinarySearch(newInfo);
            list.Insert(index, newInfo);
            // trim the list back to fifty elements;
            // the worst candidate is always the last one
            if (list.Count > resultLength)
            {
                list.RemoveAt(resultLength);
            }
        }
        // map the surviving candidates back to their tag groups
        TagsGroup[] result = new TagsGroup[resultLength];
        for (int i = 0; i < resultLength; i++)
        {
            result[i] = Groups[list[i].Index];
        }
        return result;
    }
}
```
and the TagsSimilarityInfo structure:
```csharp
struct TagsSimilarityInfo : IComparable<TagsSimilarityInfo>, IComparable
{
    public int Index { get; }

    public int Similarity { get; }

    public TagsSimilarityInfo(int index, int similarity)
    {
        Index = index;
        Similarity = similarity;
    }

    public bool Equals(TagsSimilarityInfo other)
    {
        return Index == other.Index && Similarity == other.Similarity;
    }

    public override bool Equals(object obj)
    {
        return obj is TagsSimilarityInfo other && Equals(other);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (Index * 397) ^ Similarity;
        }
    }

    public int CompareTo(TagsSimilarityInfo other)
    {
        int similarityComparison = other.Similarity.CompareTo(Similarity);
        return similarityComparison != 0 ? similarityComparison : Index.CompareTo(other.Index);
    }

    public int CompareTo(object obj)
    {
        if (ReferenceEquals(null, obj)) return 1;
        return obj is TagsSimilarityInfo other
            ? CompareTo(other)
            : throw new ArgumentException($"Object must be of type {nameof(TagsSimilarityInfo)}");
    }
}
```
I prepared three benchmarks for this algorithm: RandomTest over randomly filled tag groups, plus AscendantTest and DescendantTest over sequentially built groups.
Here are the benchmark results for a million groups:
BenchmarkDotNet = v0.11.5, OS = Windows 10.0.17134.765 (1803 / April2018Update / Redstone4)
Intel Core i7-6700 CPU 3.40GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
Frequency = 3328126 Hz, Resolution = 300.4694 ns, Timer = TSC
.NET Core SDK = 3.0.100-preview5-011568
[Host]: .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 25.054 s | 0.1786 s | 0.1670 s | 1.53 KB |
AscendantTest | 4.180 s | 0.0174 s | 0.0162 s | 1.53 KB |
DescendantTest | 4.147 s | 0.0118 s | 0.0104 s | 1.53 KB |
The spread in execution time is very large, and 25 seconds is far too long — my colleague isn't willing to wait that much. So let's optimize. There are three main directions for speeding up the program:

- the MeasureSimilarity method;
- the data structure used to keep the best fifty candidates in GetFiftyMostSimilarGroups;
- the loop in GetFiftyMostSimilarGroups itself.

We will consider each of the three directions in turn.
Consider first the MeasureSimilarity method:
```csharp
public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
{
    int result = 0;
    for (int i = 0; i < TagsGroupLength; i++)
    {
        if (a.InnerTags[i] && a.InnerTags[i] == b.InnerTags[i])
            result++;
    }
    return result;
}
```
In the previous benchmark there is a very large gap between the random test and either of the sequential ones. The tag groups for the sequential tests were built so that each group consists of two contiguous runs: set tags followed by unset ones. Under these conditions MeasureSimilarity has all the prerequisites for CPU branch prediction to have a significant effect. To check this, it is enough to write a benchmark that compares the running time of MeasureSimilarity on random data versus sequential data:
```csharp
int GetSimilaritySum(TagsGroup[] tagsGroups)
{
    int result = 0;
    foreach (TagsGroup tagsGroup in tagsGroups)
    {
        result += TagsGroup.MeasureSimilarity(tagsGroup, etalon);
    }
    return result;
}

[Benchmark]
public int Sorted() => GetSimilaritySum(sortedGroups);

[Benchmark]
public int Unsorted() => GetSimilaritySum(unsortedGroups);
```
Method | Mean | Error | StdDev |
---|---|---|---|
Sorted | 3.704 s | 0.0411 s | 0.0364 s |
Unsorted | 8.211 s | 0.0381 s | 0.0338 s |
A million tag groups were tested in each case, but in Sorted every group had a run of set tags at the beginning and none after it, while in Unsorted the same number of tags were scattered randomly across the group.
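The article doesn't show the generator code, but from the description above it could be sketched like this (the number of set tags per group and the group length are assumed parameters, not taken from the article):

```csharp
using System;

// Sketch of the test-data generation described above.
static bool[] MakeSortedTags(int setTagsCount, int length)
{
    var tags = new bool[length];
    for (int i = 0; i < setTagsCount; i++)
        tags[i] = true;               // a contiguous run of set tags at the start
    return tags;
}

static bool[] MakeUnsortedTags(int setTagsCount, int length, Random random)
{
    var tags = new bool[length];
    int set = 0;
    while (set < setTagsCount)        // the same number of tags, scattered randomly
    {
        int pos = random.Next(length);
        if (!tags[pos]) { tags[pos] = true; set++; }
    }
    return tags;
}
```

Both variants produce groups with identical tag counts, so the only difference the benchmark sees is the layout of the set bits.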
The 5-second difference is impressive, and something needs to be done about it. To remove the influence of branch prediction and generally speed the method up, we need to get rid of the branches themselves. There is only one branch in MeasureSimilarity: the check that the corresponding tags are set in both groups. Let's work out when the condition is true by writing its truth table:
a.InnerTags[i] | b.InnerTags[i] | Result |
---|---|---|
False | False | False |
False | True | False |
True | False | False |
True | True | True |
The truth table coincides exactly with logical AND: the result is true if and only if both tags are true, so the condition can be reduced to if (a.InnerTags[i] && b.InnerTags[i]). But the branch is still there. As the next step, let's make the addition to result unconditional by rewriting the loop body like this:
```csharp
int t = a.InnerTags[i] && b.InnerTags[i] ? 1 : 0;
result += t;
```
We still haven't gotten rid of the condition, and in fact we even made the method slower. But it has now become obvious that if the type of InnerTags is changed from bool to byte (1 for true, 0 for false), the condition in the ternary operator can be eliminated. The TagsGroup class will then look like this:
```csharp
public class TagsGroup
{
    public const int TagsGroupLength = 4096;

    byte[] InnerTags { get; }

    public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
    {
        int result = 0;
        for (int i = 0; i < TagsGroupLength; i++)
        {
            int t = a.InnerTags[i] & b.InnerTags[i];
            result += t;
        }
        return result;
    }

    public TagsGroup(bool[] innerTags)
    {
        if (innerTags == null)
            throw new ArgumentException(nameof(innerTags));
        if (innerTags.Length != TagsGroupLength)
        {
            throw new ArgumentException(nameof(innerTags));
        }
        InnerTags = new byte[TagsGroupLength];
        for (int i = 0; i < TagsGroupLength; i++)
        {
            InnerTags[i] = (byte) (innerTags[i] ? 1 : 0);
        }
    }
}
```
Here are the benchmark results for the updated MeasureSimilarity:
Method | Mean | Error | StdDev |
---|---|---|---|
Sorted | 3.180 s | 0.0118 s | 0.0111 s |
Unsorted | 3.236 s | 0.0622 s | 0.0764 s |
For comparison, the previous results:

Method | Mean | Error | StdDev |
---|---|---|---|
Sorted | 3.704 s | 0.0411 s | 0.0364 s |
Unsorted | 8.211 s | 0.0381 s | 0.0338 s |
and for the updated main benchmark:
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 3.219 s | 0.0492 s | 0.0436 s | 1.53 KB |
AscendantTest | 3.223 s | 0.0117 s | 0.0110 s | 1.53 KB |
DescendantTest | 3.422 s | 0.0697 s | 0.0999 s | 1.53 KB |
For comparison, the previous results:

Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 25.054 s | 0.1786 s | 0.1670 s | 1.53 KB |
AscendantTest | 4.180 s | 0.0174 s | 0.0162 s | 1.53 KB |
DescendantTest | 4.147 s | 0.0118 s | 0.0104 s | 1.53 KB |
In my opinion, this is great. For those who are sure that all of the speedup came merely from replacing bool with byte, I ran a benchmark with this loop body:
```csharp
int t = a.InnerTags[i] & b.InnerTags[i];
if (t == 1)
    result += t;
```
and here are the results I got:
Method | Mean | Error | StdDev |
---|---|---|---|
Sorted | 3.760 s | 0.0746 s | 0.1541 s |
Unsorted | 8.628 s | 0.1699 s | 0.2382 s |
Each group contains many tags, and their number cannot be reduced. Moreover, tags with identical indices must be compared, and a final answer cannot be given without checking every tag. So in any case we will have to iterate over the whole group. It would be nice to parallelize this work somehow, so that several tags are conditionally handled in one operation. This can be done through real parallelism or through special data packing, which is what we will use. Each tag is now 1 or 0, and result simply accumulates the outcome of an AND operation. But the AND operation can be applied to more than single-bit numbers: C# lets us do it without any problems on numbers up to 64 bits (more is possible with BitArray, but that is not the same). If two tag groups are represented as sets of 64-bit numbers with the corresponding bits set, we can apply AND to each such pair of 64-bit numbers. The question is what to do with the result. Let's look at the loop body again:
```csharp
int t = a.InnerTags[i] & b.InnerTags[i];
result += t;
```
result is incremented by 1 whenever t == 1 and is unchanged when t == 0, so in the end result equals the number of times a.InnerTags[i] & b.InnerTags[i] produced one. Accordingly, we could store all the results of a.InnerTags[i] & b.InnerTags[i] in an array and write into result only the number of ones in that array. When AND is applied to n-bit numbers, the result is an n-bit number, and it suffices to know how many of its bits are set. The number of set bits in any given number never changes, which means these counts can be precomputed into a lookup table. Precomputing a table for 64-bit numbers makes no sense — we won't find that much RAM. For 32 bits you could find the space on a modern machine, but that is still a lot. Memory for 16 bits is easy to find, but building the table takes relatively long. As a compromise, we'll precompute the table for 8-bit numbers:
```csharp
static readonly byte[] CountOfSettedBits = GenerateCountOfSettedBits();

static byte[] GenerateCountOfSettedBits()
{
    // result[i] holds the number of set bits in i
    byte[] result = new byte[256];
    // b holds the binary representation of the current number, one bit per cell
    int[] b = new int[8];
    for (int i = 1; i < 256; i++)
    {
        // number of set bits in the current number
        int settedBitsCount = 0;
        // carry; effectively adds 1 to the previous number
        int m = 1;
        for (int j = 0; j < 8; j++)
        {
            b[j] += m;              // add the carry to the current bit
            m = b[j] >> 1;          // the carry into the next bit, i.e. division by 2
            b[j] = b[j] & 1;        // keep only the bit itself
            settedBitsCount += b[j];
        }
        result[i] = (byte) settedBitsCount;
    }
    return result;
}
```
Now the TagsGroup constructor looks like this:
```csharp
const int BucketSize = 8;

public TagsGroup(bool[] innerTags)
{
    if (innerTags == null)
        throw new ArgumentException(nameof(innerTags));
    if (innerTags.Length != TagsGroupLength)
    {
        throw new ArgumentException(nameof(innerTags));
    }
    int index = 0;
    // one byte now stores BucketSize tags
    InnerTags = new byte[TagsGroupLength / BucketSize];
    for (int i = 0; i < TagsGroupLength / BucketSize; i++)
    {
        for (int j = 0; j < BucketSize; j++, index++)
        {
            // shift left by one bit and write the next tag into the low bit
            InnerTags[i] <<= 1;
            InnerTags[i] += (byte) (innerTags[index] ? 1 : 0);
        }
    }
}
```
And MeasureSimilarity now looks like this:
```csharp
public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
{
    int result = 0;
    for (int i = 0; i < TagsGroupLength / BucketSize; i++)
    {
        int t = a.InnerTags[i] & b.InnerTags[i];
        result += CountOfSettedBits[t];
    }
    return result;
}
```
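A self-contained sanity check of the packing idea (a sketch, not from the article; it uses a tiny 16-tag group instead of the real 4096, and builds the bit-count table with a simple shift loop):

```csharp
using System;

class PackingDemo
{
    // Count-of-set-bits table built the straightforward way.
    public static byte[] BuildBitCounts()
    {
        var result = new byte[256];
        for (int i = 0; i < 256; i++)
        {
            int x = i, count = 0;
            while (x != 0) { count += x & 1; x >>= 1; }
            result[i] = (byte) count;
        }
        return result;
    }

    // Pack bools into bytes, high bit first, exactly like the constructor above.
    public static byte[] Pack(bool[] tags)
    {
        var packed = new byte[tags.Length / 8];
        int index = 0;
        for (int i = 0; i < packed.Length; i++)
            for (int j = 0; j < 8; j++, index++)
            {
                packed[i] <<= 1;
                packed[i] += (byte) (tags[index] ? 1 : 0);
            }
        return packed;
    }

    // AND each byte pair, then look up how many bits survived.
    public static int PackedSimilarity(byte[] a, byte[] b, byte[] bitCounts)
    {
        int result = 0;
        for (int i = 0; i < a.Length; i++)
            result += bitCounts[a[i] & b[i]];
        return result;
    }

    static void Main()
    {
        var bitCounts = BuildBitCounts();
        bool[] a = new bool[16], b = new bool[16];
        a[0] = a[3] = a[9] = true;
        b[0] = b[9] = b[15] = true;
        // tags 0 and 9 coincide, so similarity must be 2
        Console.WriteLine(PackedSimilarity(Pack(a), Pack(b), bitCounts)); // 2
    }
}
```

The packed answer matches a naive count of positions where both arrays hold true, which is the invariant the optimization relies on.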
You can run a large benchmark and make sure that everything is better:
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 560.5 ms | 8.285 ms | 7.344 ms | 1.53 KB |
AscendantTest | 570.1 ms | 4.108 ms | 3.431 ms | 1.53 KB |
DescendantTest | 608.1 ms | 5.691 ms | 5.324 ms | 1.53 KB |
Can MeasureSimilarity be made even faster? Of course! It is enough to realize that general-purpose registers today are mostly 64-bit, while we are feeding them eight-bit data. So let's increase the size of the package into which the source tags are packed to 64 bits and rewrite the necessary methods:
```csharp
const int BucketSize = 64;

ulong[] InnerTags { get; }

public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
{
    int result = 0;
    for (int i = 0; i < TagsGroupLength / BucketSize; i++)
    {
        ulong t = a.InnerTags[i] & b.InnerTags[i];
        for (int j = 0; j < BucketSize / 8; j++)
        {
            result += CountOfSettedBits[t & 255];
            t >>= 8;
        }
    }
    return result;
}
```
and we get:
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 533.3 ms | 4.802 ms | 4.492 ms | 1.53 KB |
AscendantTest | 550.9 ms | 5.435 ms | 5.084 ms | 1.53 KB |
DescendantTest | 567.6 ms | 3.879 ms | 3.439 ms | 1.53 KB |
Then we can unroll the inner loop:
```csharp
public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
{
    int result = 0;
    for (int i = 0; i < TagsGroupLength / BucketSize; i++)
    {
        ulong t = a.InnerTags[i] & b.InnerTags[i];
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
        t >>= 8;
        result += CountOfSettedBits[t & 255];
    }
    return result;
}
```
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 370.5 ms | 2.802 ms | 2.484 ms | 1.53 KB |
AscendantTest | 395.8 ms | 2.682 ms | 2.509 ms | 1.53 KB |
DescendantTest | 419.5 ms | 3.352 ms | 2.971 ms | 1.53 KB |
Even faster? Yes, if we use the new features of .NET Core 3.0. Although this version is still in preview, it has had implementations of some intrinsics from the start. The Intel Intrinsic Guide lists an _mm_popcnt_u64 intrinsic, described as: "Count the number of bits set to 1 in unsigned 64-bit integer a, and return that count in dst." That is exactly what we are trying to compute! In .NET Core 3.0 Preview 5 this intrinsic is exposed as System.Runtime.Intrinsics.X86.Popcnt.X64.PopCount (as a-tk rightly noted in the comments, you need to check that the processor supports it, by testing the System.Runtime.Intrinsics.X86.Popcnt.X64.IsSupported condition). Using it, the MeasureSimilarity method becomes:
```csharp
public static int MeasureSimilarity(TagsGroup a, TagsGroup b)
{
    int result = 0;
    for (int i = 0; i < TagsGroupLength / BucketSize; i++)
    {
        ulong t = a.InnerTags[i] & b.InnerTags[i];
        result += (int) System.Runtime.Intrinsics.X86.Popcnt.X64.PopCount(t);
    }
    return result;
}
```
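On a processor without POPCNT this code must not run, so in practice you would guard it with IsSupported and fall back to a portable path. A sketch of such a guard (over plain ulong[] arrays rather than the TagsGroup class; System.Numerics.BitOperations.PopCount is also available starting with .NET Core 3.0 and works on any CPU):

```csharp
using System.Numerics;
using System.Runtime.Intrinsics.X86;

static class Similarity
{
    // Guarded version: hardware POPCNT when available, portable fallback otherwise.
    public static int MeasureSimilarity(ulong[] a, ulong[] b)
    {
        int result = 0;
        if (Popcnt.X64.IsSupported)
        {
            for (int i = 0; i < a.Length; i++)
                result += (int) Popcnt.X64.PopCount(a[i] & b[i]);
        }
        else
        {
            // BitOperations.PopCount has a software implementation on unsupported CPUs
            for (int i = 0; i < a.Length; i++)
                result += BitOperations.PopCount(a[i] & b[i]);
        }
        return result;
    }
}
```

In a hot loop you would normally hoist the IsSupported check out of the method entirely (the JIT treats it as a constant anyway), but the sketch keeps it inline for clarity.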
And the running time is:
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 59.33 ms | 1.148 ms | 0.9585 ms | 1.53 KB |
AscendantTest | 74.87 ms | 1.479 ms | 1.9748 ms | 1.53 KB |
DescendantTest | 119.46 ms | 2.321 ms | 2.8509 ms | 1.53 KB |
Impressive.
I am not aware of any techniques that could significantly speed up MeasureSimilarity further without badly hurting readability. I think we can consider this method done.
Now let's look at the loop body in the GetFiftyMostSimilarGroups method:
```csharp
public TagsGroup[] GetFiftyMostSimilarGroups(TagsGroup value)
{
    const int resultLength = 50;
    List<TagsSimilarityInfo> list = new List<TagsSimilarityInfo>(50);
    for (int groupIndex = 0; groupIndex < Groups.Length; groupIndex++)
    {
        TagsGroup tagsGroup = Groups[groupIndex];
        int similarityValue = TagsGroup.MeasureSimilarity(value, tagsGroup);
        TagsSimilarityInfo newInfo = new TagsSimilarityInfo(groupIndex, similarityValue);
        if (list.Count == resultLength && list[resultLength - 1].CompareTo(newInfo) == -1)
        {
            continue;
        }
        int index = ~list.BinarySearch(newInfo);
        list.Insert(index, newInfo);
        if (list.Count > resultLength)
        {
            list.RemoveAt(resultLength);
        }
    }
    TagsGroup[] result = new TagsGroup[resultLength];
    for (int i = 0; i < resultLength; i++)
    {
        result[i] = Groups[list[i].Index];
    }
    return result;
}
```
Let me briefly recap what happens here:

- for each group, its similarity with the query group is measured and a TagsSimilarityInfo is built;
- the candidate is inserted into a sorted list of the best results found so far;
- if the list grows beyond fifty elements, the worst candidate (always the last one in list) is removed.

So we need to find the largest element of a collection very quickly, and to insert and delete quickly. There are special data structures for such problems. The first that comes to mind is a heap: insertion runs in O(log N), the maximum is obtained in O(1), and removing the maximum runs in O(log N). The only problem is that a heap cannot be iterated in element order without modifying it. There is no binary heap in the BCL, so I wrote one myself:
```csharp
public class BinaryHeap<T> : IEnumerable<T> where T : IComparable<T>
{
    readonly List<T> innerList;

    public BinaryHeap(int capacity)
    {
        innerList = new List<T>(capacity);
    }

    public int Count => innerList.Count;

    public T Max => innerList[0];

    public void Add(T value)
    {
        innerList.Add(value);
        int i = Count - 1;
        int parent = (i - 1) >> 1;
        while (i > 0 && innerList[parent].CompareTo(innerList[i]) == -1)
        {
            Swap(i, parent);
            i = parent;
            parent = (i - 1) >> 1;
        }
    }

    void Swap(int a, int b)
    {
        T temp = innerList[a];
        innerList[a] = innerList[b];
        innerList[b] = temp;
    }

    void Heapify(int i)
    {
        for (;;)
        {
            int leftChild = (i << 1) | 1;
            int rightChild = (i + 1) << 1;
            int largestChild = i;
            if (leftChild < Count && innerList[leftChild].CompareTo(innerList[largestChild]) == 1)
            {
                largestChild = leftChild;
            }
            if (rightChild < Count && innerList[rightChild].CompareTo(innerList[largestChild]) == 1)
            {
                largestChild = rightChild;
            }
            if (largestChild == i)
            {
                break;
            }
            Swap(i, largestChild);
            i = largestChild;
        }
    }

    public void RemoveMax()
    {
        innerList[0] = innerList[Count - 1];
        innerList.RemoveAt(Count - 1);
        Heapify(0);
    }

    public IEnumerator<T> GetEnumerator()
    {
        return innerList.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable) innerList).GetEnumerator();
    }
}
```
The corresponding implementation of GetFiftyMostSimilarGroups can be found in the article's source code (link below).
Besides a heap, a balanced binary search tree may also fit. It can provide insertion in O(log N), the maximum in O(log N), and deletion in O(log N). The advantage of such a structure is that it can be iterated in ascending order, and moreover, a red-black tree is already implemented in the BCL inside SortedSet (in the full .NET Framework, obtaining the maximum is much slower than in .NET Core 3.0 and allocates memory). The SortedSet implementation of GetFiftyMostSimilarGroups can also be found in the article's source.
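Since that implementation isn't shown here, a self-contained sketch of the SortedSet variant over plain similarity scores (the comparer reproduces the TagsSimilarityInfo ordering: higher similarity first, then smaller index, which also keeps the set's elements distinct):

```csharp
using System;
using System.Collections.Generic;

// Keeps the k best (similarity, index) pairs; set.Max is the worst kept candidate.
static int[] TopKWithSortedSet(int[] similarities, int k)
{
    var set = new SortedSet<(int similarity, int index)>(
        Comparer<(int similarity, int index)>.Create((a, b) =>
        {
            int c = b.similarity.CompareTo(a.similarity); // higher similarity first
            return c != 0 ? c : a.index.CompareTo(b.index);
        }));
    for (int index = 0; index < similarities.Length; index++)
    {
        var candidate = (similarities[index], index);
        // skip candidates worse than the worst kept element once the set is full
        if (set.Count == k && set.Comparer.Compare(set.Max, candidate) < 0)
            continue;
        set.Add(candidate);
        if (set.Count > k)
            set.Remove(set.Max);                          // drop the worst
    }
    var result = new int[Math.Min(k, similarities.Length)];
    int i = 0;
    foreach (var info in set)                             // iterates best-first
        result[i++] = info.index;
    return result;
}
```

The structure of the loop mirrors the list version exactly; only the container operations change.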
Benchmark results for all three implementations of GetFiftyMostSimilarGroups:
Method | Sorting algorithm | Mean | Allocated |
---|---|---|---|
RandomTest | List | 60.06 ms | 1704 B |
RandomTest | SortedSet | 65.46 ms | 24384 B |
RandomTest | Heap | 60.55 ms | 2912 B |
AscendantTest | List | 75.42 ms | 1704 B |
AscendantTest | SortedSet | 161.12 ms | 9833424 B |
AscendantTest | Heap | 86.87 ms | 2912 B |
DescendantTest | List | 119.23 ms | 880 B |
DescendantTest | SortedSet | 125.03 ms | 3024 B |
DescendantTest | Heap | 118.62 ms | 2088 B |
The original implementation with a list wins almost everywhere on time, and everywhere on memory. This is because the list-based algorithm performs an insertion in O(log N) for the binary search plus almost O(1) for the insert itself (copying such a small number of elements is very fast), obtains the maximum in O(1), and deletes an element in O(1), since in .NET removing the last element of a list just overwrites it with a default value (in .NET Core, for structures, nothing is even written). If we needed not 50 but, say, the 1000 most similar groups, the list algorithm would most likely no longer fit. All of this is somewhat speculative reasoning, of course, since each of the algorithms could be tuned further.
Now it remains to try to improve the loop in GetFiftyMostSimilarGroups itself. Only multithreading comes to mind here. The idea is to split the whole list of groups into several batches, find the 50 most similar tag groups within each batch, and then pick the final 50 from among those.
The multithreaded version of GetFiftyMostSimilarGroups looks like this:
```csharp
public TagsGroup[] GetFiftyMostSimilarGroupsMultiThread(TagsGroup value)
{
    const int resultLength = 50;
    // number of batches, one per task
    const int threadsCount = 4;
    int bucketSize = Groups.Length / threadsCount;
    var tasks = new Task<List<TagsSimilarityInfo>>[threadsCount];
    for (int i = 0; i < threadsCount; i++)
    {
        // boundaries of the batch
        int leftIndex = i * bucketSize;
        int rightIndex = (i + 1) * bucketSize;
        // search for the fifty best groups within the batch
        tasks[i] = Task<List<TagsSimilarityInfo>>.Factory.StartNew(() => GetFiftyMostSimilarGroupsMultiThreadCore(value, leftIndex, rightIndex));
    }
    Task.WaitAll(tasks);
    var taskResults = new List<TagsSimilarityInfo>[threadsCount];
    for (int i = 0; i < threadsCount; i++)
    {
        taskResults[i] = tasks[i].Result;
    }
    // merge the per-batch results into the final fifty
    return MergeTaskResults(resultLength, threadsCount, taskResults);
}
```
GetFiftyMostSimilarGroupsMultiThreadCore differs little from the single-threaded GetFiftyMostSimilarGroups:
```csharp
List<TagsSimilarityInfo> GetFiftyMostSimilarGroupsMultiThreadCore(TagsGroup value, int leftIndex, int rightIndex)
{
    const int resultLength = 50;
    List<TagsSimilarityInfo> list = new List<TagsSimilarityInfo>(resultLength);
    for (int groupIndex = leftIndex; groupIndex < rightIndex; groupIndex++)
    {
        TagsGroup tagsGroup = Groups[groupIndex];
        int similarityValue = TagsGroup.MeasureSimilarity(value, tagsGroup);
        TagsSimilarityInfo newInfo = new TagsSimilarityInfo(groupIndex, similarityValue);
        if (list.Count == resultLength && list[resultLength - 1].CompareTo(newInfo) == -1)
        {
            continue;
        }
        int index = ~list.BinarySearch(newInfo);
        list.Insert(index, newInfo);
        if (list.Count > resultLength)
        {
            list.RemoveAt(resultLength);
        }
    }
    return list;
}
```
The interesting part is MergeTaskResults, which has to combine the results from taskResults. Since each of the threadsCount result lists is already sorted, we can act like the merge step of merge sort: keep a pointer into each list, and on every step pick the best element among the current heads of all lists, advancing the corresponding pointer:
```csharp
TagsGroup[] MergeTaskResults(int resultLength, int threadsCount, List<TagsSimilarityInfo>[] taskResults)
{
    TagsGroup[] result = new TagsGroup[resultLength];
    int[] indices = new int[threadsCount];
    for (int i = 0; i < resultLength; i++)
    {
        int minIndex = 0;
        TagsSimilarityInfo currentBest = taskResults[minIndex][indices[minIndex]];
        for (int j = 0; j < threadsCount; j++)
        {
            var current = taskResults[j][indices[j]];
            if (current.CompareTo(currentBest) == -1)
            {
                minIndex = j;
                currentBest = taskResults[minIndex][indices[minIndex]];
            }
        }
        int groupIndex = currentBest.Index;
        result[i] = Groups[groupIndex];
        indices[minIndex]++;
    }
    return result;
}
```
Here:

- indices holds the current positions within each list in taskResults;
- minIndex is the index of the list whose current element is the best so far;
- currentBest is the best element found on the current step;
- current is the element currently being compared.

The results:
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 28.76 ms | 0.5677 ms | 1.414 ms | 1.4 KB |
AscendantTest | 32.36 ms | 0.8930 ms | 2.591 ms | 1.4 KB |
DescendantTest | 41.36 ms | 0.8908 ms | 2.626 ms | 1.4 KB |
For comparison, the very first results:

Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RandomTest | 25.054 s | 0.1786 s | 0.1670 s | 1.53 KB |
AscendantTest | 4.180 s | 0.0174 s | 0.0162 s | 1.53 KB |
DescendantTest | 4.147 s | 0.0118 s | 0.0104 s | 1.53 KB |
That's all. Thanks to a branchless, bit-packed MeasureSimilarity with a hardware popcount, a sensible structure for keeping the fifty best candidates, and splitting the work across 4 tasks, the running time dropped from 25 seconds to roughly 30-40 ms. I hope my colleague is satisfied.
Source: https://habr.com/ru/post/454850/