
When not to use STL algorithms: an example with sets

Good evening, comrades! You snapped up the first edition of the book “C++17 STL. Standard Template Library” and are now doing the same with the second, so we finally decided to present an alternative point of view here. The author of today's article is Ivan Čukić, who also wrote the book Functional Programming in C++, currently being prepared for release by the publishing house Manning. We invite you to evaluate his skeptical thoughts, code, and measurements.

Preamble

I wanted to call this post “On the harmfulness of STL algorithms” to test my own clickbait skills. But then I decided it would be better to write an article for the target audience rather than a post that attracts people eager to argue with my inflammatory theses.

So I'll assume that you are interested in algorithms and their complexity, and that you want to write the most elegant code possible.
Algorithms

In today's professional community, C++ developers are often given this advice: to make your program safer, faster, and more expressive, use the algorithms from the standard library. I, too, try to popularize this advice in my books, talks, and seminars: wherever there is a suitable audience.

Of course, it is absolutely true that before writing a for loop to solve the task at hand, we should first check whether an existing algorithm from the standard library (or Boost) fits, rather than act blindly.

We still need to know how these algorithms are implemented, what requirements and guarantees come with them, and what their space and time complexity is.

Usually, if the task at hand exactly matches the requirements of an STL algorithm and the algorithm can be applied directly, that algorithm will be the most efficient solution.

A problem may arise if we need to prepare the data in some way before applying the algorithm.

Set intersection

Suppose we are writing a tool for C++ developers that would suggest replacing default captures (that is, [=] and [&]) in lambda expressions with an explicit list of captured variables:

    std::partition(begin(elements), end(elements),
            [=] (auto element) {
           //^~~ warning: replace the default capture with [threshold]
                return element > threshold;
            });

When parsing a file, we need a collection that stores the variables from the current and enclosing scopes. As soon as we encounter a lambda expression with a default capture, we need to see which variables are used in it.

As a result, we have two sets: one holds the variables from the surrounding scopes, and the other holds the variables used in the body of the lambda expression.

The capture list that we propose as a replacement should be the intersection of these two sets (lambda expressions can use global variables, which don't need to be captured, and not every variable from the surrounding scopes will be used in the lambda).

And if we need an intersection, we can use std::set_intersection.
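For reference, here is a minimal usage sketch of std::set_intersection for our two sets of variable names (the variable names and values are purely illustrative):

    #include <algorithm>
    #include <iterator>
    #include <string>
    #include <vector>

    int main()
    {
        // Both inputs must already be sorted.
        std::vector<std::string> scope_vars  = {"counter", "threshold", "total"};
        std::vector<std::string> lambda_vars = {"element", "threshold"};

        std::vector<std::string> capture_list;
        std::set_intersection(scope_vars.begin(), scope_vars.end(),
                              lambda_vars.begin(), lambda_vars.end(),
                              std::back_inserter(capture_list));

        // capture_list now contains just "threshold"
    }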

This algorithm is quite beautiful in its simplicity. It takes two sorted collections and walks through them in parallel, from beginning to end:

- if the current element of the first collection is less than the current element of the second, it is skipped;
- if the current element of the second collection is less than the current element of the first, it is skipped;
- if the current elements are equal, the value is copied to the result, and the algorithm advances in both collections.


At each iteration, at least one element (from the first or the second input collection) is skipped, so the complexity of the algorithm is linear: O(m + n), where m is the number of elements in the first collection and n is the number of elements in the second.
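To make the traversal concrete, here is a minimal hand-rolled version of the same idea (a sketch for illustration only, not the actual standard library implementation):

    // Both input ranges are assumed to be sorted.
    template <typename InputIt1, typename InputIt2, typename OutputIt>
    OutputIt naive_set_intersection(InputIt1 first1, InputIt1 last1,
                                    InputIt2 first2, InputIt2 last2,
                                    OutputIt dest)
    {
        while (first1 != last1 && first2 != last2) {
            if (*first1 < *first2) {
                ++first1;                // skip an element of the first range
            } else if (*first2 < *first1) {
                ++first2;                // skip an element of the second range
            } else {
                *dest++ = *first1;       // equal elements: copy to the result
                ++first1;
                ++first2;
            }
        }
        return dest;
    }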

Simple and efficient, as long as the input collections are sorted.

Sorting

Here's the problem: what if the collections are not pre-ordered?

In the previous example, it would be wise to store the variables from the enclosing scopes in a stack-like structure: the parser could simply add new elements when it enters a new scope and remove the variables of the current scope as soon as it leaves it.

This means the variables will not be sorted by name, so we cannot use std::set_intersection on them directly. Similarly, if we track the variables in the body of a lambda expression, we most likely won't be able to keep them sorted either.
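As a rough sketch of what such a stack-like structure might look like (the class and all its member names are hypothetical, purely for illustration):

    #include <string>
    #include <vector>

    // Each entry holds the variables declared in one scope,
    // in declaration order - that is, not sorted by name.
    class ScopeStack {
    public:
        void enter_scope() { m_scopes.emplace_back(); }
        void leave_scope() { m_scopes.pop_back(); }

        void declare(std::string name)
        {
            m_scopes.back().push_back(std::move(name));
        }

        // All currently visible variables, unsorted.
        std::vector<std::string> visible_variables() const
        {
            std::vector<std::string> result;
            for (const auto& scope : m_scopes) {
                result.insert(result.end(), scope.begin(), scope.end());
            }
            return result;
        }

    private:
        std::vector<std::vector<std::string>> m_scopes;
    };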

Since std::set_intersection works only with sorted collections, many projects follow this principle: first sort the collections, then call std::set_intersection.

Setting aside the fact that sorting the variable stack in our example would completely defeat the purpose of using a stack, an intersection algorithm for unsorted collections would look like this:

    template <typename InputIt1, typename InputIt2, typename OutputIt>
    auto unordered_intersection_1(InputIt1 first1, InputIt1 last1,
                                  InputIt2 first2, InputIt2 last2,
                                  OutputIt dest)
    {
        // Sort both inputs (despite the parameter names, std::sort
        // requires random-access iterators), then intersect as usual.
        std::sort(first1, last1);
        std::sort(first2, last2);
        return std::set_intersection(first1, last1, first2, last2, dest);
    }

What is the complexity of this combined algorithm? Sorting takes quasilinear time, so the overall complexity of this approach is O(n log n + m log m + n + m).

Sorting the smaller collection

Is it possible to do without sorting?

If neither collection is sorted, we have to traverse the entire second collection for each element of the first in order to decide whether to include it in the result set. Although this approach is quite common in real projects, it is even worse than the previous one: its complexity is O(n * m).
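For clarity, a sketch of that naive quadratic approach:

    #include <algorithm>

    // For every element of the first range, scan the entire second
    // range. Complexity: O(n * m).
    template <typename InputIt1, typename InputIt2, typename OutputIt>
    OutputIt quadratic_intersection(InputIt1 first1, InputIt1 last1,
                                    InputIt2 first2, InputIt2 last2,
                                    OutputIt dest)
    {
        for (; first1 != last1; ++first1) {
            if (std::find(first2, last2, *first1) != last2) {
                *dest++ = *first1;
            }
        }
        return dest;
    }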

Instead of sorting everything, or sorting nothing, let us recall Zen and choose the third way: sort only one of the collections.

If only one collection is sorted, then we can enumerate all the values of the unsorted collection one by one and check, for each value, whether it is present in the sorted collection. For that we can use binary search.

In this case, the time complexity will be O(n log n) for sorting the first collection and O(m log n) for the lookups. The overall complexity is O((n + m) log n).

If we decided to sort the second collection, rather than the first, then the complexity would be O((n + m) log m) .

To achieve maximum efficiency, we always sort the collection with fewer elements, which guarantees that the final complexity of our algorithm is O((m + n) log(min(m, n))).

The implementation will look like this:

    template <typename InputIt1, typename InputIt2, typename OutputIt>
    OutputIt unordered_intersection_2(InputIt1 first1, InputIt1 last1,
                                      InputIt2 first2, InputIt2 last2,
                                      OutputIt dest)
    {
        const auto size1 = std::distance(first1, last1);
        const auto size2 = std::distance(first2, last2);

        // Always sort the smaller collection: swap the argument
        // ranges if the first one is bigger.
        if (size1 > size2) {
            return unordered_intersection_2(first2, last2,
                                            first1, last1, dest);
        }

        std::sort(first1, last1);
        return std::copy_if(first2, last2, dest,
                [&] (auto&& value) {
                    // FWD(x) is shorthand for std::forward<decltype(x)>(x)
                    return std::binary_search(first1, last1, FWD(value));
                });
    }
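A usage sketch, assuming unordered_intersection_2 from above is in scope together with a FWD macro defined as std::forward<decltype(x)>(x) (the inputs here are illustrative):

    #include <iterator>
    #include <vector>

    int main()
    {
        // Neither input needs to be sorted in advance.
        std::vector<int> big   = {7, 1, 9, 4, 2, 8, 6};
        std::vector<int> small = {9, 3, 2};

        std::vector<int> result;
        unordered_intersection_2(big.begin(), big.end(),
                                 small.begin(), small.end(),
                                 std::back_inserter(result));

        // The smaller input has been sorted in place; `result` holds
        // {9, 2}, in the order the elements appear in `big`.
    }

Note that the function sorts one of its inputs in place, so the caller must not rely on the original element order afterwards.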

In our example with lambda capture lists, the collection that gets sorted will usually be the set of variables used in the lambda expression, since it is likely to be smaller than the collection of all variables from all enclosing scopes.

Hashing

The final option is to build an std::unordered_set (a hash-based implementation of an unordered set) from the smaller collection instead of sorting it. The average complexity of lookup operations then becomes O(1), but constructing the std::unordered_set takes time. The construction complexity can range from O(n) to O(n*n), and this is a potential problem.

    template <typename InputIt1, typename InputIt2, typename OutputIt>
    OutputIt unordered_intersection_3(InputIt1 first1, InputIt1 last1,
                                      InputIt2 first2, InputIt2 last2,
                                      OutputIt dest)
    {
        const auto size1 = std::distance(first1, last1);
        const auto size2 = std::distance(first2, last2);

        if (size1 > size2) {
            return unordered_intersection_3(first2, last2,
                                            first1, last1, dest);
        }

        // Build a hash set from the smaller collection (the element
        // type is deduced instead of being hard-coded to int).
        std::unordered_set<typename std::iterator_traits<InputIt1>::value_type>
                test_set(first1, last1);

        return std::copy_if(first2, last2, dest,
                [&] (auto&& value) {
                    return test_set.count(FWD(value)) > 0;
                });
    }

The hashing approach really pays off when we need to compute the intersections of several sets with a single, predetermined small set. That is, we have a set A and sets B₁, B₂, … and we want to compute A ∩ B₁, A ∩ B₂, …

In this case, the cost of constructing the std::unordered_set can be ignored (it is amortized over the queries), and the complexity of computing each intersection will be linear: O(m), where m is the number of elements in the second collection.
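A sketch of that pattern, building the hash set once and reusing it (the data is illustrative):

    #include <algorithm>
    #include <iterator>
    #include <unordered_set>
    #include <vector>

    int main()
    {
        const std::vector<int> a = {1, 2, 3, 4, 5};
        const std::vector<std::vector<int>> bs = {{2, 4, 8}, {5, 1, 7}};

        // Build the hash set for A once...
        const std::unordered_set<int> a_set(a.begin(), a.end());

        // ...then each intersection A ∩ Bi costs only O(|Bi|) on average.
        for (const auto& b : bs) {
            std::vector<int> intersection;
            std::copy_if(b.begin(), b.end(),
                         std::back_inserter(intersection),
                         [&] (int value) { return a_set.count(value) > 0; });
            // use `intersection`...
        }
    }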

Benchmarking

Of course, it is always useful to analyze the complexity of an algorithm, but in cases like this it is also wise to benchmark the different approaches. This is especially true when choosing between the last two options, where we compare binary search against hash-based sets.
My simple benchmark showed that the first option, where both collections have to be sorted, is always the slowest.

Sorting the smaller collection slightly outperforms the std::unordered_set variant, but not by much.

Both the second and the third approach are slightly faster than the first when both collections have about the same number of elements, and much faster (up to six times) when one collection has roughly 1000 times more elements than the other.

Source: https://habr.com/ru/post/418469/

