Sorting in Perl 6

Sorting lists is a very common programming problem, and Perl 6 has several improvements to the sort function to help you solve this problem.

The language has an ordinary default sort:

  # Default sort
  my @sorted = @unsorted.sort;   # or: sort @unsorted;

As in Perl 5, you can supply your own comparison function:

  # Numeric comparison
  my @sorted = @unsorted.sort: { $^a <=> $^b };

  # Or, if you prefer a function to a method:
  my @sorted = sort { $^a <=> $^b }, @unsorted;

  # String comparison (like cmp in Perl 5)
  my @sorted = @unsorted.sort: { $^a leg $^b };

  # Type-dependent comparison
  my @sorted = @unsorted.sort: { $^a cmp $^b };
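To make the difference between the three comparison operators concrete, here is a minimal sketch. The sample numbers are made up for illustration; the operators are the ones shown above.

```raku
# Hypothetical sample data: three integers.
my @numbers = 10, 9, 100;

# <=> compares its operands numerically.
my @numeric = @numbers.sort: { $^a <=> $^b };

# leg converts both operands to strings and compares them character by character.
my @stringy = @numbers.sort: { $^a leg $^b };

# cmp chooses the comparison from the operand types:
# numeric for numbers, string-wise for strings.
my @generic-ints = @numbers.sort: { $^a cmp $^b };
my @generic-strs = @numbers.map(*.Str).sort: { $^a cmp $^b };

say @numeric;       # [9 10 100]
say @stringy;       # [10 100 9]
say @generic-ints;  # [9 10 100]
say @generic-strs;  # [10 100 9]
```

The same data comes out in a different order depending only on whether it is compared as numbers or as strings.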

If you put the comparison block in parentheses, you do not need the colon. This is handy when you want to chain further method calls after sort:

  my @topten = @scores.sort( { $^b <=> $^a } ).head(10);

Note: unlike in Perl 5, the variables $a and $b have no special meaning in Perl 6. In a sort comparison block, as in any other block, you can use ordinary variables ($var), placeholder variables ($^var) or "whatever" (*).
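The three ways of naming the arguments can be sketched side by side. The data and the placeholder names $^left and $^right are invented for this example; placeholders are bound in alphabetical order of their names, which is why any names will do.

```raku
# Hypothetical sample data.
my @unsorted = 3, 1, 2;

# Ordinary variables: a pointy block with explicitly named parameters.
my @with-params = @unsorted.sort: -> $x, $y { $x <=> $y };

# Placeholder variables: bound in alphabetical order of their names,
# so $^left receives the first argument and $^right the second.
my @with-placeholders = @unsorted.sort: { $^left <=> $^right };

# Whatever: each * stands for the next argument.
my @with-whatever = @unsorted.sort(* <=> *);

say @with-params;        # [1 2 3]
say @with-placeholders;  # [1 2 3]
say @with-whatever;      # [1 2 3]
```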

You can apply a transformation function while sorting:

  my @sorted = @unsorted.sort: { foo($^a) cmp foo($^b) };

but foo() is then recomputed on every comparison. For small lists this is not a problem, but for large lists it can be, especially if the function is expensive to compute.

In Perl 5, the usual idiom is the Schwartzian transform, used when you need to sort a list not by the elements themselves but by some value derived from them.

With the Schwartzian transform, the derived value is computed once per element, the elements are sorted by those values, and then the original elements are extracted from the sorted list.

Suppose you need to sort the list of words ("aaaa", "a", "aa") by word length. You first build the list (["aaaa", 4], ["a", 1], ["aa", 2]), then sort it by the numeric value, and then strip the numbers from the resulting list (["a", 1], ["aa", 2], ["aaaa", 4]). The result is the list ("a", "aa", "aaaa").

  # Perl 5 code
  @sorted = map  { $_->[0] }
            sort { $a->[1] cmp $b->[1] }
            map  { [$_, foo($_)] } @unsorted;
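For comparison, the same three steps can be written out by hand in Perl 6. This is only a sketch of the word-length example; the intermediate variable names are made up for readability.

```raku
# The word list from the example above.
my @words = <aaaa a aa>;

# Step 1 (decorate): pair each word with its length, computed once.
my @decorated = @words.map: { [$_, .chars] };

# Step 2 (sort): order the pairs by the stored length.
my @ordered = @decorated.sort: { $^a[1] <=> $^b[1] };

# Step 3 (undecorate): keep only the original words.
my @sorted = @ordered.map: { .[0] };

say @sorted;   # [a aa aaaa]
```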

You can do the same thing in Perl 6, but sort also has some intelligence built in. If the block you pass takes zero or one arguments, Perl 6 notices this and applies the Schwartzian transform automatically.
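One way to observe this is to count how often the key is computed. The sketch below assumes the behaviour of current Rakudo, where a one-argument block is run once per element; the word list and counter names are invented for the example.

```raku
# Hypothetical sample data.
my @words = <pear fig banana kiwi apple date plum cherry>;

my $unary-calls  = 0;   # key computations made by the one-argument block
my $binary-calls = 0;   # key computations made by the two-argument block

# One argument: sort treats the block as a key extractor and caches each key.
my @by-key = @words.sort: { $unary-calls++; .chars };

# Two arguments: sort treats the block as a comparator and runs it
# for every comparison, computing two keys each time.
my @by-cmp = @words.sort: { $binary-calls += 2; $^a.chars <=> $^b.chars };

say "key block:        $unary-calls key computations";
say "comparator block: $binary-calls key computations";
```

Both calls produce the same order; only the amount of work differs, and the gap grows with the length of the list.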

Take a look at the examples.

Case-insensitive sorting

Convert each element to lower case, sort by that, and return the original elements.

  my @sorted = @unsorted.sort: { .lc };
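A small before-and-after sketch, with made-up sample words, shows what the .lc key changes:

```raku
# Hypothetical sample data with mixed case.
my @unsorted = <banana apple Cherry>;

# Default sort: uppercase letters come before lowercase ones.
my @default = @unsorted.sort;

# Keyed on the lower-cased value; the original spelling is returned.
my @folded = @unsorted.sort: { .lc };

say @default;   # [Cherry apple banana]
say @folded;    # [apple banana Cherry]
```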

Simple enough.

Sort by word length

Sort the list of strings by the number of characters, from short to long.

  my @sorted = @unsorted.sort: { .chars };

Or longest to shortest:

  my @sorted = @unsorted.sort: { -.chars };

Multiple sorting comparators

You can put several comparators in the sort block, and it will evaluate as many of them as necessary to reach the first result that is not a tie.

Sort by word length, with a secondary sort that puts words of the same length in ASCII order.

  .say for @a.sort: { $^a.chars, $^a } ;

Sorting in Perl 6 is stable, so you can sort the list in ASCII order, and then by word length.

  .say for @a.sort.sort: { $^a.chars };

This works, but it sorts the list twice. You can rewrite it as follows:

  .say for @a.sort: { $^a.chars <=> $^b.chars || $^a leg $^b };

This also works, but now the automatic Schwartzian transform is lost.
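The three variants can be checked against each other. This sketch uses an invented word list and simply confirms that all of them give the same order.

```raku
# Hypothetical sample data.
my @words = <pear fig apple kiwi date plum>;

# 1. One-argument block returning a list of keys: length first, then the word.
my @by-keys = @words.sort: { $^a.chars, $^a };

# 2. Two stable passes: ASCII order first, then by length.
my @two-pass = @words.sort.sort: { $^a.chars };

# 3. An explicit two-argument comparator (no automatic key caching).
my @explicit = @words.sort: { $^a.chars <=> $^b.chars || $^a leg $^b };

say @by-keys;    # [fig date kiwi pear plum apple]
say @two-pass;   # [fig date kiwi pear plum apple]
say @explicit;   # [fig date kiwi pear plum apple]
```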

Or you can apply a natural sorting transformation:

  .say for @a.sort: { $^a.&naturally, $^a };

"Wait, what? Natural sorting? Where did that come from?" Glad you asked.

Natural sorting

The default string sort produces results in "ASCII" order: digits come before uppercase letters, which come before lowercase letters. People are often surprised to get something like this from a sort:

  0 1 100 11 144th 2 21 210 3rd 33rd AND ARE An Bit Can and by car d1 d10 d2

And this, of course, is correct, but not entirely intuitive, especially for non-programmers. Natural sorting would order the list in ascending numbers, and then alphabetically.

Here are the same lines, sorted naturally:

  0 1 2 3rd 11 21 33rd 100 144th 210 An AND and ARE Bit by Can car d1 d2 d10

To do this, you need to perform a simple transformation. I will use the subst method, which is the method form of the s/// operator.

  .subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g)

First, we capture a group of one or more consecutive digits. The construct '-> $/ { ... }' is a "pointy block". It means "pass the contents of the match object ($/) into the scope of the following code block ({ ... })". The block builds the replacement string: "0", followed by the number of digits in the number expressed as an ASCII character, followed by the digit string itself. The :g adverb means "global".
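To see what the replacement actually produces, the conversion can be wrapped in a small helper and inspected. The name digit-key is hypothetical and only used here; .ords lists the code points so that the invisible length character shows up.

```raku
# The conversion from the text, wrapped in a hypothetical helper for inspection.
sub digit-key (Str $s) {
    $s.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g)
}

# "d2"  becomes "d", "0", chr(1), "2"
# "d10" becomes "d", "0", chr(2), "1", "0"
say digit-key('d2').ords;    # (100 48 1 50)
say digit-key('d10').ords;   # (100 48 2 49 48)

# Plain string order puts "d10" before "d2" ...
say 'd10' lt 'd2';                          # True
# ... while the transformed keys put "d2" before "d10",
# because the shorter digit run has the smaller length character.
say digit-key('d2') lt digit-key('d10');    # True
```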

And we also need to sort case-insensitively, so we attach another .lc method, and we get the result:

  .lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g)

We turn it into a function:

  sub naturally ($a) { $a.lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g) }

This is almost right, but not quite. Values that map to the same transformed key come back in the order in which they were received. For example, the words 'THE', 'The' and 'the' come back in input order, not in sorted order. The easiest fix is to append the original value to the transformed one as a tie-breaker.

Putting it together, the final natural sorting function is:

  sub naturally ($a) { $a.lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g) ~ "\x0" ~ $a }
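The effect of the appended original value can be sketched with the 'THE', 'The', 'the' example. The name without-tiebreak is made up here to keep the earlier version around for comparison.

```raku
# The earlier version of the function, under a hypothetical name.
sub without-tiebreak ($a) {
    $a.lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g)
}

# The final version: same key, plus the original value as a tie-breaker.
sub naturally ($a) { without-tiebreak($a) ~ "\x0" ~ $a }

my @words = <the The THE>;

# All three keys are identical, so the stable sort keeps the input order.
my @tied = @words.sort(&without-tiebreak);

# The tie-breaker orders the equal keys by their original spelling.
my @untied = @words.sort(&naturally);

say @tied;     # [the The THE]
say @untied;   # [THE The the]
```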

Since naturally takes a single argument, we still get the Schwartzian transform for free. Now it can be used as a sort key function.

  .say for <0 1 100 11 144th 2 21 210 3rd 33rd AND ARE An Bit Can and by car d1 d10 d2>.sort: { .&naturally };

Or sort IP addresses:

  # generate a list of random IP addresses
  my @ips = ((0..255).roll(4).join('.') for 0..99);

  .say for @ips.sort: { .&naturally };

  4.108.172.65
  5.149.121.70
  10.24.201.53
  11.10.90.219
  12.83.84.206
  12.124.106.41
  12.162.149.98
  14.203.88.93
  16.18.0.178
  17.68.226.104
  21.201.181.225
  23.61.166.202
  23.205.73.104
  24.250.90.75
  35.56.124.120
  36.158.70.141
  40.149.118.209
  40.238.169.146
  52.107.62.129
  55.119.95.120
  56.39.105.245
  ...

Or sort the list of files, or anything that contains a mixture of numbers and symbols.
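As a last sketch, here is the same function applied to a few file names. The names are invented for the example; the function is the final version from the text.

```raku
# The natural sorting function from the text above.
sub naturally ($a) { $a.lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g) ~ "\x0" ~ $a }

# Hypothetical file names, purely for illustration.
my @files = <file10.txt file2.txt File1.txt>;

# Default sort: "file10.txt" lands before "file2.txt".
my @plain = @files.sort;

# Natural sort: numbers are ordered by value and case is ignored.
my @natural = @files.sort: { .&naturally };

say @plain;     # [File1.txt file10.txt file2.txt]
say @natural;   # [File1.txt file2.txt file10.txt]
```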

Source: https://habr.com/ru/post/267887/
