The subtleties of Scala: exploring CanBuildFrom

In the standard Scala library, collection methods (map, flatMap, scan, etc.) take an instance of CanBuildFrom as an implicit parameter. In this article we will examine in detail what this trait is for, how it works, and how it can be useful to a developer.

How it works

The main purpose of CanBuildFrom is to tell the compiler the result type of methods such as map and flatMap. Take, for example, the definition of map in the TraversableLike trait:

def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That

The method returns an object of type That, which appears in the signature only as a type parameter of CanBuildFrom. The compiler selects the appropriate CanBuildFrom instance based on the type of the original collection, Repr, and the result type of the user-defined function, B. The choice is made from the set of values declared in the Predef object and in the companion objects of the collections (the rules for choosing implicit values deserve a separate article and are described in detail in the language specification).

In effect, CanBuildFrom lets the compiler infer the result type in much the same way as it does for the simplest parameterized method:

 scala> def f[T](x: List[T]): T = x.head
 f: [T](x: List[T])T

 scala> f(List(3))
 res0: Int = 3

 scala> f(List(3.14))
 res1: Double = 3.14

 scala> f(List("Pi"))
 res2: String = Pi

That is, when you call

 List(1, 2, 3).map(_ * 2)

the compiler selects the CanBuildFrom instance defined in the GenTraversableFactory class, which is declared as follows:

 class GenericCanBuildFrom[A] extends CanBuildFrom[CC[_], A, CC[A]]

and returns a collection of the same type, but with the elements produced by the user function: CC[A]. In other cases, the compiler may choose a more suitable result type, for example, for strings:

 scala> "abc".map(_.toUpper) // Predef.StringCanBuildFrom
 res3: String = ABC

 scala> "abc".map(_ + "*") // Predef.fallbackStringCanBuildFrom[String]
 res4: scala.collection.immutable.IndexedSeq[String] = Vector(a*, b*, c*)

 scala> "abc".map(_.toInt) // Predef.fallbackStringCanBuildFrom[Int]
 res5: scala.collection.immutable.IndexedSeq[Int] = Vector(97, 98, 99)

In the first case, StringCanBuildFrom is selected and the result is a String:

 implicit val StringCanBuildFrom: CanBuildFrom[String, Char, String]

In the second and third cases, the fallbackStringCanBuildFrom method is selected and the result is an IndexedSeq:

 implicit def fallbackStringCanBuildFrom[T]: CanBuildFrom[String, T, immutable.IndexedSeq[T]]
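The selection described above can be observed directly by asking the compiler for the instances with implicitly and driving their builders by hand. This is only a minimal sketch, assuming Scala 2.12 or earlier (CanBuildFrom was removed from the standard library in Scala 2.13); the object and method names (StringCbfSketch, upper, codes) are made up for illustration.

```scala
import scala.collection.generic.CanBuildFrom
import scala.collection.immutable.IndexedSeq

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object StringCbfSketch {
  // Char elements and a String result: matches Predef.StringCanBuildFrom.
  val toStringCbf = implicitly[CanBuildFrom[String, Char, String]]

  // Int elements: only Predef.fallbackStringCanBuildFrom[Int] fits here.
  val toSeqCbf = implicitly[CanBuildFrom[String, Int, IndexedSeq[Int]]]

  // Feed the builder manually, roughly the way map does internally.
  def upper(s: String): String = {
    val b = toStringCbf(s)
    s.foreach(c => b += c.toUpper)
    b.result()
  }

  def codes(s: String): IndexedSeq[Int] = {
    val b = toSeqCbf(s)
    s.foreach(c => b += c.toInt)
    b.result()
  }
}
```

Calling StringCbfSketch.upper("abc") and StringCbfSketch.codes("abc") should reproduce the values of res3 and res5 from the session above.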

Using breakOut

Consider the Map class. This kind of collection is easily converted to an Iterable if the mapping function returns a single value rather than a pair:

 scala> Map(1 -> "a", 2 -> "b", 3 -> "c").map(_._2)
 res6: scala.collection.immutable.Iterable[String] = List(a, b, c)

But to get a Map from a list of pairs, you need to call the toMap method:

 scala> List('a', 'b', 'c').map(x => x.toInt -> x)
 res7: List[(Int, Char)] = List((97,a), (98,b), (99,c))

 scala> List('a', 'b', 'c').map(x => x.toInt -> x).toMap
 res8: scala.collection.immutable.Map[Int,Char] = Map(97 -> a, 98 -> b, 99 -> c)

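Alternatively, you can pass breakOut explicitly in place of the implicit parameter. Note that without an expected result type, as in this REPL session, the result is an IndexedSeq rather than a Map: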

 scala> import collection.breakOut
 import collection.breakOut

 scala> List('a', 'b', 'c').map(x => x.toInt -> x)(breakOut)
 res9: scala.collection.immutable.IndexedSeq[(Int, Char)] = Vector((97,a), (98,b), (99,c))

The method, as the name suggests, lets you "break out" of the bounds of the original collection type and gives the compiler more freedom in choosing the CanBuildFrom instance:

 def breakOut[From, T, To](implicit b: CanBuildFrom[Nothing, T, To]): CanBuildFrom[From, T, To]

The signature shows that breakOut does not fix any of its three type parameters, which means it can stand in for any CanBuildFrom instance. breakOut itself implicitly takes a CanBuildFrom object, but with the From parameter replaced by Nothing, which allows the compiler to use any available CanBuildFrom instance (this works because the From parameter is declared contravariant and Nothing is a subtype of every type).

In other words, breakOut provides an additional “layer” that lets the compiler choose from all available CanBuildFrom implementations, not just those that are valid for the type of the original collection. For the list of pairs above, this makes it possible to use the CanBuildFrom from the Map companion even though we started with a List, provided the expected result type is stated as a Map (for instance, through a type annotation on the value); without it, the compiler settles on the IndexedSeq fallback seen in res9. Another example is getting a string from a list of characters:

 scala> List('a', 'b', 'c').map(_.toUpper)
 res10: List[Char] = List(A, B, C)

 scala> List('a', 'b', 'c').map(_.toUpper)(breakOut)
 res11: String = ABC

The CanBuildFrom[String, Char, String] instance is declared in Predef and therefore takes precedence over the declarations in the collection companions.
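To make the role of the expected type concrete, here is a small sketch of breakOut with an explicit type annotation. It assumes Scala 2.12 or earlier (breakOut no longer exists in Scala 2.13), and the names BreakOutSketch, byCode and parities are invented for the illustration.

```scala
import scala.collection.breakOut

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object BreakOutSketch {
  private val letters = List('a', 'b', 'c')

  // The annotation fixes To = Map[Int, Char], so the implicit search
  // ends up with the instance from the Map companion; no intermediate
  // List of pairs is built.
  val byCode: Map[Int, Char] = letters.map(x => x.toInt -> x)(breakOut)

  // The same idea with Set: duplicates collapse while the result is built.
  val parities: Set[Int] = List(1, 2, 3, 4).map(_ % 2)(breakOut)
}
```

Compared with calling toMap or toSet afterwards, this variant builds the target collection in a single pass, at the cost of a somewhat less obvious call site.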

An example with a list of Futures

As a small example of using CanBuildFrom, we will write an implementation that automatically collects a list of Futures into a single object, as Future.sequence does:

 List[Future[T]] -> Future[List[T]]

First, let's take a look inside CanBuildFrom. The trait declares two abstract apply methods, each of which returns a builder for the new collection that will hold the results of the user-defined function:

 def apply(): Builder[Elem, To]
 def apply(from: From): Builder[Elem, To]
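How a collection method uses these apply methods can be sketched with a stripped-down map for List. This is not the library's code, just an illustration under the assumption of Scala 2.12 or earlier; MapSketch and mapList are hypothetical names.

```scala
import scala.collection.generic.CanBuildFrom

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object MapSketch {
  // A simplified map: That is determined solely by the implicit
  // CanBuildFrom instance the compiler finds for List[A] and B.
  def mapList[A, B, That](coll: List[A])(f: A => B)
                         (implicit bf: CanBuildFrom[List[A], B, That]): That = {
    val builder = bf(coll)                 // apply(from): a fresh Builder[B, That]
    coll.foreach(a => builder += f(a))     // add every transformed element
    builder.result()                       // materialize the target collection
  }
}
```

For instance, MapSketch.mapList(List(1, 2, 3))(_ * 2) picks up the instance from the List companion and yields a List[Int].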

Therefore, to provide your own implementation of CanBuildFrom, you also need to prepare a Builder, in which you implement the methods for adding an element, clearing the buffer, and getting the result:

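 import scala.collection.mutable.{Builder, ListBuffer}
 import scala.concurrent.Future
 import scala.concurrent.ExecutionContext.Implicits.global // needed by Future.sequence

 class FutureBuilder[A] extends Builder[Future[A], Future[Iterable[A]]] {
   private val buff = ListBuffer[Future[A]]()
   def +=(elem: Future[A]): this.type = { buff += elem; this }
   def clear(): Unit = buff.clear()
   // The Future of a Seq conforms to Future[Iterable[A]] because Future is covariant
   def result() = Future.sequence(buff.toSeq)
 }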

The implementation of CanBuildFrom itself is trivial:

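 import scala.collection.generic.CanBuildFrom

 class FutureCanBuildFrom[A] extends CanBuildFrom[Any, Future[A], Future[Iterable[A]]] {
   def apply(): FutureBuilder[A] = new FutureBuilder[A]
   def apply(from: Any): FutureBuilder[A] = apply() // the source collection is ignored
 }

 implicit def futureCanBuildFrom[A]: FutureCanBuildFrom[A] = new FutureCanBuildFrom[A]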

Checking:

 scala> Range(0, 10).map(x => Future(x * x))
 res12: scala.concurrent.Future[Iterable[Int]] = scala.concurrent.impl.Promise$DefaultPromise@360e2cfb

Everything works! Thanks to the futureCanBuildFrom method, we directly obtained a Future[Iterable[Int]], i.e. the intermediate collection was converted automatically.

Warning: this is just an example of using CanBuildFrom. I am not saying that such a solution should be used in your production code or that it is better than the usual wrapping in Future.sequence. Be careful and do not copy the code into your project without first analyzing the consequences!
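For comparison, the usual wrapping in Future.sequence mentioned in the warning might look like the sketch below. The names SequenceSketch and squares, the global execution context, and the five-second timeout are assumptions chosen for the illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object SequenceSketch {
  def squares(n: Int): List[Int] = {
    // An ordinary map produces a List[Future[Int]] ...
    val futures: List[Future[Int]] = List.range(0, n).map(x => Future(x * x))
    // ... and Future.sequence turns it into a Future[List[Int]] explicitly.
    val combined: Future[List[Int]] = Future.sequence(futures)
    // Blocking is done here only to show the result; the timeout is arbitrary.
    Await.result(combined, 5.seconds)
  }
}
```

The conversion is visible at the call site here, which is usually easier to follow than an implicit instance that silently changes the result type of map.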

Conclusion

Using CanBuildFrom is closely tied to implicit parameters, so a clear understanding of how implicit values are chosen will save you time when debugging; do not be too lazy to look into the language specification or the Scala FAQ. The compiler can also show which implicit values were chosen if you build the program with the -Xprint:typer flag, which saves a lot of time.

CanBuildFrom is a very specialized thing, and you will most likely not have to work closely with it unless you are developing new data structures. Still, understanding how it works will not hurt and gives a better understanding of the internals of the standard library.

That's all, thanks and good luck in learning Scala!

Corrections and additions to the article, as always, are welcome.

Source: https://habr.com/ru/post/326116/
