When the word "polymorphism" immediately recall object-oriented programming, in which polymorphism is one of the pillars ( Polymorphism for beginners ). (And, apparently, more important than the other pillars.) It turns out that you can achieve a similar effect in a different way, which in some cases is more preferable. For example, using classes of types, you can assign new features to already existing types that cannot change the ancestor, or, using a data type with incompatible classes, "solve" the problem of multiple inheritance.
On Habré there are already several publications that give an idea of the type classes:
Comparator[T]
.In this post I would like to reflect the view on the classes of types as one of the everyday tools of programmers. Moreover, if Haskell doesn’t do without them at all, then in Scala you can perfectly tolerably live, not knowing about their existence. However, upon closer inspection, it turns out that such a tool is very useful when writing reusable programs. In addition, there are a number of universal libraries that are based on this tool, and to use them, you also need to understand the types of classes.
In functional programming, immutable data types have gained wide popularity. Since the data cannot arbitrarily change, there is no reason to hide them, and instead of hiding the data, now open types are used, where the data is public. (Thus, among the three pillars of OOP — polymorphism, inheritance, and encapsulation — one is somewhat pushed aside.)
Freely available data allows the use of external processing algorithms. There is no need to bind algorithms to the object itself and to overcome artificial barriers between objects if the processing algorithm uses several objects of different types.
It turns out that a significant part of the set of data structures can be modeled using only two mechanisms ( Algebraic data types ). First, the creation of records or tuples ("type-product"). Secondly, the creation of alternative implementations of the same parent type - enum, interface inheritance, sealed trait - "type-sum".
Example:
// - : sealed trait Form object Form { // - String X String case class Form1(number: String, title: String) extends Form // - UUID X String X Int case class Form2(id: UUID, documentation: String, count: Int) extends Form // == Unit ( ) case object BadForm extends Form }
As you can see, such algebraic data types do not imply data hiding, and the presence of any built-in methods. All data is directly accessible, and we can process these types of external algorithms, for example, such:
All these types of processing are implemented separately and do not have to know anything about each other. In such algorithms, pattern-matching is the main way of processing different subtypes with access to real data. Using pattern-matching, we simultaneously check which particular subtype an object is, and extract the values of the fields of interest.
Placing algorithms outside specific types has the following advantages:
Suppose we implemented some algorithm outside our data types. If this algorithm directly uses our types, then we will not be able to reuse it for other similar data. This, on the one hand, is not bad, since such an algorithm is easier to write, but, on the other hand, its generality is limited. This means, in general, that the algorithm will be used less frequently, and, apparently, it will be worse tested (with the same total economic costs), or the cost of support will be higher.
Therefore, I would like to have mechanisms that allow us to generalize our algorithm to other data types (existing and prospective). This will allow using the same algorithm in many cases and will recoup the costs of its development and testing.
OOP proposes to allocate a common interface for "similar" data and implement the algorithm in terms of this common interface. In concrete classes that inherit this interface, we just need to implement these common methods. Thus, we obtain to some extent a polymorphic algorithm. However, these operations included in the interface of "similar" data, we need to implement in the data itself.
Type classes represent the next step in isolating code that plays different roles in a program. The operations we want to perform on the data are moved to a separate class that is not an ancestor of the data. An instance of this class for this data type is passed to the algorithm along with the data.
Example:
Suppose that in the algorithm of interest to us we use data comparison in order. Such a comparison can be represented by a function.
def compare[T](a: T, b: T): Int
Put this function in the class of the Ordering
type:
trait Ordering[T] { def compare(a: T, b: T): Int }
Now let our universal algorithm do the sorting. It must accept the type of data it works with:
def sort[T](list: List[T]): List[T]
since elements are compared within the algorithm, an algorithm must be passed to the algorithm for our type T
:
def sort[T: Ordering](list: List[T]): List[T]
either, which is the same:
def sort[T](list: List[T])(implicit o: Ordering[T]): List[T]
The algorithm, if it is necessary to invoke the compare
operation, should obtain an instance of the type class using implicitly[Ordering[T]].compare(a,b)
.
We only need to provide an instance of the type class for our data type:
implicit object FormOrdering extends Ordering[Form] { def compare(a: Form, b: Form): Int = (a,b) match { case (Form1(numberA, titleA), Form1(numberB, titleB)) => numberA - numberB case (BadForm, BadForm) => 0 ... case _ => 0 } }
Thus, we achieve a common algorithm without cluttering our data with pieces of code related to a specific algorithm.
How to make methods accessible directly to the type itself? For example, we would like to compare the objects inside the algorithm using the method a compare b
, without explicitly calling a method of the type class.
To do this, use the usual Scala pimp-my-library mechanism:
implicit class OrderingOps[T:Ordering](a:T){ def compare(b:T): Int = implicitly[Ordering[T]].compare(a,b) }
Thus, for all types for which there is an instance of Ordering
, a new "method" will appear.
If such a desire arises every time, then you can use the library simulacrum , which creates such an auxiliary method with all the necessary strapping using macros:
import simulacrum.typeclass @typeclass trait Ordering[T]{ def compare(a: T, b: T): Int }
Consider an example of a type class for a custom data structure. One of the mechanisms used to optimize programs is to rewrite AST with preserving semantics. In this case, all nodes of the tree are traversed (in depth or in width), and for each node, a matching pattern is searched (pattern matching) and in the case of pattern matching, the node is rewritten according to the appropriate rule.
For different tasks (equations, programs), the types that make up the AST tree are different, the matching / optimization patterns are also different. However, the traversal algorithm is the same.
This algorithm is a contender for abstraction using type classes. To an arbitrary type of tree, we must add some operations used in the tree traversal algorithm.
import simulacrum.typeclass @typeclass trait RewritableTree[T] { def children(node: T): List[T] def replaceChildren(node: T, newChildren: List[T]): T }
object RewritableTree { def rewrite[T: RewritableTree](f: PartialFunction[T, T]): T => T = t => { rewrite0(f)(t).getOrElse(t) } private def rewrite0[T: RewritableTree](f: PartialFunction[T, T])(t: T): Option[T] = { import RewritableTree.ops._ // "", simulacrum' val rt = implicitly[RewritableTree[T]] - "" val children = t.children // rt.children(t) var changed = false // , , , val updatedChildren = children.map{child => val res = rewrite0(f)(child) changed = changed || res.isDefined res.getOrElse(child) } // //def rewriteList(lst: List[T], result: mutable.ListBuffer[T], changed: Boolean): (List[T], Boolean) = lst match { // case Nil => (result.toList, changed) // case head :: tail => // val res = rewrite0(f)(head) // rewriteList(tail, result.append(res.getOrElse(head)), changed || res.isDefined) //} //val (updatedChildren, changed) = rewriteList(t.children, mutable.ListBuffer(), false) val updatedTree = if(changed) t.replaceChildren(updatedChildren) else t var changed2 = true val updatedTree2 = f.applyOrElse(t1, (_:T) =>{changed2 = false; updatedTree}) if(changed || changed2) Some(updatedTree2) else None } }
Using the same type class, you can implement the collect
method that collects any values as you traverse the tree.
Suppose we have already implemented a class of type Ordering[T]
for our type T
And we would like to sort the Option[T]
list. Can we use the already implemented type class and simply add the missing functionality?
This can be done if we provide an implementation of a type class on the fly, constructing an implementation of the existing type classes.
implicit def optionOrdering[T:Ordering]: Ordering[Option[T]] = new Ordering[Option[T]] { def compare(a: Option[T], b: Option[T]): Int = (a, b) match { case (Some(c), Some(d)) => implicitly[Ordering[T]].compare(c,d) case (None, None) => 0 case (_, None) => 1 case (None, _) => -1 } }
Such an implementation is automatically inserted into the sorting algorithm for any types for which there is an instance of the class of the type Ordering[T]
.
Similarly, you can construct type classes for any generic types, such as, List[T]
, Tuple2[A,B]
, ...
The operations that are declared inside the type classes can be anything. For our algorithm, we can draw an abstraction boundary arbitrarily enough, for example, put the entire algorithm in a type class, or vice versa, put data access methods directly in a type class. Both of these boundary options are not optimal in terms of reuse. Therefore, it is worthwhile to place the minimum number of operations into types classes that are easy to implement for other types, and our algorithm can be expressed through these operations.
As soon as we begin to look at algorithms and access to data from this point of view, we are more likely to come to some frequently used type classes.
The Scala standard library has several type classes: Ordering[T]
, Equiv[T]
, Numeric[T]
, Integral[T]
, ...
The typelevel / cats library (as well as the scalaz library) has declared several additional classes for simple types with frequently used operations ( http://typelevel.org/cats/typeslasses.html ):
Semigroup
) is a single operation combine
.empty
.For example, for numbers, the operation combine
can be defined as the sum of numbers; in this case, the zero element is the usual zero. We get an additive monoid. You can also use a multiplicative monoid if you take multiplication as a combine
operation, and empty
as a unit. The list of numbers can also be viewed as a monoid. In the role of the operation combine
will be gluing lists, and the zero element is an empty list.
Example:
You can use a monoid to implement accumulation. Create a state with an initial value of empty
from the monoid. Further, when data is received at the input, we can combine
with those that are already in a state. For example, we can take the type Int
with the operation "sum". In this case, the sum of the incoming data will be accumulated in one value. Or take a monoid for List[T]
. In this case, all data will be available in this list (there should be lists at the entrance, or each number should be wrapped in a list). In both cases, the accumulation algorithm is identical - call the combine
method for existing data and new data. And the algorithm does not depend on the specific type with which it works.
Also, if we know about some type that it is a monoid (i.e., there is an instance of the monoid class for this type), then we can use foldLeft
, a convolution for the collection of these elements (we do not need to re-implement it).
In addition to simple base types, type classes can be used to work with types that themselves have parameters. (Thus, a type class requires support for higher order types in the language.) Higher order types are characterized by a kind of "type type":
*
(for example, Int
, String
);* -> *
(for example, List[T]
, Option[T]
, Future[T]
);* -> * -> *
(for example, the Function1[A,B]
function Function1[A,B]
). (Please note that the function itself contains one arrow A => B
, and at the type level A => B => (A=>B)
two arrows (the third arrow is already inside the type itself).)In the cats library, in addition to the type classes that work with base types, there are type classes that are used when working with type constructors. In particular, for types * -> *
:
A functor is a type class that contains one operation - map
. An operation accepts an object, for example, of the List[Int]
and applies the specified function to each element. For List
and Option
, this operation, generally speaking, is already implemented in the data type itself, and one could not create a type class. However, if we want to implement universal algorithms using the map
operation, then such a type class is necessary.
flatMap
operation, or bind
, or >>=
(and also flatten
, map
, pure
). This type of class seems to be the most famous. Its advantage is due to the fact that flatMap
( bind
) is a fairly universal way to glue successive calculations. On the operation flatMap
even based "sugar" in Scala - for-comprehension.Example:
val allChildren = objects.flatMap(_.children)
val street = personOpt.flatMap(_.addressOpt).flatMap(_.streetOpt)
DataTable[T]
. Using flatMap
we can define a subquery that retrieves data from the results of this query. Such a query can be glued to the original query without performing the first query and not working with the collection of results . We can compile the glued query in SQL and send it to the database for execution on the DBMS side. This approach is implemented, for example, in the Slick library.For types * -> * -> *
in the cats library there is also a type class:
compose
+ "zero" element - identity
. The type for which a category type class is defined is called an Arrow. Arrows are like functions. In particular, for normal functions, the compose
operation corresponds to the andThen
method, and the identity
operation corresponds to the identity
function.Examples of categories:
Example:
For categories, the key feature is compose
. Those. if our algorithm can be expressed in terms of compose, then we can apply this algorithm for any categories.
Let us simulate a chain of data transformations using our own DSL. Suppose that each transformation can be represented by some type of Transform[A,B]
.
A
and B
are not necessarily types from our data model. This may be the so-called phantom types . Using phantom types allows you to define your own rules for allowed combinations of transformations that will be checked by the compiler. Those. we cannot use the compose
method for incompatible conversions.
After the user has described his task using this DSL, we can convert our conditional transformations into real actions with data, which are already represented by ordinary functions on real types. The conversion of one category (model functions) into another category (real functions) is called "natural conversion" .
The type classes implemented in the cats library model concepts of category theory. Therefore, the methods of these classes of types must satisfy certain properties (which are described in theory). For example, for a monoid:
a combine empty = a = empty combine a
- the definition of an empty element(a combine b) combine c = a combine (b combine c)
- combine
operation associativityAll properties that are required by category theory for a type class are implemented as "laws" - sets of "properties" of the ScalaCheck library. And it is possible for any instance of a type class to check whether this instance meets the requirements for this type class. Many algorithms rely on these properties, so when you implement type classes for your data, you should check these laws in unit tests.
Once we have verified that our type class implementations satisfy the existing laws, we can be pretty sure that our program is correct, using algorithms from libraries based on these properties of type classes.
Compared to regular interfaces that need to be implemented in descendants, type classes have the following advantages:
empty: T
, or the method parse: String => T
;Option[T]
or for A \/ B
These benefits can be used independently in any program. It is enough to take a different look at the structure of your code.
The cats library (as well as the scalaz library) has a well-organized and tested set of type classes (from category theory), which are used in many algorithms and libraries. , , .
Source: https://habr.com/ru/post/318960/