Tagless final interpreters are an alternative approach to the traditional Algebraic Data Type (and Generalized ADT) based implementation of the "interpreter" pattern. This text presents the tagless final approach in Scala and demonstrates how Dotty, with its recently added implicit function types, makes this approach even more attractive. All code examples are direct translations of their Haskell versions presented in Typed Tagless Final Interpreters: Lecture Notes (section 2).
The "interpreter" pattern has recently attracted more and more attention in the Scala community. Much effort has been expended in fighting the most glaring lack of ADT / GADT-based solutions: extensibility. To begin with, you can look at the typeclass Inject
from cats as a realization of the ideas of the Data Type à la Carte . The Freek library provides tools for combining more than two algebras, using decorations using the type-level operations apparatus. The solution proposed by Freer Monads, More Extensible Effects, also emphasizes extensibility, and is inspired by a set of small Scala libraries, such as eff , emm, and paperdoll . Unallocated final interpreters approach in some sense from the opposite side, using class types in their immediate basis instead of more traditional ADT / GADT. They also come with a great advantage in out-of-box extensibility without any obvious hazards.
The lecture notes go into the details of presenting and comparing the ADT/GADT-based and typeclass-based approaches, referring to the former as the initial and to the latter as the final encoding. For the sake of brevity, this text is devoted mainly to final interpreters.
We will work with simple mathematical expressions like those found on calculators. Our task is not only to compute such expressions, but also to serialize, deserialize, and simplify them. From the point of view of a lazy engineer, it is perfectly reasonable to represent the domain with an (embedded) domain-specific language: among other things, it saves us from having to deal with potentially invalid representations of our domain, since the host programming language takes care of that for us. Our domain will consist only of integer literals, addition, and negation. The encoding that first comes to mind might look like this:
sealed trait IExp
final case class Lit(i: Int) extends IExp
final case class Neg(e: IExp) extends IExp
final case class Add(r: IExp, l: IExp) extends IExp
The mathematical expression 8 - (1 + 2), for example, can be encoded as a value of type IExp:
val fe: IExp = Add(Lit(8), Neg(Add(Lit(1), Lit(2))))
So far, everything looks simple, right? Interpreting fe as an integer uses a recursive function of type IExp => Int, serialization happens with a function IExp => Json, deserialization goes in the opposite direction with Json => Option[IExp], and transformations are functions of type IExp => IExp.
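For concreteness, here is a minimal sketch of what the first of these interpreters could look like (the name evalIExp is mine, not from the lecture notes):

// Evaluating the initial encoding by structural recursion:
def evalIExp(e: IExp): Int = e match {
  case Lit(i)    => i
  case Neg(n)    => -evalIExp(n)
  case Add(l, r) => evalIExp(l) + evalIExp(r)
}

// evalIExp(fe) == 5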
In the terminology of the lecture notes, the IExp data type corresponds to the initial encoding of our domain. For now you can forget about it, because instead we will use the final encoding:
trait Exp[T] {
  def lit(i: Int): T
  def neg(t: T): T
  def add(l: T, r: T): T
}
How do we represent 8 - (1 + 2) using Exp? Something like this:
def tf0[T](implicit e: Exp[T]): T =
  e.add(e.lit(8), e.neg(e.add(e.lit(1), e.lit(2))))
In Haskell (with the appropriate language extension), tf0 can be a polymorphic value. In Scala, we use a function with a type parameter T and an implicit Exp[T] constraint. The syntax can be lightened (to some extent) by getting rid of the e. prefix with helper functions:
object ExpSyntax {
  def lit[T](i: Int)    (implicit e: Exp[T]): T = e.lit(i)
  def neg[T](t: T)      (implicit e: Exp[T]): T = e.neg(t)
  def add[T](l: T, r: T)(implicit e: Exp[T]): T = e.add(l, r)
}
import ExpSyntax._ // bring the DSL syntax into scope

def tf1[T: Exp]: T = add(lit(8), neg(add(lit(1), lit(2))))
At this point you are probably wondering: how do we write an interpreter for this tf1? The answer is simple: by defining implementations of Exp!
implicit val evalExp: Exp[Int] = new Exp[Int] {
  def lit(i: Int): Int = i
  def neg(t: Int): Int = -t
  def add(l: Int, r: Int): Int = l + r
}
implicit val printExp: Exp[String] = new Exp[String] {
  def lit(i: Int): String = i.toString
  def neg(t: String): String = s"(-$t)"
  def add(l: String, r: String): String = s"($l + $r)"
}
Interpretation happens by specifying a type. Let's look at tf1 as an Int:
scala> tf1[Int]
res0: Int = 5
What about tf1 as a String?
scala> tf1[String]
res1: String = (8 + (-(1 + 2)))
What if we decide to extend our mathematical expressions with a multiplication operation? In the initial (ADT-based) encoding, we would face two unpleasant alternatives: update the definition of the IExp data type together with all the interpreters written so far, or resort to lifting IExp values into coproducts in the style of Data Types à la Carte. This is where the tagless final approach truly shines: it can be extended in a natural way, without changing (or even recompiling) the code written so far. We introduce a new, completely independent typeclass for the multiplication operation:
trait Mult[T] {
  def mul(l: T, r: T): T
}

object MultSyntax {
  def mul[T](l: T, r: T)(implicit e: Mult[T]): T = e.mul(l, r)
}

import MultSyntax._
Expressions that use multiplication need an additional Mult constraint (an extra implicit Mult[T] argument, that's all). This is how we define tfm1 = 7 - 1 * 2 and tfm2 = 7 * (8 - (1 + 2)):
def tfm1[T: Exp : Mult] = add(lit(7), neg(mul(lit(1), lit(2))))
def tfm2[T: Exp : Mult] = mul(lit(7), tf1)
If adding : Exp : Mult everywhere does not appeal to you, looking ahead I will say what we will see at the end of the article: in Dotty this can be factored out into an implicit function type.
To interpret these new expressions, we must provide Mult implementations for Int and String:
implicit val evalMult: Mult[Int] = new Mult[Int] {
  def mul(l: Int, r: Int): Int = l * r
}

implicit val printMult: Mult[String] = new Mult[String] {
  def mul(l: String, r: String): String = s"$l * $r"
}
Without any additional wiring, the Exp and Mult implementations are automatically combined during interpretation:
scala> tfm1[String]
res2: String = (7 + (-1 * 2))

scala> tfm1[Int]
res3: Int = 5

scala> tfm2[String]
res4: String = 7 * (8 + (-(1 + 2)))

scala> tfm2[Int]
res5: Int = 35
We now move on to the more difficult task of deserialization. The target format is a JSON-like tree structure, defined as follows:
sealed trait Tree
final case class Leaf(s: String) extends Tree
final case class Node(s: String, ts: List[Tree]) extends Tree
Transforming expressions into this JSON-like format is as simple as serializing to String. Depending on how the Exp and Mult implementations are meant to be used, it can be convenient to group them together:
implicit val toTree: Exp[Tree] with Mult[Tree] = new Exp[Tree] with Mult[Tree] {
  def lit(i: Int): Tree = Node("Lit", List(Leaf(i.toString)))
  def neg(t: Tree): Tree = Node("Neg", List(t))
  def add(l: Tree, r: Tree): Tree = Node("Add", List(l, r))
  def mul(l: Tree, r: Tree): Tree = Node("Mult", List(l, r))
}
scala> val tf1Tree = tf1[Tree]
tf1Tree: Tree = Node(Add,List(Node(Lit,List(Leaf(8))), Node(Neg,List(Node(Add,List(Node(Lit,List(Leaf(1))), Node(Lit,List(Leaf(2)))))))))
To deserialize, we need a fromTree function to convert a JSON-like Tree back into the final encoding. Given that our finally-encoded values are functions of type [T] => Exp[T] => T (this is Dotty syntax for the type lambda ({type L[T] = Exp[T] => T})#L), our first guess is to define fromTree as def fromTree[T](t: Tree)(implicit e: Exp[T]): Either[ErrMsg, T]:
type ErrMsg = String

def readInt(s: String): Either[ErrMsg, Int] = {
  import scala.util.{Try, Success, Failure}
  Try(s.toInt) match {
    case Success(i) => Right(i)
    case Failure(f) => Left(f.toString)
  }
}

def fromTree[T](t: Tree)(implicit e: Exp[T]): Either[ErrMsg, T] =
  t match {
    case Node("Lit", List(Leaf(n))) =>
      readInt(n).right.map(e.lit)

    case Node("Neg", List(t1)) =>
      fromTree(t1).right.map(e.neg)

    case Node("Add", List(l, r)) =>
      for (lt <- fromTree(l).right; rt <- fromTree(r).right)
      yield e.add(lt, rt)

    case _ => Left(s"Invalid tree $t")
  }
This works, but since T and Exp[T] must be fully determined at the point where fromTree is called, polymorphism is lost: the result of fromTree can be given only a single interpretation. We work around this problem by wrapping the result in the following way:
trait Wrapped {
  def value[T](implicit e: Exp[T]): T
}
The lecture notes suggest rewriting fromTree with the new signature def fromTree(t: Tree): Either[ErrMsg, Wrapped], but I suspect they lost sight of the fact that we can get the same result by defining an Exp[Wrapped] implementation and reusing our original fromTree:
implicit val wrappingExp: Exp[Wrapped] = new Exp[Wrapped] {
  def lit(i: Int) = new Wrapped {
    def value[T](implicit e: Exp[T]): T = e.lit(i)
  }
  def neg(t: Wrapped) = new Wrapped {
    def value[T](implicit e: Exp[T]): T = e.neg(t.value)
  }
  def add(l: Wrapped, r: Wrapped) = new Wrapped {
    def value[T](implicit e: Exp[T]): T = e.add(l.value, r.value)
  }
}
This is enough to recover first-class polymorphism:
scala> fromTree[Wrapped](tf1Tree) match {
     |   case Left(err) =>
     |     println(err)
     |
     |   case Right(t) =>
     |     println(t.value[Int])
     |     println(t.value[String])
     | }
5
(8 + (-(1 + 2)))
But we have only done half the job: our deserializer still lacks extensibility. To make it possible to add multiplication after the fact, fromTree must be rewritten in an open recursion style. This is another scary name for a very simple idea: we route all recursive calls of fromTree through an additional recur parameter:
// Note that `recur` could simply be `fromTree _`!
def fromTreeExt[T]
  (recur: => (Tree => Either[ErrMsg, T]))
  (implicit e: Exp[T])
  : Tree => Either[ErrMsg, T] = {
  case Node("Lit", List(Leaf(n))) =>
    readInt(n).right.map(e.lit)

  case Node("Neg", List(t)) =>
    recur(t).right.map(e.neg)

  case Node("Add", List(l, r)) =>
    for (lt <- recur(l).right; rt <- recur(r).right)
    yield e.add(lt, rt)

  case t => Left(s"Invalid tree $t")
}
The recursion is then tied together using a fixpoint operator:
def fix[A](f: (=> A) => A): A = f(fix(f))
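To see fix in action on something familiar, here is a quick illustration with an open-recursive factorial (this example is mine, not from the original article):

// Open-recursive factorial: the recursion goes through `recur`...
def factOpen(recur: => (Int => Int)): Int => Int =
  n => if (n <= 1) 1 else n * recur(n - 1)

// ...and `fix` ties the knot:
val fact: Int => Int = fix(factOpen _)
// fact(5) == 120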
def fromTree2[T: Exp](t: Tree): Either[ErrMsg, T] =
  fix(fromTreeExt[T] _)(t)
This makes it possible to define the deserializer for the multiplication operation independently, and to tie the knot a second time:
def fromTreeExt2[T]
  (recur: => (Tree => Either[ErrMsg, T]))
  (implicit e: Exp[T], m: Mult[T])
  : Tree => Either[ErrMsg, T] = {
  case Node("Mult", List(l, r)) =>
    for (lt <- recur(l).right; rt <- recur(r).right)
    yield m.mul(lt, rt)

  case t => fromTreeExt(recur).apply(t)
}
def fromTree3[T: Exp : Mult](t: Tree): Either[ErrMsg, T] =
  fix(fromTreeExt2[T] _)(t)
Now we can test that serialization followed by deserialization is the identity, for any expression e, for example using:
assert(fromTreeN[String](e[Tree]) == Right(e[String]))
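Concretely, with the definitions above (fromTreeN standing in for fromTree2 or fromTree3, and the value name tree being mine), we would expect checks like the following to pass:

// Round-tripping tfm2 through the Tree representation:
val tree: Tree = tfm2[Tree]  // serialized via `toTree`

assert(fromTree3[String](tree) == Right(tfm2[String]))
assert(fromTree3[Int](tree) == Right(35)) // tfm2 evaluates to 35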
Note that in Scala this implementation is not stack-safe. To apply this recursion trick to large data structures, we would need to add trampolining.
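For the curious, here is a minimal sketch of what a trampolined variant could look like using the standard library's scala.util.control.TailCalls; this closed-recursive fromTreeT (the name is mine) gives up open recursion for the sake of brevity:

import scala.util.control.TailCalls._

// A stack-safe (closed-recursive) variant of `fromTree`:
def fromTreeT[T](t: Tree)(implicit e: Exp[T]): TailRec[Either[ErrMsg, T]] =
  t match {
    case Node("Lit", List(Leaf(n))) =>
      done(readInt(n).right.map(e.lit))

    case Node("Neg", List(t1)) =>
      tailcall(fromTreeT[T](t1)).map(_.right.map(e.neg))

    case Node("Add", List(l, r)) =>
      for {
        lt <- tailcall(fromTreeT[T](l))
        rt <- tailcall(fromTreeT[T](r))
      } yield for (lv <- lt.right; rv <- rt.right) yield e.add(lv, rv)

    case other =>
      done(Left(s"Invalid tree $other"))
  }

// fromTreeT[Int](tf1Tree).result == Right(5)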
We have seen how to transform our mathematical expressions into various other representations (interpretation), and how to build our mathematical expressions from another representation (deserialization); but what about transforming a mathematical expression into another mathematical expression?
Consider a transformation that pushes negation down to the literals, so that 8 - (1 + 2) becomes 8 + ((-1) + (-2)). The task sounds easy in the initial (ADT-based) encoding: a simple IExp => IExp function:
def pushNeg(e: IExp): IExp = e match {
  case Lit(_) => e
  case Neg(Lit(_)) => e
  case Neg(Neg(n)) => pushNeg(n)
  case Neg(Add(l, r)) => Add(pushNeg(Neg(l)), pushNeg(Neg(r)))
  case Add(l, r) => Add(pushNeg(l), pushNeg(r))
}
At first sight this looks impossible in the final encoding. Pattern matching is very convenient for transformations that depend on context, so how do we achieve the same convenience inside an Exp implementation? The trick is, instead of working with Exp[T] as we did before, to work with Exp[Ctx => T] for an appropriate context. In this case the context is quite simple: all we need to know is whether or not the sign of the current node must be flipped:
sealed trait NCtx
final case object PosCtx extends NCtx
final case object NegCtx extends NCtx
The transformation is expressed as an Exp[NCtx => T]:
implicit def negDownExp[T](implicit e: Exp[T]): Exp[NCtx => T] = new Exp[NCtx => T] {
  def lit(i: Int): NCtx => T = {
    case PosCtx => e.lit(i)
    case NegCtx => e.neg(e.lit(i))
  }
  def neg(x: NCtx => T): NCtx => T = {
    case PosCtx => x(NegCtx)
    case NegCtx => x(PosCtx)
  }
  def add(l: NCtx => T, r: NCtx => T): NCtx => T =
    c => e.add(l(c), r(c))
}
To apply the transformation, first interpret the expression as NCtx => T, then call the result with the initial context:
scala> tf1[NCtx => String].apply(PosCtx)
(8 + ((-1) + (-2)))
The initial context can also be factored out into a function:
scala> def pushNeg[T](e: NCtx => T): T = e(PosCtx)
pushNeg: [T](e: NCtx => T)T

scala> pushNeg(tf1[NCtx => String])
(8 + ((-1) + (-2)))
Unfortunately, scalac's type inference needs the inner type parameter to be specified explicitly in this case, which gets rather ugly when composing several transformations: pushNeg(pushNeg(pushNeg(tf1[NCtx => NCtx => NCtx => String]))). Improvements to type inference in Dotty make it possible to write simply pushNeg(pushNeg(pushNeg(tf1))): String, just as you would in Haskell. See Dotty and types: the story so far for an introduction to Dotty's type inference.
This transformation extends naturally to the multiplication operation by defining an additional Mult[NCtx => T] implementation:
implicit def negDownMult[T](implicit e: Mult[T]): Mult[NCtx => T] = new Mult[NCtx => T] {
  def mul(l: NCtx => T, r: NCtx => T): NCtx => T = {
    case PosCtx => e.mul(l(PosCtx), r(PosCtx))
    case NegCtx => e.mul(l(PosCtx), r(NegCtx))
  }
}
scala> pushNeg(tfm1[NCtx => String])
(7 + 1 * (-2))

scala> pushNeg(tfm2[NCtx => String])
7 * (8 + ((-1) + (-2)))
The lecture notes continue with another transformation example using a similar context-introduction trick. The transformations we have seen so far have been quite inventive, and you may ask yourself whether everything that can be written in the final encoding can also be expressed in the initial encoding, and vice versa. The two representations are in fact equivalent, which can be demonstrated by the existence of a bijection:
// From the final encoding to the initial ADT
implicit def initialize: Exp[IExp] = new Exp[IExp] {
  def lit(i: Int): IExp = Lit(i)
  def neg(t: IExp): IExp = Neg(t)
  def add(l: IExp, r: IExp): IExp = Add(l, r)
}

// From the initial ADT to the final encoding
def finalize[T](i: IExp)(implicit e: Exp[T]): T = i match {
  case Lit(l) => e.lit(l)
  case Neg(n) => e.neg(finalize[T](n))
  case Add(l, r) => e.add(finalize[T](l), finalize[T](r))
}
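To make the bijection concrete, here is a small round trip under the definitions above (the value names are mine):

// Final => initial: interpreting tf1 at type IExp uses `initialize`:
val asADT: IExp = tf1[IExp]

// Initial => final: re-interpreting the ADT value, here as a String:
val backAgain: String = finalize[String](asADT)
// backAgain == tf1[String] == "(8 + (-(1 + 2)))"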
Implicit function types are a recent addition to the Dotty compiler. The idea is to extend the currently available syntax with support for functions with implicit arguments. As you probably know, functions in Scala are defined roughly as follows:
trait Function1[A, B] {
  def apply(a: A): B
}
The compiler provides syntactic sugar that converts types of the form A => B into Function1[A, B] and allows users to define function values concisely. Implicit functions are analogous: implicit A => B becomes a proper type, desugared into ImplicitFunction1[A, B]:
trait ImplicitFunction1[A, B] {
  def apply(implicit a: A): B
}
A method whose result type is an implicit function type benefits from an additional desugaring that automatically brings the implicit parameters into scope:
def f: implicit Ctx => Unit = ???
Expands to:
def f: implicit Ctx => Unit = {
  implicit $e1: Ctx =>
    ???: Unit
}
This syntactic sugar may not look very useful in such a simple example, but with a type alias things become more interesting:
type Contextualized[T] = implicit Ctx => T

def f: Contextualized[Unit] = ???
When more than one implicit parameter is involved, implicit function types allow something that was not really possible before: abstracting over implicit parameters.
type Constrained[T] = implicit (TC1[T], TC2[T], TC3[T]) => T

def f: Constrained[Int] = ???
Expands to:
def f: Constrained[Int] = {
  ($e1: TC1[Int], $e2: TC2[Int], $e3: TC3[Int]) =>
    implicit $e4: TC1[Int] = $e1
    implicit $e5: TC2[Int] = $e2
    implicit $e6: TC3[Int] = $e3
    ???: Int
}
Returning to the final encoding of our mathematical expressions, Dotty's implicit function types make it possible to extend the encoding with minimal syntactic overhead:
type Ring[T] = implicit (Exp[T], Mult[T]) => T

def tfm1[T]: Ring[T] = add(lit(7), neg(mul(lit(1), lit(2))))
def tfm2[T]: Ring[T] = mul(lit(7), tf1)
This is the end of our revisit of tagless final interpreters (all the (Scala 2.11) code snippets from this article are available here)! If you want to learn more about tagless final interpreters, I strongly recommend continuing with sections 3 and 4 of the lecture notes, which encode the simply typed lambda calculus.
Source: https://habr.com/ru/post/325874/