📜 ⬆️ ⬇️

Great power newtypes

A newtype is a specialized data type declaration. Such that it contains only one constructor and a field.

newtype Foo a = Bar a newtype Id = MkId Word 


Common Beginner Questions


What is the difference from data type data?

 data Foo a = Bar a data Id = MkId Word 

The main specificity of a newtype is that it consists of the same parts as its only field. More precisely, it differs from the original at the type level, but it has the same memory representation, and it is calculated strictly (not lazily).
In short, the newtype is more efficient due to its presentation.
')
Yes, it does not mean anything to me ... I will use data
No, well, in the end, you can always include the -funpack-strict-fields :) extension for strict (not lazy) fields or specify directly

 data Id = MkId !Word 

Yet the power of newtype is not limited to the efficiency of the calculations. They are much stronger!

3 newtype roles




Hiding implementation


 module Data.Id (Id()) where newtype Id = MkId Word 

A newtype is different from the original, internally just Word .
But we hide the MkId constructor outside the module.

Distribution implementation


 {-# LANGUAGE GeneralizedNewtypeDeriving #-} newtype Id = MkId Word deriving (Num, Eq) 

Although this is not in the Haskell2010 standard, thanks to the expansion of the generalized newTypes output, you can automatically infer newtype behavior the same as the internal field behavior. In our case, the behavior of Eq Id and Num Id is the same as the Eq Word and Num Word .

Much more can be achieved through the expansion of the refined derivation ( DerivingVia ), but more on that later.

Implementation of choice


Despite your own constructor, in some cases you can use your own internal representation.

Task


There is a list of integers. Find the maximum and total amount for just one pass through the list.
And do not use the foldl and folds packages .

Typical answer


Of course, fold ! :)

 foldr :: Foldable t => (a -> b -> b) -> b -> ta -> b {- -- instance Foldable [] foldr :: (a -> b -> b) -> b -> [a] -> b -} 

And, the final function is described as:

 aggregate :: [Integer] -> (Maybe Integer, Integer) aggregate = foldr (\el (m, s) -> (Just el `max` m, el + s)) (Nothing, 0) {- ghci> aggregate [1, 2, 3, 4] (Just 4, 10) -} 

If you look closely, you can see similar operations on both sides: Just el `max` m and el + s . In both cases - mapping and binary operation. And the empty elements are Nothing and 0 .

Yes, these are monoids!

Monoid and Semigroup more details
A semigroup is a property of an associative binary operation.

 x ⋄ (y ⋄ z) == (x ⋄ y) ⋄ z 

A monoid is a property of an associative operation (that is, a semigroup)

 x ⋄ (y ⋄ z) == (x ⋄ y) ⋄ z 

which has an empty element that does not change any element either to the right or to the left

 x ⋄ empty == x == empty ⋄ x 


Both max and (+) are associative, both have empty elements - Nothing and 0 .

And the union of the mapping of monoids together with the convolution is the same Foldable !

Foldable more details
Recall the definition of coagulation:

 class Foldable t where foldMap :: (Monoid m) => (a -> m) -> ta -> m ... 


Let's apply the rollover behavior to max and (+) . We will be able to organize no more than one implementation of the Word monoid. It's time to use the newtype option!

 {-# LANGUAGE GeneralizedNewtypeDeriving #-} -- already in Data.Semigroup & Data.Monoid newtype Sum a = Sum {getSum :: a} deriving (Num, Eq, Ord) instance (Num a, Ord a) => Semigroup (Sum a) where (<>) = (+) instance (Num a, Ord a) => Monoid (Sum a) where mempty = Sum 0 newtype Max a = Max {getMax :: a} deriving (Num, Eq, Ord) instance (Num a, Ord a) => Semigroup (Max a) where (<>) = max 

It is necessary to make a remark.

The fact is that in order to be a monoid for the Max a data type, we need a minimum element, that is, for an empty element to exist. So, a monoid can only be a limited Max a .

Theoretically correct monoid of maximal element
 newtype Max a = Max a instance Ord a => Semigroup (Max a) instance Bounded a => Monoid (Max a) 


So somehow we have to convert our data type so that an empty element appears and we can use clotting.

 -- already in Prelude data Maybe a = Nothing | Just a instance Semigroup a => Semigroup (Maybe a) where Nothing <> b = b b <> Nothing = b (Just a) <> (Just b) = Just (a <> b) instance Semigroup a => Monoid (Maybe a) where mempty = Nothing -- ------ instance Functor Maybe where fmap _ Nothing = Nothing fmap f (Just b) = Just (fb) 

The conjugate element Maybe turns a semigroup into a monoid!

Liberalization of restrictions in fresh versions of GHC
Even in GHC 8.2, a monoid was required in the type constraint

 instance Monoid a => Monoid (Maybe a) 

which means we needed another newType:

 -- already in Data.Semigroup & Data.Monoid newtype Option a = Option {getOption :: Maybe a} deriving (Eq, Ord, Semigroup) instance (Ord a, Semigroup a) => Monoid (Option a) where mempty = Option Nothing 

And it is much simpler already in GHC 8.4, where only a semigroup is needed to restrict the type, and even there is no need to create the type Option.

 instance Semigroup a => Monoid (Maybe a) 


Cooldown response


Well, now let's update the code using the rollover and arrows.
Recall that (.) Is just a functional composition:

  (.) :: (b -> c) -> (a -> b) -> a -> c f . g = \x -> f (gx) 

And remember that fmap is a functor:

 fmap :: Functor f => (a -> b) -> fa -> fb 

its implementation for Maybe is described just above.

Arrow more details
Arrows are the properties of some functions that allow you to work with them flowchart.
For more details, see here: Arrows: A General Interface to Computation
In our case, we use the function arrows.
I.e

 instance Arrow (->) 

We will use the functions:

 (***) :: Arrow a => abc -> ab' c' -> a (b, b') (c, c') (&&&) :: Arrow a => abc -> abc' -> ab (c, c') 

For our case
 abc == (->) bc == b -> c 

And, accordingly, the signature of our functions is reduced to:

 (***) :: (b -> c) -> (b' -> c') -> ((b, b') -> (c, c')) (&&&) :: (b -> c) -> (b -> c') -> (b -> (c, c')) 

Or in very simple words, the function (***) combines two functions with one argument (and one output type) into a function with the operation of a pair of arguments at the input and at the output, respectively, a pair of output types.

The function (&&&) is a truncated version (***) , where the type of the input arguments of the two functions is the same, and at the input we do not have a pair of arguments, but one argument.

Total, unifying function acquired the form:

 import Data.Semigroup import Data.Monoid import Control.Arrow aggregate :: [Integer] -> (Maybe Integer, Integer) aggregate = (fmap getMax *** getSum) . (foldMap (Just . Max &&& Sum)) {- -- for GHC 8.2 aggregate = (fmap getMax . getOption *** getSum) . (foldMap (Option . Just . Max &&& Sum)) -} 

It turned out very briefly!

But, it is still tiring to wrap and wrap data from nested types!
You can still cut, and we will help resourceless forced conversion!

Safe non-resource forced conversion and role roles


There is a function from the package Unsafe.Coerce - unsafeCoerce

 import Unsafe.Coerce(unsafeCoerce) unsafeCoerce :: a -> b 

The function forcibly converts the type: from a to b .
In essence, the function is magic, it tells the compiler to consider data of type a as type b , without taking into account the consequences of this step.

It can be used to convert nested types, but you must act very carefully.

In 2014, a revolution occurred with a newtype , namely, a secure resource-free forced conversion appeared!

 import Data.Coerce(coerce) coerce :: Coercible ab => a -> b 

This function has opened a new era in working with newtype .

The coercible force converter works with types that have the same structure in memory. It looks like a class-type, but in fact GHC converts types during compilation and it is impossible to independently determine instances.
The Data.Coerce.coerce function allows nonresource type conversions, but for this we need access to the type constructors.

Now simplify our function:

 import Data.Semigroup import Data.Monoid import Control.Arrow import Data.Coerce aggregate :: [Integer] -> (Maybe Integer, Integer) aggregate = coerce . (foldMap (Just . Max &&& Sum)) -- coerce :: (Maybe (Max Integer), Sum Integer) -> (Maybe Integer, Integer) 

We avoided the routine of pulling out nested types; we did it without wasting resources with just one function.

Roles of nested data types


With the coerce function , we can forcefully convert any nested types.
But should this feature be so widely used?

 -- already in Data.Ord -- Down a - reversed order newtype Down a = Down a deriving (Eq, Show) instance Ord a => Ord (Down a) where compare (Down x) (Down y) = y `compare` x import Data.List(sort) -- Sorted data Sorted a = Sorted [a] deriving (Show, Eq, Ord) fromList2Sorted :: Ord a => [a] -> Sorted a fromList2Sorted = Sorted . sort -- minimum: O(1) ! minView :: Sorted a -> Maybe a minView (Sorted []) = Nothing minView (Sorted (a : _)) = Just a 

Semantically, it is absurd to convert to Sorted a from Sorted (Down a) .
However, you can try:

 ghci> let h = fromList2Sorted [1,2,3] :: Sorted Int ghci> let hDown = fromList2Sorted $ fmap Down [1,2,3] :: Sorted (Down Int) ghci> minView h Just (Down 1) ghci> minView (coerce h :: Sorted (Down Int)) Just (Down 1) ghci> minView hDown Just (Down 3) 

All anything, but the correct answer is Just (Down 3) .
It was in order to cut off the wrong behavior that type roles were introduced.

 {-# LANGUAGE RoleAnnotations #-} type role Sorted nominal 

Let's try now:

 ghci> minView (coerce h :: Sorted (Down Int)) error: Couldn't match type 'Int' with 'Down Int' arising from a use of 'coerce' 

Much better!

In total there are 3 roles ( type role ):


In most cases, the compiler is smart enough to reveal the role of the type, but it can be helped.

Specified Injection DerivingVia Behavior


Thanks to the expansion of the language DerivingVia , the newtype distribution role has improved.

Starting with GHC 8.6, which was recently released, this new extension has appeared.

 {-# LANGUAGE DerivingVia #-} newtype Id = MkId Word deriving (Semigroup, Monoid) via Max Word 

As you can see, the type behavior is automatically derived due to the clarification of how to output.
DerivingVia can be applied to any type that supports Coercible and what's important - completely without the consumption of resources!

Even more, DerivingVia can be applied not only to newtype , but also to any isomorphic types, if they support generics Generics and forced conversion Coercible .

findings


Types newtype is a powerful force that greatly simplifies and improves the code, eliminates the routine and reduces resource consumption.

Original translation : The Great Power of newtypes (Hiromi Ishii)

PS I think, after this article, published more than a year ago [not mine] article The newtype Magic in Haskell about newTypes will become a little clearer!

Source: https://habr.com/ru/post/425405/


All Articles