Subtleties of the value restriction in F #

One of the distinguishing features of the F # language, compared with the more common programming languages, is powerful and comprehensive automatic type inference. Thanks to him, in F # programs you almost never specify types explicitly, type less text, and end up with a more concise, fantastically elegant code.

All materials from the translation series of the Russian-speaking community of F # -developers can be found by the tag #fsharplangru .

Automatic type inference algorithms are an exciting topic; behind them is an interesting and beautiful theory. Today we will look at one curious aspect of automatic type inference in F #, which may give you an idea of what difficulties arise in good modern algorithms of this kind, and hopefully explain one stumbling block that F # programmers face from time to time .

Our topic today will be a limit on values . On MSDN there is a good article on the topic of value constraints and automatic type generalization in F #, which gives very reasonable practical advice if you encounter it in your code. However, this article merely states dryly: " The compiler performs automatic generalization only on full definitions of functions with explicitly specified arguments and on simple immutable values ", and gives almost no explanation for this, which is quite fair, because MSDN is just reference material. This post focuses on the reasoning underlying the restrictions on values - I will answer the question "why?", And not "what to do?".

Automatic generalization is a powerful feature for automatic type inference of F #. Define a simple function — for example, an identity function:

let id x = x

There are no explicit type annotations , but the F # compiler prints the type 'a -> 'a for this function: the function takes an argument of some type and returns the result of exactly the same type. This is not particularly difficult, but note that the F # compiler infers the implicit parameter-type 'a for the function id .

We can combine the id function with the List.map function, which itself is polymorphic :

 let listId l = List.map id l

(Not a very useful feature, but for now, visibility is more important). The F # compiler gives the listId function the correct type 'a list -> 'a list - automatic generalization occurred again. Since List.map is a List.map function, we may be tempted to drop the l argument to the left and right:

 let listId = List.map id

But the F # compiler unexpectedly resents:

 Program.fs(8,5): error FS0030: Value restriction. The value 'listId' has been inferred to have generic type val listId : ('_a list -> '_a list) Either make the arguments to 'listId' explicit or, if you do not intend for it to be generic, add a type annotation.

What's happening?

An MSDN article offers 4 ways to correct the let -def, which is rejected by the compiler due to a restriction on values:

Restrict the type so that it ceases to be polymorphic by adding an explicit type annotation to the value or parameter.
If the problem is caused by a non-generic construct in the definition of a polymorphic function (such as the composition of functions or the partial application of arguments to a curried function), try rewriting the definition of the function to ordinary.
If the problem is an expression that is too complicated to generalize, turn it into a function by adding an unused parameter.
Add explicit polymorphic parameter types. This is rarely used.

The first method is not for us - we want the listId function listId be polymorphic.

The second way will bring us back to the explicit indication of the list parameter - and this is the canonical way of defining functions like listId in the F # language.

The third way is applicable when you need to define something that is not a function, in our case it gives an unconvincing option:

 let listId () = List.map id

In the working code, I would use the second solution and leave the function parameter explicit. But for the sake of learning, let's try the "rarely used" fourth method:

 let listId<'T> : 'T list -> 'T list = List.map id

It compiles and works as expected. At first glance it seems that this is a type inference error - the compiler cannot determine the type, so we added an annotation to help it. But wait, the compiler almost brought out this type - it is also mentioned in the error message (with the mysterious type variable '_a )! As if the compiler was stunned by this particular case - why?

The reason is quite reasonable. To see it, let's consider another case of a value constraint. This reference cell does not compile to the list:

 let v = ref []

 Program.fs(16,5): error FS0030: Value restriction. The value 'v' has been inferred to have generic type val v : '_a list ref Either define 'v' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.

Let's get around this with explicit type annotations:

 > let v<'T> : 'T list ref = ref [] val v<'T> : 'T list ref

The compiler is satisfied. Let's try to assign v some value:

 > v := [1];; val it : unit = ()

True, v now refers to a list with a single element 1 ?

 > let x : int list = !v;; val x : int list = []

Oh! Content v is an empty list! Where is our [1] ?

That's what happened. Our assignment can actually be rewritten as:

 (v<int>):=[1]

The left side of this assignment is the application of v to an argument of type int . And v , in turn, is not a reference cell, but a type function : having received a type argument as input, it will return a reference cell. Our expression creates a new reference cell and assigns it to [1] . Similarly, if we explicitly specify the type argument in the dereference v :

 let x = !(v<int>)

we will see that v again applied to the type argument, and returns a fresh reference cell containing an empty list.

To flesh out a conversation about type functions, let's examine the resulting IL code. If we compile the definition of v , our valid Reflector will show us that v is:

 public static FSharpRef<FSharpList<T>> v<T>(){ return Operators.Ref<FSharpList<T>>(FSharpList<T>.get_Empty()); }

What we perceive as a value in F # is actually a generic, parameterless method in the corresponding IL code. Both assignment and dereference v call the IL method, which will return a new reference cell each time.

However, nothing in the expression

 let v = ref []

does not hint at such behavior. The name v looks like an ordinary value, and not at all a method or even a function. If such a definition were allowed, F # developers would expect an unpleasant surprise. That's why here the compiler stops displaying polymorphic parameters - the restriction on values protects you from unexpected behavior.

So, when is automatic summarization safe? It is difficult to give exact criteria, but one simple answer suggests itself: a generalization is safe when the right part of a let expression is at the same time:

Does not contain side effects (in other words, clean )
Returns an immutable object

Indeed, the bizarre behavior of v arises from the variability of the reference cell; precisely because the reference cell is changeable, it was important to us whether one or different cells would be obtained as a result of different calls to v . If the right part of the let expression does not contain side effects, we know that we always get equivalent objects, and since they are immutable, we don’t care if we get the same or different copies of them as a result of various calls.

From the point of view of the compiler, it is difficult, even impossible, to determine whether the above conditions are fulfilled. Therefore, the compiler uses a simple and crude, but natural and understandable approximation: it generalizes the definition only when it can deduce purity and immutability from the syntactic structure of the expression on the right side of let . Therefore, an expression for which the original definition of let listId l = List.map id l is syntactic sugar

 let listId = fun l -> List.map id

generalized - in the right part of the closure; the creation of the circuit does not contain side effects and the circuit is immutable.

Similarly with marked up associations:

 let q = None let z = []

and unchangeable entries:

 type 'ar = { x : 'a; y : int } let r1 = { x = []; y = 1 }

Here r1 is of type 'a list r . However, if you try to initialize any fields of an immutable record by the result of a function call:

 let gen = let u = ref 0 fun () -> u := !u + 1; !u let f = { x = []; y = gen() }

then the value of f will not be generalized. In the example above, gen is an unconditionally dirty function; it could be clean, but the compiler cannot know about it, so it returns an error as a precaution. For the same reason

 let listId = List.map id

does not generalize - the compiler does not know if the pure List.map function List.map or not.

Expressions for which the compiler at the syntax level can determine that they are clean and return immutable objects are called syntactic values . So the restriction on values got its name - the automatic generalization of the right side of the let expression is limited to syntactic values . The F # language description contains a complete list of syntactic values, but our discussion gives an idea of what these expressions are - pure and returning immutable objects.

The problem we are solving here is not new - all compilers of ML family languages use a restriction on values in one form or another. A unique, in my opinion, feature of F # is the ability to bypass the restriction on values using explicit type annotations, and this is safe from the point of view of F # semantics.

When can this be useful? The classic example is lazy and lazy list . A typical definition of lazy (let's pretend that it is not in our language)

 type 'a LazyInner = Delayed of (unit -> 'a) | Value of 'a | Exception of exn type 'a Lazy = 'a LazyInner ref let create f = ref (Delayed f) let force (l : 'a Lazy) = ...

at first glance, full of side effects; the compiler does not know the contract between create and force . If we build the lazy list usual way using the lazy definition

 type 'a cell = Nil | CCons of 'a * 'a lazylist and 'a lazylist = 'a cell Lazy

and try to define an empty lazy list:

 let empty = create (fun () -> Nil)

restriction on values will not allow us to do this; however, the polymorphic use of a lazy list is completely legal; we can declare this by explicitly specifying the parameter of the polymorphic type:

 let empty<'T> : 'T lazylist = create (fun () -> Nil)

This is enough for the empty definition to compile, but if we try to use it:

 let l = empty

the compiler will again be indignant:

 File1.fs(12,5): error FS0030: Value restriction. The value 'l' has been inferred to have generic type val l : '_a lazylist Either define 'l' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.

In fact, the compiler knows that empty is a type function that does not undergo automatic generalization, since it does not belong to a set of syntactic values. F #, however, provides a loophole here - we can specify the [<GeneralizableValue>] attribute in the definition of empty :

 [<GeneralizableValue>] let empty<'T> : 'T lazylist = create (fun () -> Nil)

This forces the compiler to consider empty syntactic value, and the expression let l = empty compile.

In fact, sometimes definitions like our polymorphic v can be useful:

 let v<'T> : 'T list ref = ref []

If you are writing a function that is parameterizable by types and returning mutable objects or having side effects, specify the RequiresExplicitTypeArguments attribute:

 [<RequiresExplicitTypeArguments>] let v<'T> : 'T list ref = ref []

It completely corresponds to its name: now you cannot write v := [1] , only v<int> := [1] , and it will be clearer what is actually happening.

If you have felt all this, I hope that you now have a clear understanding of the restrictions on values in F #, and now you can control it with the help of explicit type annotations and the GeneralizableValue attribute if necessary. Along with force, however, comes responsibility: as correctly stated in an article on MSDN, these features are rarely used in everyday programming in F #. In my F # code, type functions appear only in cases similar to lazylist — the basic cases of data structures; In all other situations, I follow the tips from the article on MSDN:

Restrict the type so that it ceases to be polymorphic by adding an explicit type annotation to the value or parameter.
If the problem is to use a non-generic construct to define a polymorphic function, such as a composition of functions or partial application of the arguments of a curried function, try rewriting the definition of the function to ordinary.
If the problem is that the expression is too difficult to generalize, turn it into a function by adding an unused parameter.

Additional resources

For F #, there are many tutorials, including materials for those who come with C # or Java experience. The following links may be helpful as you learn more about F #:

Several other ways to get started with learning F # are also described.

Finally, the F # community is very friendly to beginners. There is a very active Slack chat, supported by the F # Software Foundation, with rooms for beginners that you can freely join . We strongly recommend that you do this!

Do not forget to visit the site of the Russian-speaking community F # ! If you have any questions about learning the language, we will be happy to discuss them in chat rooms:

Room #ru_general in Slack chat F # Software Foundation
chat in Telegram
chat in gitter

About authors of translation

Author translation @AnutaU .
Translation and editorial changes are made by the efforts of the Russian-speaking community of F # -developers . We also thank @schvepsss for preparing this article for publication.

Source: https://habr.com/ru/post/348460/

All Articles

Subtleties of the value restriction in F #

Additional resources

About authors of translation

More articles: