Introduction to OCaml: The Basics [1]

(Translator's preface: I sat down to learn OCaml and found that there was no Russian translation for beginners. I am filling that gap.)

The basics

Comments

Comments in OCaml are delimited by (* and *), like this:

 (* This is a single-line comment. *)

 (* This is a comment
    spanning several
    lines.
 *)

In other words, comments in OCaml are very similar to comments in C ( /* ... */ ).

OCaml currently has no single-line comments (like #... in Perl or // ... in C99 / C++ / Java). The possibility of using ## ... was discussed at one point, and I strongly recommend that the OCaml folks add this feature in the future (although good editors already make it easy to comment out a single line).
Comments in OCaml nest, which makes it very easy to comment out pieces of code that already contain comments:

(* This code is broken ...

(* Primality test. *)
let is_prime n =
(* note to self: ask about this on the mailing lists *) XXX;;

*)

Calling functions

Suppose you have written a function, call it repeated, which takes a string s and a number n and returns a new string consisting of s repeated n times.

In most C-like languages, the function call will look like this:
repeated ("hello", 3) /* this is C code */

This means “call the function repeated with two arguments, the first argument is the string hello, the second argument is the number 3”.

As in most other functional programming languages, function calls in OCaml are written and parenthesized quite differently, which leads to many mistakes. Here is the same call written in OCaml: repeated "hello" 3 (* this is OCaml code *) .

Note: no parentheses, and no commas between the arguments.

The expression repeated ("hello", 3) is nevertheless meaningful in OCaml. It means "call the function repeated with ONE argument, that argument being a 'pair' structure of two elements". Of course, this is an error, because repeated expects two arguments rather than one, and the first argument must be a string, not a pair. We will not go into the details of pairs (tuples) yet. For now, just remember: putting parentheses and commas around the arguments of a function call is a mistake.

Consider another function, prompt_string, which takes a string containing a prompt and returns what the user typed. We want to pass the result of this function to repeated. Here are the C and OCaml versions:

 /* C code: */
 repeated (prompt_string ("Name please:"), 3)
 (* OCaml code: *)
 repeated (prompt_string "Name please:") 3

Look closely at the placement of the parentheses and the missing comma. In the OCaml version, the parentheses enclose the first argument of repeated, because that argument is the result of calling another function. The general rule: "parentheses go around the whole function call, not around the function's arguments". Here are some more examples:

 f 5 (g "hello") 3 (* f has three arguments, g has one *)
 f (g 3 4) (* f has one argument, g has two *)

 # repeated ("hello", 3) ;;  (* OCaml will report an error *)
 This expression has type string * int but is here used with type string
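
To make the difference between the two calling styles concrete, here is a small sketch. The tutorial never shows the body of repeated, so the definition below is only one possible implementation (it assumes OCaml 4.06 or later for List.init), and repeated_pair is a hypothetical helper added purely for contrast.

```ocaml
(* One possible definition of repeated; the body is an assumption
   made for illustration, not the tutorial's own code. *)
let repeated s n =
  String.concat "" (List.init n (fun _ -> s))

(* Hypothetical variant that really does take ONE argument: a pair. *)
let repeated_pair (s, n) = repeated s n

let () =
  (* Two separate arguments: no parentheses, no comma. *)
  print_endline (repeated "hello" 3);          (* prints hellohellohello *)
  (* One argument that happens to be a pair. *)
  print_endline (repeated_pair ("hello", 3));  (* prints hellohellohello *)
  (* Parentheses go around a nested call, not around the arguments. *)
  print_endline (repeated (repeated "ab" 2) 3)
```

The two functions have different types (string -> int -> string versus string * int -> string), which is why passing a pair to repeated is rejected.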

Defining functions

Presumably you all know how to define functions (or static methods, in Java) in the languages you are used to. But how do we do it in OCaml?

The OCaml syntax is elegant and concise. Here is a function that takes two floating point arguments and calculates the average:

 let average a b =
   (a +. b) /. 2.0 ;;

Type this into the OCaml toplevel (on Unix, just run the ocaml command to start it). [Translator's note: on Ubuntu / Debian, install it with sudo aptitude install ocaml; on SUSE / CentOS / Fedora, with sudo yum install ocaml.] You will see:

 # let average a b =
   (a +. b) /. 2.0 ;;
 val average : float -> float -> float = <fun>

If you look at the function definition and at what OCaml printed, a few questions will come up:

What are those extra semicolons in the code for?
What does float -> float -> float mean?

We will answer these questions in the following sections, but first I would like to define the same function in C (the Java version would look similar), which I hope will raise even more questions. Here is the C version of average:

 double
 average (double a, double b)
 {
   return (a + b) / 2;
 }

Now compare it with the much more compact OCaml version. I hope you are asking yourself:

Why did we not declare the types of a and b in the OCaml version? How does OCaml know what the types are (and does OCaml know the types at all, or is it a completely dynamically typed language)?
In C, the number 2 is implicitly converted to double. Can OCaml do the same?
How do you write the equivalent of return in OCaml?

OK, here are some answers:

OCaml is a strongly, statically typed language (in other words, nothing is converted dynamically, unlike the automatic conversions between int, float and string in Perl).
OCaml uses type inference to work out the types, so you do not have to write them yourself. If you enter code at the OCaml toplevel, as in the example above, OCaml prints the type it has inferred for your function.
OCaml performs no implicit type conversions. If you want a floating-point number (float), you must write 2.0 explicitly, because 2 is an integer. [Translator's note: English uses a point rather than a comma as the decimal separator, hence the name "floating-point number", so OCaml also uses a point between the integer and fractional parts.] OCaml never converts automatically between int, float, string or any other types.
As a side effect of type inference, OCaml does not allow functions (including operators) to be overloaded. OCaml defines + as the addition of two integers. To add two floating-point numbers, use +. (note the trailing dot). Likewise, -. , *. and /. are the other floating-point operations.
OCaml has no return statement: the value of the last expression in a function is automatically the result of the function.
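
The answers above can be tried out with a short sketch. The function average is the one from the text; int_average is a hypothetical integer counterpart added here only to show that + and / belong to int while +. and /. belong to float.

```ocaml
(* Floating-point version from the text: note +. and /. and the literal 2.0. *)
let average a b = (a +. b) /. 2.0

(* Hypothetical integer counterpart: + and / work on int only,
   and integer division discards the remainder. *)
let int_average a b = (a + b) / 2

let () =
  (* No return keyword anywhere: the last expression is the result. *)
  Printf.printf "%f\n" (average 1.0 2.0);
  Printf.printf "%d\n" (int_average 1 2)
```

Writing average 1 2 instead would be rejected at compile time, because 1 and 2 are integers and nothing converts them for you.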

We will discuss this in more detail in the following sections and chapters.

Basic types

Type	Range of values
int	31-bit signed integer on 32-bit systems, 63-bit signed integer on 64-bit systems
float	IEEE double-precision floating-point number, equivalent to double in C
bool	Boolean, either true or false
char	8-bit character
string	String

OCaml uses one of the bits of an int internally for automatic memory management (garbage collection). That is why an int is 31 bits wide rather than 32 (63 bits on 64-bit systems). In normal use this is not a problem, apart from a few specific cases. For example, if you count something in a loop, OCaml limits you to about 1 billion instead of 2 billion on a 32-bit system. This is not really a problem either, because if you are counting close to the limit of int in any language, you should be using a big-number library (the Nat and Big_int modules in OCaml). However, if you need to process 32-bit values (for example, in crypto code or a network stack), OCaml provides a nativeint type that matches the native integer width of your platform.

OCaml has no basic unsigned integer type, but you can get the same effect using nativeint. As far as I know, OCaml has no support for single-precision floating-point numbers.
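
As a rough illustration of the integer widths described above, the following sketch inspects them at run time. It assumes OCaml 4.03 or later (for Sys.int_size); the names int_bits and wrapped are made up for this example.

```ocaml
(* Number of bits in an ordinary int: one less than the machine word. *)
let int_bits = Sys.int_size

(* Ordinary int arithmetic wraps around silently on overflow. *)
let wrapped = max_int + 1

let () =
  Printf.printf "int: %d bits, max_int = %d\n" int_bits max_int;
  Printf.printf "max_int + 1 = %d\n" wrapped;
  (* Int32 and Nativeint are separate, full-width integer types. *)
  Printf.printf "Int32.max_int = %ld\n" Int32.max_int;
  Printf.printf "nativeint: %d bits\n" Nativeint.size
```

On a 64-bit system this reports a 63-bit int and a 64-bit nativeint.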

OCaml provides a char type for representing simple characters. Unfortunately, char supports neither Unicode in general nor UTF-8 in particular. This is a serious flaw in OCaml that ought to be fixed, but for now there are comprehensive Unicode libraries that should help.

Strings are not simply lists of characters. They have their own, more efficient internal representation.

The unit type is somewhat like void in C, but we will talk about it later.

Explicit vs. implicit casts

In C-like languages, an integer is promoted to a floating-point number in certain circumstances. For example, if you write 1 + 2.5 , the first argument (an integer) is converted to floating point, and the result is floating point too. You could get the same effect by writing the cast explicitly: ((double)1)+2.5 .

OCaml never performs an implicit type conversion. In OCaml, 1 + 2.5 is a type error. The addition operator + requires two integer arguments and reports an error if one of them is a floating-point number:

 # 1 + 2.5 ;;
       ^^^
 This expression has type float but is here used with type int

(In the "translated from French" dialect of the error messages, this means "you put a float here, but I was expecting an int".) [Translator's note: OCaml was developed in France, and the author is poking fun at the clumsy translation of the error messages from French into English.]

To add two floating-point numbers, use a different operator, +. (note the trailing dot).

OCaml does not promote integers to floating point, so this is an error too:

 # 1 +. 2.5 ;;
   ^
 This expression has type int but is here used with type float

Now OCaml complains about the first argument.

What if you really do need to add an integer and a floating-point number? (Say they are stored in the variables i and f.) In OCaml, you must convert explicitly:

 (float_of_int i) +.  f ;;

float_of_int is a function that takes an int and returns a float. There is a whole family of such functions, with names like int_of_float , char_of_int , int_of_char and string_of_int , and for the most part they do what you would expect.

Since converting an int to a float is particularly common, float_of_int has a shorter alias, float. The example above can be written as

 float i +.  f ;;

(Note that, unlike C, in OCaml, both the function and the type can have the same name.)
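
The conversion functions named above can be exercised in a few lines. This is only a sketch; the variable names (sum_long, truncated and so on) are invented for the example, and the values of i and f are arbitrary.

```ocaml
let i = 3
let f = 0.5

(* Both spellings perform the same int-to-float conversion. *)
let sum_long = float_of_int i +. f
let sum_short = float i +. f

(* Converting a float to an int drops the fractional part. *)
let truncated = int_of_float 2.7

(* Characters and their integer codes. *)
let code = int_of_char 'A'
let letter = char_of_int 97

(* Integers to text. *)
let text = string_of_int 42

let () =
  Printf.printf "%f %f %d %d %c %s\n"
    sum_long sum_short truncated code letter text
```

Note that int_of_float truncates rather than rounds, which is one of the cases where "what you would expect" deserves a second look.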

What is better - explicit or implicit coercion?

You might think that explicit casts are ugly and tedious. In some ways you are right, but there are at least three arguments in their favor. First, OCaml needs explicit casts to make type inference possible (see below), and type inference is such a wonderful time-saving feature that it easily outweighs the extra keystrokes of explicit casts. Second, if you have ever debugged C programs, you know that (a) implicit casts cause errors that are hard to find, and (b) much of the time you sit there trying to work out where the implicit casts happen. Making the casts explicit helps with debugging. Third, some casts (particularly int <-> float) are in reality very expensive operations. You do yourself no favors by hiding them.

Regular and recursive functions

Unlike in C-like languages, a function in OCaml is not recursive unless you say so explicitly by using let rec instead of plain let. Here is an example of a recursive function:

 let rec range a b =
   if a > b then []
   else a :: range (a + 1) b
   ;;

Note that range calls itself.

The only difference between let and let rec is the scope of the function name. If the function above had been defined with plain let , the call to range would look for an existing (previously defined) function named range , not the one being defined. Using let (without rec ) lets you redefine a value in terms of its previous definition. For example:

 let positive_sum a b =
     let a = max a 0
     and b = max b 0 in
     a + b

This redefinition hides the earlier bindings of a and b from the function definition. In some situations programmers prefer this to introducing new names ( let a_pos = max a 0 ), because it makes the old bindings inaccessible, leaving only the latest values of a and b .

There is no performance difference between functions defined with let and with let rec , so if you prefer, you can always use the let rec form to get the same behavior as in C-like languages.
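
The scoping difference between let and let rec can be seen in a small sketch. The functions range and positive_sum are the ones from the text; step is a hypothetical function added here to show a plain let referring to the previous definition of the same name.

```ocaml
(* let rec: the name range is visible inside its own body. *)
let rec range a b =
  if a > b then []
  else a :: range (a + 1) b

(* Plain let: the step on the right-hand side is the PREVIOUS step. *)
let step x = x + 1
let step x = step x * 2

(* Shadowing the arguments a and b with clamped values. *)
let positive_sum a b =
  let a = max a 0
  and b = max b 0 in
  a + b

let () =
  List.iter (Printf.printf "%d ") (range 1 5);
  print_newline ();
  Printf.printf "%d %d\n" (step 3) (positive_sum (-3) 4)
```

Here the second step computes (x + 1) * 2 because it calls the first one. Had it been written with let rec, it would call itself and never terminate.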

Types of functions

Thanks to type inference, you will rarely need to write down the type of a function explicitly. However, OCaml often prints out what it thinks the types of your functions are, so you need to know the syntax. For a function f that takes arguments of types arg1, arg2, ... argn and returns a value of type rettype, the compiler will print:

 f: arg1 -> arg2 -> ... -> argn -> rettype

The arrow syntax looks strange now, but when we come to so-called currying you will see why it was chosen. For now, here are some examples.

Our repeated function takes a string and an integer and returns a string. Its type is written:

 repeated: string -> int -> string

Our average function, which takes two floating-point numbers and returns one floating-point number, is described as follows:

 average: float -> float -> float

The standard OCaml function int_of_char :

 int_of_char: char -> int

If a function returns nothing ( void in C and Java), we write that it returns the unit type. For example, here is the OCaml equivalent of fputc:

 output_char: out_channel -> char -> unit
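
Although you rarely need to, you can also write these types into the code yourself, and the compiler will check them against what it infers. The sketch below is an illustration only: the body of repeated is an assumed implementation (it needs OCaml 4.06 or later for List.init), and print_twice is a hypothetical function invented to show a unit result.

```ocaml
(* The whole function type written as one annotation. *)
let repeated : string -> int -> string =
  fun s n -> String.concat "" (List.init n (fun _ -> s))

(* Each argument and the result annotated separately. *)
let average (a : float) (b : float) : float = (a +. b) /. 2.0

(* A function called only for its effect has result type unit. *)
let print_twice (c : char) : unit =
  output_char stdout c;
  output_char stdout c

let () =
  print_endline (repeated "ab" 2);
  Printf.printf "%f\n" (average 2.0 4.0);
  print_twice '!';
  print_newline ()
```

If an annotation disagrees with the body, for example declaring average to return int, the program is rejected at compile time.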

Polymorphic functions

Now for something a little stranger. What about a function that takes anything as its argument? Here is an odd function that takes one argument, ignores it, and always returns the number 3:

  let give_me_a_three x = 3 ;;

What is the type of this function? OCaml uses a special placeholder meaning "any type you fancy": a single quote character followed by a letter. The type of the function above is written:

 give_me_a_three: 'a -> int

where 'a in reality means "any type." For example, you can call this function as give_me_a_three "foo" or give_me_a_three 2.0 . Both options will be equally correct from the point of view of OCaml.

It is not yet clear why polymorphic functions are useful, but in fact they are very useful and very common, and we will discuss them later. (Hint: polymorphism is something like templates in C++ or generics in Java.)
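
A quick way to convince yourself that 'a really means "any type" is to call the same function with arguments of different types. In this sketch give_me_a_three is the function from the text, while identity is a hypothetical second example whose result type depends on its argument.

```ocaml
(* From the text: accepts anything, always returns 3.  'a -> int *)
let give_me_a_three x = 3

(* Hypothetical example: returns its argument unchanged.  'a -> 'a *)
let identity x = x

let () =
  assert (give_me_a_three "foo" = 3);
  assert (give_me_a_three 2.0 = 3);
  assert (identity 42 = 42);
  assert (identity "bar" = "bar");
  print_endline "all calls type-check"
```

Some build setups warn that x is unused in give_me_a_three; naming the argument _x silences that without changing the type.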

Type inference

The theme of this tutorial is that functional languages have Many Cool Features, and that OCaml is a language which brings all those Cool Features together in one place, making it very practical for real programmers. The odd thing is that most of these features have nothing to do with "functional programming" as such. In fact, I have arrived at the first Cool Feature and still have not said a word about why functional programming is called "functional". In any case, here is the first Cool Feature: type inference.

Simply speaking, you do not need to declare types for your functions and variables, because OCaml will do it for you.

In addition, OCaml will do all the type checking for you, even between multiple files.

But OCaml is a practical language, and for this reason it contains backdoors into the type system that let you bypass type checking in the rare cases where that makes sense. In all likelihood, only real OCaml gurus will ever need to bypass type checking.

Let us return to the average function, which was introduced at the top level of OCaml.

 # let average a b =
   (a +. b) /. 2.0 ;;
 val average : float -> float -> float = <fun>

Mirabile dictu! OCaml worked it all out by itself: the function takes two floats and returns a float.

How did it do this? First, it looks at where a and b are used, namely in the expression (a +. b) . Since +. is itself a function that always requires two floating-point arguments, simple deduction shows that a and b must both have type float .

Second, the /. function returns a float, and that value is also the return value of average , so average must return a float. The conclusion is that average has this type:

 average : float -> float -> float

Type inference is obviously easy for such a short program, but it works for large programs too, and it is a major time-saving feature, because it removes a whole class of errors which in other languages cause segfaults, NullPointerException and ClassCastException (or important but frequently ignored runtime warnings, as in Perl).
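
The same chain of deductions applies to any function. The sketch below uses two hypothetical functions, hypotenuse and describe, that are not part of the tutorial; the comments spell out the reasoning, and the last two lines are type ascriptions that compile only if the inferred types are the ones claimed.

```ocaml
(* sqrt takes and returns a float, and the operators +. and *. work
   on floats, so x and y must be floats: float -> float -> float *)
let hypotenuse x y = sqrt (x *. x +. y *. y)

(* string_of_int takes an int and ^ joins two strings,
   so n is an int and the result is a string: int -> string *)
let describe n = "n = " ^ string_of_int n

(* These lines are rejected by the compiler if the types differ. *)
let _ = (hypotenuse : float -> float -> float)
let _ = (describe : int -> string)

let () =
  Printf.printf "%f\n" (hypotenuse 3.0 4.0);
  print_endline (describe 7)
```

No type appears in either definition, yet a call such as hypotenuse 3 4 is still caught before the program ever runs.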

Source: https://habr.com/ru/post/108529/
