
Introduction to OCaml: OCaml Program Structure [2]

[Translator's note: this continues the translation; the first article is here.]
[Image: teaser for an article with graphics on OCaml]

OCaml program structure


Now we will spend a little time on a high-level look at some real OCaml programs. I would like to show you local and global definitions, the difference between ;; and ;, modules, nested functions, and references. Along the way we will run into many OCaml concepts that will not make sense to a beginner yet, because we have not met them before. Do not dwell on them; focus instead on the overall shape of the programs and on the language features that I point out.

Local "variables" (actually local expressions)


Take our average function in C and add a local variable to it (compare it with the example in the previous chapter).

 double average (double a, double b)
 {
   double sum = a + b;
   return sum / 2;
 }

Now look at the same thing in OCaml:

 let average a b =
   let sum = a +. b in
   sum /. 2.0 ;;

The standard phrase let name = expression in is used to define a local named expression; name can then be used in place of expression until the ;; that ends the block of code. Notice that we did not even indent after the in. Just think of let ... in as if it were a single statement.

Comparing C local variables with these local named expressions is something of a sleight of hand, because the two differ. The C variable sum has a slot allocated for it on the stack. You can assign any value you want to sum later, or even take the address of the memory where it is stored. None of this is true for the OCaml version. There, sum is just a short name for the expression a +. b. There is no way to assign anything to sum. (A little later we will show how to create variables whose value can change.)

Here is another example to make this completely clear. These two code snippets should return the same value, (a + b) + (a + b)^2:

 let f a b =
   (a +. b) +. (a +. b) ** 2.
   ;;

 let f a b =
   let x = a +. b in
   x +. x ** 2.
   ;;

In theory the second version should be faster (although most compilers will perform this step, known as "common subexpression elimination", for you), and it is definitely easier to read. In the second example, x is just a short name for a +. b.
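
To see what "just a short name" means in practice, here is a minimal sketch. The function describe is a made-up example, not part of the original article. It suggests that writing a second let with the same name does not overwrite anything: it simply introduces a new name that hides the earlier one.

```ocaml
(* describe is a hypothetical function used only for illustration. *)
let describe a b =
  let sum = a +. b in              (* sum names the value of a +. b *)
  let sum_squared = sum *. sum in  (* uses the first sum *)
  let sum = sum /. 2.0 in          (* a new sum that shadows the first one *)
  (sum, sum_squared)

let () =
  let half, squared = describe 1.0 3.0 in
  Printf.printf "half = %.1f, squared = %.1f\n" half squared
```

The value computed for sum_squared is unaffected by the later let, because it was built from the first binding of sum.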

Global "variables" (actually global expressions)


You can also define global names for things at the top level and, like the local "variables" above, they are not really variables at all, just short names for things. Here is a real (albeit slightly trimmed) example:

 let html =
   let content = read_whole_file file in
   GHtml.html_from_string content
   ;;

 let menu_bold () =
   match bold_button#active with
     true -> html#set_font_style ~enable:[`BOLD] ()
   | false -> html#set_font_style ~disable:[`BOLD] ()
   ;;

 let main () =
   (* code omitted *)
   factory#add_item "Cut" ~key:_X ~callback:html#cut
   ;;


This is a piece of real code. html is a widget for editing HTML (an object from the lablgtk library), created once at the beginning of the program by the let html = statement. It is then used in several functions further on.

Note that the name html in the snippet above should not be thought of as a real global variable, as in C or other imperative languages. No memory is allocated to "store" an "html pointer". It is impossible to assign anything to html, for example to rebind it to a different widget. In the next section we will discuss references, which are real variables.

let-binding


Any use of let ..., whether at the top level (globally) or inside a function (locally), is often called a let-binding.
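
As a small sketch of both kinds of let-binding side by side (the names tax_rate and price_with_tax are invented for this example and do not come from the article):

```ocaml
(* Top-level (global) let-binding: visible in the rest of the file. *)
let tax_rate = 0.25

(* Another top-level let-binding, this time naming a function. *)
let price_with_tax price =
  (* Local let-binding: tax is visible only inside this function body. *)
  let tax = price *. tax_rate in
  price +. tax

let () = Printf.printf "%.2f\n" (price_with_tax 100.0)
```

Neither tax_rate nor tax can be assigned to afterwards; both are simply names for values.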

References: Real Variables


And what if you want a real variable, one that you can assign to and change during the program? In that case you need a reference. References are very similar to pointers in C/C++. In Java, all variables that store objects are actually references (pointers) to the objects. In Perl, references are references, the same thing as in OCaml.

Here is how we create a reference to an int in OCaml:

 ref 0 ;;

In fact, such an expression is not very useful. We created the reference and, because we did not give it a name, the garbage collector came along and collected it immediately! (Actually, it was probably thrown away at compile time.) Let's give the reference a name:

 let my_ref = ref 0 ;;

This reference currently stores the integer zero. Let's now put a different value into it (assignment):

 my_ref := 100 ;;

And let's see what the reference contains now:

 # !my_ref ;;
 - : int = 100

So the := operator is used to assign to a reference, and the ! operator dereferences it to get its contents. Here is a rough but useful comparison with C/C++:

 OCaml                     C / C++
 let my_ref = ref 0 ;;     int a = 0; int *my_ptr = &a;
 my_ref := 100 ;;          *my_ptr = 100;
 !my_ref                   *my_ptr


References have their uses, but you may find that you do not need them very often. Much more often you will use let name = expression in to name local expressions in your functions.
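
For a slightly larger illustration of ref, := and ! working together, here is a minimal sketch. The function count_positive is a hypothetical helper written for this note, and List.iter is a standard library function that has not been introduced yet; it simply applies a function to each list element.

```ocaml
(* count_positive is a hypothetical helper, not from the article. *)
let count_positive numbers =
  let counter = ref 0 in            (* a real, mutable variable *)
  List.iter
    (fun n -> if n > 0 then counter := !counter + 1)
    numbers;
  !counter                          (* dereference to read the final value *)

let () = Printf.printf "%d\n" (count_positive [3; -1; 4; -1; 5])
```

Note that counter itself is still an ordinary let-bound name; only the contents of the reference it names can change.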

Nested functions


C does not really have a concept of nested functions. GCC supports nested functions for C programs, but I do not know of any program that actually uses this extension. In any case, this is what the gcc info page has to say about nested functions:
A nested function is a function defined inside another function. (Nested functions are not supported for GNU C++.) The nested function's name is local to the block where it is defined. For example, here we define a nested function named 'square' and call it twice:

 foo (double a, double b)
 {
   double square (double z) { return z * z; }

   return square (a) + square (b);
 }

The nested function can access all the variables of the containing function that are visible at the point of its definition. This is called "lexical scoping". For example, here we show a nested function which uses an inherited variable named offset:

 bar (int *array, int offset, int size)
 {
   int access (int *array, int index)
     { return array[index + offset]; }
   int i;
   /* ... */
   for (i = 0; i < size; i++)
     /* ... */ access (array, i) /* ... */
 }


I hope the idea is clear. Nested functions are, however, very useful and heavily used in OCaml. Here is an example of a nested function from some real code:

 let read_whole_channel chan =
   let buf = Buffer.create 4096 in
   let rec loop () =
     let newline = input_line chan in
     Buffer.add_string buf newline;
     Buffer.add_char buf '\n';
     loop ()
   in
   try
     loop ()
   with
     End_of_file -> Buffer.contents buf ;;

Do not worry if you do not understand all of this code yet; it contains many concepts that we have not talked about. Concentrate instead on the central nested function loop, which takes a single argument of type unit. You can call loop () from inside read_whole_channel, but it is not defined outside that function. The nested function can access the variables defined in the main function (this is how loop gets at the local names buf and chan).

The form for defining a nested function is the same as for a local named expression: let name arguments = function-definition in.

Usually the function definition is indented on a new line (as in the example above). And remember to use let rec instead of let if the function is recursive (as in the example above).
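
The same pattern can be tried without any files or channels. The following is a minimal sketch; sum_scaled is a made-up function, and the match expression it uses is covered later in the tutorial.

```ocaml
(* sum_scaled is a hypothetical example, not taken from the article. *)
let sum_scaled factor numbers =
  (* loop is nested: it exists only inside sum_scaled, and it can use
     factor, which belongs to the enclosing function. *)
  let rec loop acc remaining =
    match remaining with
    | [] -> acc
    | n :: rest -> loop (acc + factor * n) rest
  in
  loop 0 numbers

let () = Printf.printf "%d\n" (sum_scaled 10 [1; 2; 3])
```

Trying to call loop outside sum_scaled would be rejected by the compiler as an unbound name, which is exactly the scoping described above.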

Modules and the open command


OCaml comes with many useful and interesting modules (libraries of useful code). For example, there are standard libraries for drawing graphics, building GUIs with a set of widgets, handling large numbers, data structures, and making POSIX system calls. These libraries are located in /usr/lib/ocaml/VERSION (on Unix). For our examples we will concentrate on one fairly simple module called Graphics.

The Graphics module consists of 5 files (on my system):

 /usr/lib/ocaml/3.08/graphics.a
 /usr/lib/ocaml/3.08/graphics.cma
 /usr/lib/ocaml/3.08/graphics.cmi
 /usr/lib/ocaml/3.08/graphics.cmxa
 /usr/lib/ocaml/3.08/graphics.mli

[Translator's note: on my system (Debian Sid) the modules are placed directly in /usr/lib/ocaml, with no version directory.]

First, let us concentrate on the file graphics.mli. It is a text file, so you can easily read it. First of all, note that the file name is graphics.mli, not Graphics.mli. OCaml always capitalizes the first letter of the file name to get the module name. This can be quite confusing unless you know about it beforehand.

If we want to use the functions in Graphics, there are two ways to do it. Either we write the open Graphics;; declaration at the start of our program, or we prefix every call to those functions, like this: Graphics.open_graph. open is a little like Java's import statement, and rather more like Perl's use statement.

[For Windows users: for this example to work interactively on Windows, you will need to create a custom toplevel. Run a command like this from the command line: ocamlmktop -o ocaml-graphics graphics.]

A couple of examples should make this clear. (The two examples draw different things; try them both.) Note that the first example calls open_graph, while the second calls Graphics.open_graph. [Translator's note: the screenshots at the beginning of the article show what the programs draw.]

 (* To compile this example: ocamlc graphics.cma grtest1.ml -o grtest1 *)

 open Graphics ;;

 open_graph "640x480" ;;
 for i = 12 downto 1 do
   let radius = i * 20 in
   set_color (if (i mod 2) = 0 then red else yellow);
   fill_circle 320 240 radius
 done ;;
 read_line () ;;




 (* To compile this example: ocamlc graphics.cma grtest2.ml -o grtest2 *)

 Random.self_init () ;;
 Graphics.open_graph "640x480" ;;

 let rec iterate r x_init i =
         if i = 1 then x_init
         else
                 let x = iterate r x_init (i-1) in
                 r *. x *. (1.0 -. x) ;;

 for x = 0 to 639 do
         let r = 4.0 *. (float_of_int x) /. 640.0 in
         for i = 0 to 39 do
                 let x_init = Random.float 1.0 in
                 let x_final = iterate r x_init 500 in
                 let y = int_of_float (x_final *. 480.) in
                 Graphics.plot x y
         done
 done ;;

 read_line () ;;

Both examples use some language features that we have not talked about yet: imperative for loops, if-then-else, and recursion. We will discuss these later. Nevertheless, you should still look at the examples and try to work out (1) how they work, and (2) how type inference helps you to catch errors.

Pervasives module


There is one module that you never need to open: the Pervasives module (located in /usr/lib/ocaml/3.08/pervasives.mli [Translator's note: on my system it is /usr/lib/ocaml/pervasives.mli]). All the symbols from Pervasives are automatically imported into every OCaml program.
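
As a quick sketch of what "automatically imported" looks like, the names below (max, abs, string_of_int, the ^ string concatenation operator and print_endline) are all available with no open and no module prefix. One caveat worth stating as background rather than as part of the article: in recent OCaml releases this always-open module goes by the name Stdlib, so the file path above may not exist on a modern installation.

```ocaml
(* None of these names needs an open or a module prefix. *)
let largest = max 3 7
let distance = abs (-5)
let message = "largest = " ^ string_of_int largest

let () = print_endline message
```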

Renaming modules


What if you want to use symbols from Graphics, but you do not want to import all of them and you cannot be bothered to type Graphics every time? Just rename the module using this trick:

 module Gr = Graphics ;;

 Gr.open_graph "640x480" ;;
 Gr.fill_circle 320 240 240 ;;
 read_line () ;;

In fact, this technique is really useful when you want to import a nested module (modules can be nested inside one another) but do not want to type out the full path to the nested module each time.
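
Here is a minimal sketch of the nested-module case that runs without the Graphics library. It assumes the standard library module StdLabels, which contains a nested List module whose functions take labelled arguments; the alias name LL is an arbitrary choice made for this example.

```ocaml
(* LL is an arbitrary short alias chosen for this example. *)
module LL = StdLabels.List

(* Without the alias these would be StdLabels.List.map and
   StdLabels.List.fold_left. *)
let doubled = LL.map ~f:(fun x -> x * 2) [1; 2; 3]
let total = LL.fold_left ~f:(+) ~init:0 doubled

let () = Printf.printf "%d\n" total
```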

When to use and when to skip ;; and ;


Now we come to a very important question. When should you use ;;, when should you use ;, and when should you use neither? This is a tricky question until you "get it", and it puzzled the author for a long time while he was learning OCaml.

Rule number 1: you should use ;; to separate statements at the top level of your code, and never inside function definitions or any other kind of statement.

Let's look at a fragment of the second graphics example:

 Random.self_init () ;;
 Graphics.open_graph "640x480" ;;

 let rec iterate r x_init i =
         if i = 1 then x_init
         else
                 let x = iterate r x_init (i-1) in
                 r *. x *. (1.0 -. x) ;;


We have two statements at the top level and the definition of the iterate function. Each of them ends with ;;.

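Rule number 2: sometimes you can omit ;;. As a beginner you should not worry too much about this and can always write ;; as rule number 1 prescribes. But when you read other people's code you will occasionally find ;; missing. The places where ;; may be dropped are: before the keyword let, before the keyword open, before the keyword type, at the very end of the file, and in a few other (very rare) places where OCaml can work out that a new statement is starting rather than the current one continuing.

Here is some sample code with ;; omitted wherever possible: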

 open Random (* ;; *)
 open Graphics ;;

 self_init () ;;
 open_graph "640x480" (* ;; *)

 let rec iterate r x_init i =
         if i = 1 then x_init
         else
                 let x = iterate r x_init (i-1) in
                 r *. x *. (1.0 -. x) ;;

 for x = 0 to 639 do
         let r = 4.0 *. (float_of_int x) /. 640.0 in
         for i = 0 to 39 do
                 let x_init = Random.float 1.0 in
                 let x_final = iterate r x_init 500 in
                 let y = int_of_float (x_final *. 480.) in
                 Graphics.plot x y
         done
 done ;;

 read_line () (* ;; *)

Rules number 3 and 4 concern the single ;. They are completely different from the rules for ;;. A single semicolon ; is what is known as a sequence point [Translator's note: my translation of "sequence point" may be inaccurate], which plays exactly the same role as the single semicolon in C, C++, Java, or Perl. It means "do everything before this point first, and then do everything after it". I bet you did not know that.

Rule number 3: treat let ... in as a statement and never put a single ; after it.

Rule number 4: end every other statement within a block of code with a single ;, except the very last one.

The inner for loop in the example above is a good illustration. Note that we never use a single ; in this code.

         for i = 0 to 39 do
                 let x_init = Random.float 1.0 in
                 let x_final = iterate r x_init 500 in
                 let y = int_of_float (x_final *. 480.) in
                 Graphics.plot x y
         done

The only place where a ; might be considered is after the Graphics.plot x y line, but because this is the last statement in the block, rule number 4 says not to put one there.
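
Because the loop above contains no ; at all, it may help to see rules 3 and 4 in a block that does need them. This is a minimal sketch; render_row is a hypothetical helper, and it uses the standard Buffer module that also appears in read_whole_channel.

```ocaml
(* render_row is a hypothetical helper used only for illustration. *)
let render_row label value =
  let buf = Buffer.create 16 in                 (* rule 3: no ; after let ... in *)
  Buffer.add_string buf label;                  (* rule 4: ; after each statement *)
  Buffer.add_string buf ": ";
  Buffer.add_string buf (string_of_int value);
  Buffer.contents buf                           (* the last statement gets no ; *)

let () = print_endline (render_row "width" 640)
```

The value of the last expression, Buffer.contents buf, becomes the result of the whole function.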

Note regarding ";"


Brian Hurt corrected me:
; is an operator just like, for example, the addition operator +. Well, not quite like +, but conceptually the same. + has type int -> int -> int: it takes two integers and returns an integer (their sum). ; has type unit -> 'b -> 'b: it takes two values and simply returns the second one, rather like the comma operator in C. You can write a; b; c; d just as easily as you can write a + b + c + d.

This is one of those fundamental concepts that give you an understanding of the language but are never really spelled out: literally everything in OCaml is an expression. if/then/else is an expression. a; b is an expression. match foo with ... is an expression. All of the code below is perfectly valid (and all the definitions do the same thing):

  let f x b y = if b then x + y else x + 0

  let f x b y = x + (if b then y else 0)

  let f x b y = x + (match b with true -> y | false -> 0)

  let f x b y = x + (let g z = function true -> z | false -> 0 in g y b)

  let f x b y = x + (let _ = y + 3 in (); if b then y else 0)

Look especially at the last one: I use ; as an operator to "join" two statements. Every function in OCaml can be expressed as:
  let name [parameters] = expression

The definition of "expression" in OCaml is just somewhat broader than in C. In fact, C has the concept of "statements", but all of C's statements are just expressions in OCaml (combined with the ; operator).

The one way in which ; differs from + is that + can be referred to as a function. For example, I can define a function sum_list to sum a list of integers like this:
 let sum_list = List.fold_left (+) 0
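
A short usage sketch ties these two points together. It repeats the sum_list definition so that it can be run on its own; the names trace and answer are invented for the example. The parenthesised a; b is used as an ordinary expression whose value is the value of b.

```ocaml
(* (+) turns the infix operator into an ordinary function value. *)
let sum_list = List.fold_left (+) 0

(* trace and answer are hypothetical names for this example. *)
let trace = Buffer.create 16

(* a; b is itself an expression: a runs for its side effect,
   and the value of the whole thing is the value of b. *)
let answer = (Buffer.add_string trace "computing"; sum_list [1; 2; 3; 4])

let () = Printf.printf "%s -> %d\n" (Buffer.contents trace) answer
```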



All together: some real code


In this section we will show some real code fragments from the lablgtk 1.2 library (lablgtk is the OCaml interface to the native Unix widget library). A warning: this code is full of things that we have not talked about yet. Do not get lost in the details; look instead at the overall structure of the code, where the authors used ;;, where they used ;, where they used open, how they indented the code, and how they used local and global named expressions.

... However, I will give you a few hints so that you do not get completely lost.



The first fragment: the programmer opens a couple of standard libraries (omitting the ;; because the next keyword is open or let). He then creates a function called file_dialog. Inside this function he defines a named expression called sel using the two-line statement let sel = ... in. He then calls several methods on sel.

 (* First snippet *)
 open StdLabels
 open GMain

 let file_dialog ~title ~callback ?filename () =
   let sel =
     GWindow.file_selection ~title ~modal:true ?filename () in
   sel#cancel_button#connect#clicked ~callback:sel#destroy;
   sel#ok_button#connect#clicked ~callback:do_ok;
   sel#show ()

The second fragment: just a long list of global names at the top level. Note that the author has left out every single ;;, thanks to rule number 2.

 (* Second snippet *)

 let window = GWindow.window ~width:500 ~height:300 ~title:"editor" ()
 let vbox = GPack.vbox ~packing:window#add ()

 let menubar = GMenu.menu_bar ~packing:vbox#pack ()
 let factory = new GMenu.factory menubar
 let accel_group = factory#accel_group
 let file_menu = factory#add_submenu "File"
 let edit_menu = factory#add_submenu "Edit"

 let hbox = GPack.hbox ~packing:vbox#add ()
 let editor = new editor ~packing:hbox#add ()
 let scrollbar = GRange.scrollbar `VERTICAL ~packing:hbox#pack ()

The third fragment: the author imports all the symbols from the GdkKeysyms module. Then comes an unusual let-binding. let _ = expression means "compute the value of the expression (with all of its side effects), but throw the result away". In this case, "computing the value of the expression" means running Main.main (), which is the main Gtk loop; its side effect is that the window appears on screen and the whole application runs. The "result" of Main.main () is insignificant. It is probably unit, but I have not checked, and in any case it is not returned until the application finally exits.

Note that this fragment consists of a long chain of essentially procedural commands. This is a real, classic imperative program.

 (* Third snippet *)

 open GdkKeysyms

 let _ =
   window#connect#destroy ~callback:Main.quit;
   let factory = new GMenu.factory file_menu ~accel_group in
   factory#add_item "Open ..." ~key:_O ~callback:editor#open_file;
   factory#add_item "Save" ~key:_S ~callback:editor#save_file;
   factory#add_item "Save as ..." ~callback:editor#save_dialog;
   factory#add_separator ();
   factory#add_item "Quit" ~key:_Q ~callback:window#destroy;
   let factory = new GMenu.factory edit_menu ~accel_group in
   factory#add_item "Copy" ~key:_C ~callback:editor#text#copy_clipboard;
   factory#add_item "Cut" ~key:_X ~callback:editor#text#cut_clipboard;
   factory#add_item "Paste" ~key:_V ~callback:editor#text#paste_clipboard;
   factory#add_separator ();
   factory#add_check_item "Word wrap" ~active:false
     ~callback:editor#text#set_word_wrap;
   factory#add_check_item "Read only" ~active:false
     ~callback:(fun b -> editor#text#set_editable (not b));
   window#add_accel_group accel_group;
   editor#text#event#connect#button_press
     ~callback:(fun ev ->
       let button = GdkEvent.Button.button ev in
       if button = 3 then begin
         file_menu#popup ~button ~time:(GdkEvent.Button.time ev); true
       end else false);
   editor#text#set_vadjustment scrollbar#adjustment;
   window#show ();
   Main.main ()
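
The let _ = idiom can be tried without lablgtk. The sketch below is only an illustration: started stands in for a side effect such as a window appearing, and main is a hypothetical entry point that returns a number nobody needs.

```ocaml
(* started is a stand-in for a side effect such as showing a window. *)
let started = ref false

(* main is a hypothetical entry point; its return value is not needed. *)
let main () =
  started := true;
  3

(* Evaluate main () for its side effect and throw the result away. *)
let _ = main ()

let () = Printf.printf "started: %b\n" !started
```

When the expression is known to return unit, it is also common to write let () = expression instead, so that the compiler checks that no meaningful result is being discarded.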


[Translator's note: if you spot any errors or awkward wording, write to me and I will fix them.]

Source: https://habr.com/ru/post/108532/

