Amazing country Oz, or how to accept data using send

For a long time, collecting information on the means of parallel programming, I came across an elegant (in other words difficult to describe sensations) language Oz http://www.mozart-oz.org . The language then seemed to me worthy of introducing it to the Habra community. And so, I had time and reasons to do it.

Oz is a multi-paradigm programming language. A set of basic abstractions in the language is unusual and allows, for example, to write the send procedure information so that with its help it will be possible to receive data as well. And without any trick like:

send(socket; buffer; flag) = (if (flag == RECV) (recv(socket; buffer)) or (realsend(socket; buffer))) .
')
We are talking about the fact that sending and receiving data is carried out by the same sequence of operations of the virtual machine Oz. Naturally, this is achieved through special abstractions for working with data and with parallel processes. This text is devoted to the description of these abstractions, because in my opinion, they let you feel the characteristics of Oz quite well. Of course, Oz is more than what is stated below, but, as it seems to me, the secret of the cunning send is the material suitable for the first acquaintance with this language and for receiving pleasure from it.

0. Introduction

You can start with the text of the program that implements the construction of the Port , through which data is transmitted using the send procedure. This program also includes an example of using Port . Source code (proper name in this text):

 
	 declare Port in 
	 
	 local PortTag NewPort IsPort Send in 
		 PortTag = {NewName} 
	 
		 fun {NewPort FS} 
			 local SC in 
				 C = {NewCell S} 
				 FS = !! S 
				 {NewChunk port (PortTag: C)} 
			 end 
		 end 
	 
		 fun {IsPort? P} 
			 {Chunk.hasFeature P PortTag} 
		 end 
	 
		 proc {Send PM} 
			 local Ms Mr in 
				 {Exchange P.PortTag Ms Mr} 
				 Ms = M |  !! Mr 
			 end 
		 end 
	 
		 Port = port 
		 ( 
			 new: NewPort 
			 is: IsPort 
			 send: send 
		 ) 
	 end 
	 
	 declare NewQueueServer in 
	 
	 fun {NewQueueServer} 
		 local Given GivePort Taken TakePort Join in 
			 GivePort = {Port.new Given} 
			 TakePort = {Port.new Taken} 
	 
			 proc {Join Xs Ys} 
				 local Xr Yr XY in 
					 Xs = X |  Xr 
					 Ys = Y |  Yr 
					 X = y 
					 {Join Xr Yr} 
				 end 
			 end 
	 
			 thread {Join Given Taken} end 
		 
			 queue 
			 ( 
				 put: proc {$ X} {Port.send GivePort X} end 
				 get: proc {$ X} {Port.send TakePort X} end 
			 ) 
		 end 
	 end 
	 
	 declare Q in 
	 
	 thread Q = {NewQueueServer} end 
	 
	 {Q.put 1} 
	 {Browse {Q.get $}} 
	 {Browse {Q.get $}} 
	 {Q.put 2}

As you can see, there is no trick. Both queueserver functions — put and get — are implemented using the Send procedure, the semantics of which corresponds to its name. This code could be shorter - Oz has enough syntactic sugar, which is not used above in order to reduce the amount of necessary explanations.

To understand how this code works, you need to understand the (1) model of program execution in Oz: variables, threads, alignment operator ( = ); (2) procedures; (3) cells (cells), memory areas (chunks) and future (futures).

1. Variables, threads, operator `=`

As mentioned, Oz is a multi-paradigm language. In Wikipedia, it is described as functional, logical, imperative, object-oriented, distributed, parallel (in the sense of concurrent) and as a language for programming with constraints. But it is based on another paradigm - dataflow programming. Oz implements this paradigm close enough to the original (used in CPU architectures) concept of computing with data streams, and does it as follows.

1.1. All calculations in Oz occur with the data contained in the store. The storage is a rather abstract thing (the internal mechanics are hidden from the programmer, and the information in the store can be managed only through a certain interface) and distributed (from the store, all participants of Oz-calculations can work, including scattered across the network). Store can store information in three different types. (1) In the form of free or related logical variables. (2) In the form of cells (cells), which are, rather, not memory cells, but named variable pointers to variables. (3) In the form of procedures that are named closures in Oz (naturally, they are first-order objects, that is, they can be explicitly manipulated in the program).

1.2. Calculation occurs when the thread (thread). In this case, the thread in Oz - this is the dataflow. That is, a certain sequence of statements (statements) executed one after another (all as in imperative languages), but with one reservation: the statement is executed only when all the variables used in it turn out to be related. Otherwise, the thread pauses and waits for the moment of binding the desired variables. Actually, the execution of a certain operator after the readiness of the data used in it is the main method of dateflow calculations.

Threads are created in Oz very simply, using the thread ... end construct. Creation of this occurs in the fork() style. That is, the generated thread has access to all the variables (namely variables) that were available to the parent thread at the time of the thread ... end operation.

1.3. Variables in Oz are logical.

A logical variable in the Oz style is a once-linked variable that can be equated with another variable.

In general, this definition (translation from the manual) says little about variables in Oz. They allow you to do a little more with yourself. And they differ significantly from the usual variables in c / javascript / haskell, which are either named memory cells (c / js) or named values (haskell). Probably the best idea about the variables in Oz is a description of some technical details.

1.3.1. The life path of a variable in Oz begins with the fact that it is declared (introduce) with the help of constructions local ... in ... end or declare ... in ... The differences between these constructions are that local ... allows declaring variables only for use in operations that are inside in ... end , and variables declared using declare will be available everywhere after the corresponding in . Naturally, declare can be used to define variables only in the global scope of a module.

The declaration of a variable is that a new node is created in the store (node) referenced by the declared variable. And the value unknown written to the node, indicating that the variable is in no way associated with the data. If a thread accesses such a variable, it will be suspended until the corresponding node is associated with the data. For example (code-1):

 
	 local X in 
		 thread {Browse 2 * X} end 
		 X = 5 
	 end

Browse is one of the standard Oz procedures that prints its argument. When X declared in this example, a variable will be created and a new node in the store corresponding to it. When Browse accesses this node through X , the execution of the corresponding thread will be suspended until the value appears in the corresponding X node (the value in the node itself and the variable binding to the node may change here).

1.3.2. The values in the nodes are placed in the process of unification (unification) or (another name) incremental tell. Unification for a pair of expressions is performed by the operator = (equality operator). In the same process, the binding of variables to nodes may change. The unification algorithm for a pair of expressions (specifically for expressions) in the code E1 = E2 works like this (a recursive procedure with an analysis of the variants of what E1 and E2 can be).

U.1. The results of the calculations of E1 and E2 are the values of the atomic types Oz (atomic types are indivisible types: integers, floating point numbers, literals - strings, for example). In this case, if these values are equal, the operator is executed, and if inequalities are generated, an exception is generated (exceptions in Oz are fairly standard and their processing is specified using the try ... catch ... finally ... end ; finally ; optionally).

U.2. The result of the calculation of E1 is some variable, and the result of the calculation of E2 is a value of some type (symmetric to this situation is the one in which the calculation of E1 leads to a value of a certain type, and E2 to a variable). It can disappear like this:

 
	 X = (Y + Z) * 5

In this case, everything works as follows.

U.2.1. The variable to be set by E1 refers to unknown node. If so, then the value obtained by the calculation of E2 written to this node, as a result of which the E1 variable is bound.

U.2.2. If the E1 variable is already associated with an atomic value, then the value that is in the corresponding node is compared with the E2 value, and if the types do not match or the values themselves, an exception is generated. Otherwise, the statement ends.

It is according to this rule that the variable X linked from code -1. After it is associated with the value (5), the statement inside the called procedure Browse is executed, waiting for this binding. And here it should be clear that the result would be exactly the same if code-1 looked like this:

 
	 local X in 
		 thread {Browse 2 * X} end 
		 5 = X 
	 end

In U.2. Another option is possible when the E1 variable is associated with the value of a composite type, but it will be discussed later in paragraph U.4 ...

U.3. Both E1 and E2 define some variable. Here, too, options are possible.

U.3.1. One of the variables is unbound, or both variables are unbound (that is, they refer to nodes with a value of unknown). In this case, the node of one of the unrelated variables or the unknown-node of the only one that is not connected is discarded. Then, the variable previously referenced to this node is modified so that it starts referring to the same node as the other variable.

Here it should be clear why, by executing code-2:

 
	 local XYZ in 
		 thread {Browse X + Y + Z} end 
		 X = y 
	 % Z = X + YX = 10 
		 X + Y = Z 
	 end

Oz will give the value 40. And why does Oz "hang" if you remove the comment (starts with % ).

U.3.2. Both variables refer to nodes in which values are written. In this case, the behavior is as follows. If these are values of different types, or if they are different atomic values of the same type, an exception will be generated. If these are equal values of some atomic type, then the operator will complete its execution. But nodes can also store values of composite types.

The main composite type in Oz is record. Syntactically, the entries look like this:

 
	 label (feature0: field0 feature2: field2 ... featureN: fieldN)

Approximately, because the above is a closed record, with a fixed number of fields (field), and the records are also open. Record fields can be accessed through a dot, by the name of the corresponding property (feature). The number of fields in a structure is called its arity. More specific example:

 
	 U = habrauser (nick: 'mikhanoid' karma: 10 strength: 10) 
	 K = U.karma

Subtlety: the fields in the record type are themselves Oz variables. In the example above, when creating a value with a record type and a habrauser tag, three variables are created that were unified according to the described algorithm with values of atomic types: 'mikhanoid', 10 and 10. But instead of such values, the unification of variable fields can be performed with any expressions. At this point, it should be clear how this code works:

 
	 local UKS in 
 (*) U = habrauser (nick: 'mikhanoid' karma: K strength: S) 
		 K = S 
		 10 = K 
	 end

In line (*), the unlinked variable U unified with the value of the record type according to rule U.2.1.

Oz records have an important role - they turn the store into a digraph structure: if a node has a value of record type, then with its variable fields it points to other nodes, that is, fields are something like arcs marked with property names. And the described algorithm, thanks to clause U.4., Can be considered as a graph merge algorithm.

U.4. Here you can combine the following cases: (1) both expressions E1 and E2 are record type values, (2) both expressions define variables that refer to nodes in which records are stored, (3) one of E1 , E2 sets a record type value, and another - sets the variable associated with the node in which the record is stored. In all these cases, if the records have (a) different labels, or (b) different arity, or different sets of properties, an exception is generated. If the records coincide in points (a), (b) and ©, then unification of pairs of variables from different records with the same property names occurs.

At this point, it should already be clear why as a result of the work of such code:

 
	 local WXYZ in 
		 W = XZ = Y 
		 f (a: 10 b: X) = f (a: Y b: 20) 
		 {Browse W + Z} 
	 end

the value 30 will be displayed.

Another interesting example is this code:

 
	 local Z in 
		 Z = f (Z 20) 
		 {Browse Z} 
	 end

Here f is a variant of the record in which the field properties automatically get integer names from 1 to its arity. In Oz, such records are called tuples and are analogs of compound expressions (compound term) in logic programming. In the code above, the following happens. (1) creates a value of type record with two fields, that is, two variables are created. (2) the field with property 1 unified with the unbound variable Z, the second field with an atomic value of 20. In the end, the value of f points to the unknown-node pointed to by both the variable Z and the node storing the value 20. (3) to unknown -node to which Z points a value of type entry, the second element of which (like the variable Z) points to this node. Nothing special, just get a cycle in the column store. Browse prints it like this: R10 = f(R10 20) .

Lists in Oz are a variety of tuples. They are organized as standard for functional and logical programming: the list is a tuple, the first element is the head — of which is variable, and the second is the list that constitutes the tail. Mark the head at the tail of the list, you can use the symbol | : Head | Tail Head | Tail

2. Procedures

Oz allows you to abstract operator sequences using procedures. Procedures in Oz are values (first-class objects), so you can freely 'assign' them to variables. In more detail. Oz stores closures that implement procedures in the store in the form of some code. Each such closure procedure gets a unique name for the entire store. Names in Oz are literals - values of atomic type. After creating the procedure code and assigning a unique name to it, the name is written to the node referenced by some variable. After that, the procedure can be called through this variable.

That is why in the source code NewPort , IsPort , Send , ... are declared as variables using local or declare .

The basic design defining the procedure is proc :

 
	 local ... P ... in 
		 ... 
		 proc {P X1 ... XN} S1 ... SM end 
		 ... 
	 end

X1 , ..., XN are formal arguments. S1 , ..., SM is a sequence of statements implementing the procedure. Procedures are invoked using braces:

 
	 {P A1 ... AN}

A1 , ..., AN are valid parameters: expressions, variables, and so on. The call occurs with the declaration of new variables corresponding to the formal parameters, which are unified with the arguments, among which may be unrelated variables. Therefore, the procedure can both receive and return data through any parameter. Therefore, to increase the readability of the code, variables that the programmer intends to use only for returning values can be marked with a prefix ? which is just a comment.

A variable associated with a node that stores a procedure name can also be the result of evaluating an expression, like here, for example:

 
	 {Q.put 1}

In Oz, it is possible to work with anonymous procedures, but it is implemented in an unusual way — through the nesting mechanism. In the sequence of operators enclosed in {...} one of the positions can be occupied by the $ symbol - an attachment label. When Oz encounters such labels inside curly brackets, executing some operator, it (1) automatically creates new variables, according to the number {... $ ...} , in the new local scope, then, in this scope (2) makes calls {... $ ...} , replacing the marker by substituting a new variable created for this call, and (3) inserts these variables into the {... $ ...} places in the executable statement. for example

 
	 local P in 
		 proc {PX? Y} Y = X + 10 end 
		 {Browse {P 20 $} + {P 30 $} + 40} 
	 end

will be executed as

 
	 local P in 
		 proc {PXY} Y = X + 10 end 
		 local X1 X2 in 
			 {P 20 X1} 
			 {P 30 X2} 
			 {Browse X1 + X2 + 40} 
		 end 
	 end

The nesting tag can also be used in the first place inside {...} , but only during the procedure definition. But the investment mechanism will work the same way. Codes:

 
	 local P in 
		 local P1 in 
			 proc {P1 X1 ... XN} S1 ... SM end 
			 P = P1 
		 end 
	 end

and

 
	 local P in 
		 P = proc {$ X1 ... XN} S1 ... SM end 
	 end

are equivalent.

Oz functions are also procedures, and are a certain simplification of the syntax for those cases where it is known that the procedure should exactly return some value. Then you can save on the description of one formal parameter, namely: fun {F X1 ... XN} S1 ... SM end the same as proc {F X1 ... XN Y} S1 ... Y = SM end . Calling {F A1 ... AN} , as a function, Oz automatically turns into a procedure call {F A1 ... AN $}

For example.

 
	 fun {FX} 
		 local K in 
			 K = X / 20 
			 {Browse K} 
		 end 
		 local L in 
			 L = 30 
			 X == L 
		 end 
	 end

meets the definition

 
	 proc {FXY} 
		 local K in 
			 K = X / 20 
			 {Browse K} 
		 end 
		 
		 Y = local L in 
			 L = 30 
			 X == L 
		 end 
	 end

The value of the local ... in ... end block is the value of the last operator in it. That is, in this example, F will be the result of its call to have the answer to the question: is the first and only argument of function 30 equal?

3. Cells (cell), memory areas (chunk) and future (future)

To work with data in store Oz supports a few more abstractions.

3.1. Memory areas are some entries in the store. But unlike conventional recordings, they are identified using unique names (just like procedures) and do not allow arity to be detected. That is, their components can be accessed only by the names of properties (feature). If these names are hidden from some parts of the code, then in these areas, access to the elements of the corresponding memory area will be impossible. Regions are created by calling the function {NewChunk Record} . The call creates a memory region with a unique name, and returns this name. The name can be written to the node referenced by some variable. Then through this variable and the operator . You can refer to the fields of the memory area:

 
	 local XR in 
		 R = f (a: 1 b: 2 c: 3) 
		 X = {NewChunk R} 
		 {Browse Xc} 
	 end

3.2. Cells in Oz are designed to work with states. A cell, like procedures and memory areas, in a store is defined by some unique name. Like a variable, it is a pointer to a certain node, but unlike a variable, nodes that a cell points to can be explicitly and repeatedly set. The cell identifies the next interface.

C = {NewCell E}- creates a cell pointing to the node that will result from the unification of the value of the expression Eand the variable - the formal argument of the procedure NewCell. The new unique name of the newly created cell is returned and assigned to the variable C. The link to the node from the cell, of course, is taken into account when garbage collection. {IsCell C}answers the question: is the variable associated Cwith the cell name?

@C- this expression has the result of its own variable, pointing to the node on which the cell with the name stored in the variable refers C. C := Echanges the pointer in the cell whose name is stored in C, so that it points to the node obtained as a result of the unifications of some unrelated nameless variable and the result of the expression E.

{Exchange C E1 E2}- for one inseparable, atomic operation unifies @Cwith E1and, then, performs C := E2.

A cell, for example, can be used as a counter:

 
	local C in 
		C = {NewCell 0} 
		{Exchange CX thread X + 1 end} 
	 end

It can not be excluded thread ... end. And it works for the reason that for each running thread, Oz automatically creates an unbound variable that is unified with the last of the thread body operators. And this variable in this example is the last argument Exchange.

3.3.The future was first proposed in Smalltalk-72, and then the idea of using them was developed in MultiLISP (1985). Future fairly common tool: they are in Java, accessible via java.util.cuncurrent.Future; their support is scheduled in C ++ 0x. They are supported in Oz, but in an unusual form.

The future for some expression Eis an object associated with asynchronous computing E. The future allows you to perform some operations with a result that Eis not yet computed — for example, use it in function calls, write to lists, transfer to other asynchronous calculations, and so on. In the case when the continuation of work requires exactly the value of the expression E, the reference to its future will suspend execution until the moment when it Eis calculated.

In Oz, ordinary variables have a similar property: some actions with them can be performed without waiting for the values to appear in the nodes to which they refer. Many of the examples already given illustrate this. However, variables in Oz are bidirectional information exchange channels: those to the left and right of the =expression are symmetrical for the unification algorithm. Therefore, in the code:

 
	local X in 
		thread X = 5 end 
		thread X = 7 end 
		X = 3 
		{Browse X} 
	 end

none of the strings stands out as producing data. An exception in the process of unification can cause any thread. The future allows to establish some order in this situation. In Oz, the future is a read-only link to a node, that is, a somewhat limited version of the variable. If the future is involved in the process of unification, then this process is suspended until the corresponding future variable is bound. The future for the variable Xin Oz is formed by the operator !!X. The order in the example above can be made, for example, as follows:

 
	local XY in 
		Y = !! X 
		thread Y = 5 end 
		thread X = 7 end 
		Y = 3 
		{Browse Y} 
	 end

here the value will always be formed only in the second thread.

4. All together

It is convenient to bring the code from the introduction again so that you do not need to scroll through this text to the beginning:

 
00 declare Port in 
 01	 
02 local PortTag NewPort IsPort Send in 
03 PortTag = {NewName} 
 04	 
05 fun {NewPort FS} 
06 local SC in 
07 C = {NewCell S} 
08 FS = !!S 
09 {NewChunk port(PortTag: C)} 
10 end 
11 end 
 12	 
13 fun {IsPort ?P} 
14 {Chunk.hasFeature P PortTag} 
15 end 
 sixteen	 
17 proc {Send PM} 
18 local Ms Mr in 
19 {Exchange P.PortTag Ms Mr} 
20 Ms = M | !!Mr 
21 end 
22 end 
 23	 
24 Port = port 
25 ( 
26 new: NewPort 
27 is: IsPort 
28 send: Send 
29 ) 
30 end 
 31	 
32 declare NewQueueServer in 
 33	 
34 fun {NewQueueServer} 
35 local Given GivePort Taken TakePort Join in 
36 GivePort = {Port.new Given} 
37 TakePort = {Port.new Taken} 
 38	 
39 proc {Join Xs Ys} 
40 local Xr Yr XY in 
41 Xs = X | Xr 
42 Ys = Y | Yr 
43 X = Y 
44 {Join Xr Yr} 
45 end 
46 end 
 47	 
48 thread {Join Given Taken} end 
 49		 
50 queue 
51 ( 
52 put: proc {$ X} {Port.send GivePort X} end 
53 get: proc {$ X} {Port.send TakePort X} end 
54) 
55 end 
56 end 
 57	 
58 declare Q in 
 59	 
60 thread Q = {NewQueueServer} end 
 61	 
62 {Q.put 1} 
63 {Browse {Q.get $}} 
64 {Browse {Q.get $}} 
65 {Q.put 2}

It makes no sense to describe the entire program line by line (which rarely gives an idea of the work of even non-parallel programs), but you need to clarify a few points.

4.1.Creation Port.

Portis a record, the fields are associated with a set of procedures. This is nothing more than a convenient grouping of several procedures. The port itself is a memory area containing a cell under the name, whose uniqueness is guaranteed by the function NewName. The variable PortNameis visible only in the local unit (02-30), so access to the components of the memory area can only be obtained from the procedures IsPort, NewPort, Send. Outside this block, PortNamenothing is known about the variable , so access to the field of memory that stores the port state is unlikely (of course, the name can be guessed, but to encapsulate such a mechanism is quite enough).

Attention must be paid to the functionNewPort(05-11). In addition to returning the memory area of the port itself, the argument returns, through its first argument, the future of the variable that initializes the cell. This is used to initialize Givenand Taken(36-37).

4.2.Procedure Send(17-22).

It should be noted that the memory cell of the port always points to an unknown node. And Sendat the next call, first of all, it guarantees that by linking to the previous node in a variable Ms, and in its place by placing a link to the node in an unbound variable Mr. Further, Sendthe value of the variable is formed Ms, which is obtained in the course of unification with a list tuple, in which the sent message is in the first place (which may be unrelated), and the future in the second Mr(again: the list is just a tuple, in the second place which can stand anything). A subsequent call Sendwill do the same with the variable.Mr, and this process will gradually turn the variable that was used to initialize the cell to NewPort, and the future of which was returned to the outside, into a list. In other words, it Sendreally sends messages to some queue.

4.3.Procedure Join(39-46).

The most important thing here is what this procedure applies to, working in a separate thread (48). It applies to Givenand Taken, which as a result of calls Sendgradually turn into lists. Actual parameters Joinare always future, therefore the unification operators (41-42) are executed only after linking Xsand Ysthat occurs in Send, turning these variables into references to tuples of the form Message | SomeFuture. After that, the Messagecomponents of the incoming ( Given) and outgoing ( Taken) streams are unified, andJoin called for future tails of lists (A bit like quantum mechanics, is it not? Maybe Einstein did not have enough knowledge of parallel programming to build a model of the Universe with hidden variables).

4.4. Eventually.

Any bounded variable exposed to a stream Givenwith the help of a Sendthread will be unified in the thread (48) with the corresponding unbound variable exposed to the stream Taken. Of course, the situation is completely symmetrical, and the names putand getare used for convenience. But queueserver is still asynchronous, which is useful.

5. Other interesting features of Oz

Oz offers among its other features (and many of them are extensive), offers also an interesting model of parallel programming. Which differs from the classical one adopted in Prolog, for example, in that it allows the programmer to define his own search and iterate search procedures. And, of course, the logical programming model adopted in Oz is parallel.

Source: https://habr.com/ru/post/63715/

All Articles