📜 ⬆️ ⬇️

Concurrent programming for beginners in the Elixir / Erlang VM YaP using the example of the “Euler's horse” task



Introduction


A little more than a year ago I did a very important act in my life - I downloaded the first program in my life from the Microsoft IDE Visual Studio website and wrote, strangely enough, “Hello, World!”. Over the next six months, I read the well-known Stroustrup book, got a job as a junior C ++ developer, tried to write in Lua, Python, but did not achieve any significant success - my libraries did not work, programs hardly compiled and fell at runtime, pointers indicated not to those areas of memory (which, by the way, has always flowed away somewhere), and attempts to use more than one thread (C ++ 11!) led to memory damage and deadlock. About how the code looked, it is better just to remain silent.

What is it for me? In addition, in my personal opinion / experience, imperative languages, due to their peculiarities, are completely inappropriate for novice developers. Without knowledge of industrial programming patterns, any information about the operation of the operating system and elementary code culture, it is very difficult to write something bearable on them. They give too much freedom and space for crutches and bicycles, while functional languages ​​severely restricting the developer in some things leave him not many opportunities to write bad code, forcing him to think and develop.
')
About six months ago, I realized that it was time to change something, and after half an hour of searching on the Internet I found the specifications of the YaR Erlang. In the article, the author presented Erlang as a “wonderful pill” from all my problems described above, and in general, for the most part, he turned out to be right. So I started to program on Erlang, and then on Elixir.

Elixir language


Elixir is a language built on top of Erlang, the result of the compilation is Erlang VM bytecode. It favorably differs from Erlang by the simplicity of its syntax and powerful tools for meta-programming (people familiar with Lisp immediately recognize the quote-unquote constructs). Accordingly, all Erlang functionality, any of its modules and, most importantly, the OTP framework are available for use.

The data types are the same as in Erlang. Data is immutable, the result of actions with them is new data. In Elixir, as in many functional languages, the principle of "Everything - expression" works. Any expression will return a value.

YP Elixir has an excellent interpreter, which is installed along with the language, you can try out examples in it.

Basic data types


int, float
1 + 1 # => 2 3 / 2 # => 1.5 1.0 + 3 # => 4.0 


binary
 "Hello"<>" World!" # => "Hello World!" "Hello #{World!}" # => "Hello World!" """ hello """ # => "hello\n" 


atom
A constant that represents only its value and nothing more. There is no separate logical type: true, false, nil are also atoms, just for them there are corresponding conventions in the language.
 is_atom(true) # => true is_atom(:true) # => true is_atom(:hello) # => true 


list
 [1,2,3,4,"five"]++[6, "seven"] # => [1, 2, 3, 4, "five", 6, "seven"] [1,1,2,3,4]--[1,2,3,5] # => [1, 4] 

For historical reasons, the lines in Erlang are presented in the form of a list, and not in the form of a binary (in Elixir, like vice versa), and therefore the list can also be represented as
 is_list 'this is list' # => true 'lst'++[10000] # => [108, 115, 116, 10000] 


tuple
Something like list, the difference in the organization of data storage in memory, and accordingly in the library tools for working with data.
 {:hello, "world"} 


map
 %{2 => 2, :c => {1, 2, 3}, "key1" => 1} #      ,  ,     %{a: 1, b: 3, c: "qweqwe"} 


PID
VM process ID Erlang. The interactive shell of the interpreter is also a process, so we use the self function, which returns the PID of the process from which it was called.
 self # => #PID<0.95.0> (     ) 


It is possible to explicitly do transforms of some types.
 :erlang.tuple_to_list {1,2,3} # => [1, 2, 3] :erlang.integer_to_binary 123 # => "123" :erlang.binary_to_list "abc" # => :abc 


Also, one of the advantages of the language is that any term can be completely painlessly converted to binary and vice versa. Serialization of any data in one line.
 res = :erlang.term_to_binary [:hello, {"world", '!!!'}] # => <<131, 108, 0, 0, 0, 2, 100, 0, 5, 104, 101, 108, 108, 111, 104, 2, 109, 0, 0, 0, 5, 119, 111, 114, 108, 100, 107, 0, 3, 33, 33, 33, 106>> :erlang.binary_to_term res # => [:hello, {"world", '!!!'}] 


Pattern matching



Data can be locally assigned to variables (global variables and shared memory are not and cannot be here)
 x = 1 # => 1 #  x  c 1 x # => 1 #         (  ) x = 1+x # => 2 


But to do this is absolutely not necessary. The code is easier, more beautiful and more expressive if you do not use variables at all . Strictly speaking, the "=" operator in the Elixir language is not an assignment in the usual sense for imperative languages. Here there is a pattern matching (pattern matching). Its essence is easier to show by example
 %{a: x, b: 1} = %{a: 1, b: 1} # => %{a: 1, b: 1} %{a: x, b: 1} = %{a: 1, b: 2} # => ** (MatchError) no match of right hand side value: %{a: 1, b: 2} %{a: x, b: 1} = %{b: 1} # => ** (MatchError) no match of right hand side value: %{b: 1} 


In terms of imperative languages, pattern matching is a combination of comparison and assignment: the elements of a structure connected to a value to the left of = are compared with the corresponding elements of the structure to the right of =, and not connected - are associated with the corresponding values ​​of the structure's elements on the right . It is logical that the structure on the right cannot have elements unrelated to any value. If one of the comparisons returns false, an exception is thrown.

Pattern matching can also (and should) be done on binary data.
 << a :: binary-size(5), rest :: binary >> = "hello, world" # => "hello, world" a # => "hello" rest # => ", world" 


By the way, pay attention to this example:
 x = 1 # => 1 x == 1 # => true %{a: x, b: y} = %{a: 12, b: 1} # => %{a: 12, b: 1} x # => 12 


Pattern matching was successful, there is no exception. Although it would seem, x should be associated with 1, and 1! = 12. In this sense, Elixir differs from more stringent functional languages ​​(including Erlang), which is why using pattern matching inside functions often leads to confusion and clutter. code; this should be avoided. The real power of pattern matching can only be felt if used in function declarations and case expressions.

Functions and modules


The language is functional, therefore the main expressive means of the language is function. Functions can be taken as functions as arguments and returned as values. Functions are defined inside modules, modules can also be defined inside other modules.

 defmodule Some do defp priv_func(arg) do arg<>"Hello from priv_func!" end def pub_func(arg) when is_binary(arg) do arg<>priv_func("Hello from pub_func!\n") end end 


To define a public function that can be called outside of this module — you need to use def, and defp allows you to redistribute a function that is available only inside this module (see how much simpler it is compared to Erlang). After the function name and arguments, there can be expressions called guard expressions (look at the definition of pub_func), they allow you to narrow the scope of the function (in the mathematical sense).

 Some.pub_func "Hello from shell!\n" # => "Hello from shell!\nHello from pub_func!\nHello from priv_func!" Some.pub_func 1 # => ** (FunctionClauseError) no function clause matching in Some.pub_func/1 Some.priv_func "Hello" # => ** (UndefinedFunctionError) undefined function: Some.priv_func/1 Some.priv_func("Hello") 


There are 2 almost equivalent ways to define a lambda (anonymous function) in the Elixir language:

 sum = &(&1+&2) # => &:erlang.+/2 sum.(1,2) # => 3 sum = fn(x,y) -> x + y end # => #Function<12.106461118/2 in :erl_eval.expr/5> sum.(1,2) # => 3 


As you can see from the example, a call to a lambda differs from a call to a normal function only by the “.” Functions can be passed as arguments not only lambdas, but also ordinary functions. To do this, put a “&” sign in front of the function name, and then specify its arity.

 defmodule Some do def actor(arg1, arg2, func) do func.(arg1, arg2) end def div(arg1, arg2) do arg1 / arg2 end end sum = &(&1+&2) # => &:erlang.+/2 Some.actor(1,2,sum) # => 3 Some.actor(3,4, &Some.div/2 ) # => 0.75 


Detailed documentation on the Yap Elixir and its standard modules can be read here , and on Erlang / OTP here .

Solution of the problem “Euler's horse”


When I first started learning C ++ and was preparing for the entrance exams to the master's degree at the VMK (naively believing that they would learn how to code cool), in one of the options for the entrance exams I came across this task. Then I could not solve it. For some reason, yesterday she remembered me again, and, throwing a decision in an hour, decided to write this article.

In general, the essence: “There is a square chessboard of arbitrary size, a knight stands on it in an arbitrary cell, there are no other pieces on the board. It is necessary to pass a horse through all the cells of a chessboard, having been in each one exactly once, or to prove that this cannot be done. ”

Since there are no specific conditions determining the size of the board and the starting position of the knight, it is quite difficult, tedious and not rational to make abstract mathematical conclusions here, so the most obvious solution is brute force.

To begin with, we will define the data structures that we will operate on in our functions. This will allow to achieve a higher level of abstraction and significantly simplify the solution. In addition, the structures thus defined make the code more deterministic.

 # structs of data defmodule Position do defstruct first: 1, second: 1 end defmodule GameState do defstruct current_pos: %Position{}, path: [] end 


Position - the position of the horse on the board at any time. Here we consider the two-dimensional case, but as will be seen later, the functional code is designed in such a way that this solution can be very easily generalized for a space of any dimension.
GameState - the current state of the game, is uniquely determined by the current position and the distance traveled.
As you can see, we define defaults for structure fields, so we get something like a class constructor. The structures in Elixir are based on the map data type, and are very similar in usage / syntax.

 defmodule Some do defstruct a: 1, b: nil end res = %Some{} # => %Some{a: 1, b: nil} is_map res # => true Map.keys res # => [:__struct__, :a, :b] Map.values res # => [Some, 1, nil] 


Next, we write the solution in general. The init / 1 function takes the size of the edge (in this case, the side of the square) of the board as an argument, randomly determines the starting position of the knight (and the game state accordingly), informs the user about it using the function inform_user_about_beginning / 1, then calls the game / 2 function, which returns many possible workarounds, and then the show_game_results / 2 function, which tells the user about the results. Pay attention to the operator "|>". It simply passes the expression on the left as the first argument of the function on the right. The problem is solved, it remains to determine the functions that are not yet defined.

 def init(limit) when (is_integer(limit) and (limit > 0)) do :random.seed(:erlang.now) [ %GameState{ current_pos: %Position{ first: :random.uniform(limit), second: :random.uniform(limit)} |> inform_user_about_beginning } ] |> game(limit) |> show_game_results(limit) end 


For the function inform_user_about_beginning, I think everything is clear - it takes an argument, displays it on the screen and returns it.

 defp inform_user_about_beginning info do IO.puts "Hello, user, we begin from #{inspect info}" info end 


The show_game_results function is a bit more complicated. As a result of our algorithm, we need to get a list of possible workarounds. Naturally, we want to see different messages for the empty and non-empty list. Instead of if-else or case expressions inside one function, in this case it is better to write two separate clauses for the show_game_results / 2 function. This simplifies the code, makes it more visual and readable. The general rule is this: when a function is called, clauses for a given function begin to get over, in the order in which they are written out in the code. In this case, pattern matching of all the arguments of the function in the given clause and the corresponding arguments passed from the outside is performed. If pattern matching succeeds, the function returns the expression of this clause, if not, the next clause is taken. If no clause came up, an exception is thrown.

 defp show_game_results([], limit) do IO.puts "FAIL. This is no any way to go through desk #{limit}x#{limit} from this position." end defp show_game_results([%GameState{current_pos: current_pos, path: path} | rest_way], limit) do """ SUCCESS! There are #{length(rest_way)+1} ways to go through desk #{limit}x#{limit} from this position. Here one of them: """ |> IO.puts Enum.each( path++[current_pos] , &(IO.puts "\t#{inspect &1}") ) end 


Pay attention to the second clause, it uses a split list on the head and tail, typical of FP. In this case, the purpose of such a partition is to get the first workaround from the list of found workarounds right in the function argument, saving yourself from having to do it somewhere in the function body. In Elixir, both in and in Erlang, the list can either be empty ([]), or be split into head and tail, where the tail is a list.

 lst = [1,2,3] # => [1, 2, 3] [a | rest] = lst # => [1, 2, 3] a # => 1 rest # => [2, 3] lst2 = [0 | lst] # => [0, 1, 2, 3] [first|rest] = [0] # => [0] first # => 0 rest # => [] 


Also at the end of the second clause is a higher order function. In fact, the function Enum.each / 2 here passes through all the elements of the list and applies to each element a function which it itself takes as the second argument (in this case it simply displays on the screen). Further in the text there will be several more functions from the Enum module, so that there are no questions about this at once I briefly describe how they work:

 Enum.map( lst, func/1 ) #  ,    func(el),  el -   lst Enum.filter(lst, func/1) #      lst,   func(el) == true Enum.reduce( lst, res, func/2 ) #   Enum.reduce(rest, func(el, res), func/2),  [el | rest] = lst,  Enum.reduce([], some_res, func/2) == some_res 


Now we define the missing game / 2 function. If we get an empty list of possible game states (and therefore workarounds), we have nothing left to do with it except to return. If the list is not empty, we check whether the end of the detour has reached the end, and depending on this, either we return the list of the detour paths, or continue the detour.

 defp game([], _limit) do [] end defp game(lst, limit) do case game_over?(lst, limit) do true -> lst false -> Enum.map(lst, &(generate_new_list_and_game_next(&1, limit))) |> List.flatten end end defp game_over?([%GameState{path: path}| _rest], limit) do length(path) == (limit*limit - 1) end 


In the second clause of the game / 2 function, there is a case expression. At first glance, it resembles a switch switch, but in reality, case is, by its nature, practically the same as the usual function in the Elixir language. The bottom line is:

 case (_0) do _1 -> _1 _2 -> _2 ... _n -> _n end 


The serial pattern matching of each clause is made starting with the first with the result of executing the expression 0, and if it passed successfully, the case returns the expression corresponding to this clause. If no clause came up, it should be an exception.

Next, we need to define the function generate_new_list_and_game_next / 2, which will accept the state of the game at step n, convert it to the list of game states at step n + 1 (after all, from any cell, the knight can make a move from 0 to 8 for any number of cells, depending on the state in step n), and then pass this list to the game / 2 function. To write this function, you first need to know how the knight walks. Regardless of the initial conditions, all possible movements of the knight are known to us even before the algorithm starts working (for a queen, for example, this is not true - in this case, the set of possible movements is related to the size of the board). Therefore, the work of calculating all theoretically possible moves of a horse can (and should) be placed in the compile-time. To do this, we write in a separate module the function make_permutations / 4. The first def does not contain a do-end block and is used to enter default arguments.

 defmodule Permutations do def make_permutations( input_set, perm_size, condition, result \\ []) def make_permutations( _input_set, perm_size, condition, result ) when (length(result) == perm_size) do case condition.(result) do true -> List.to_tuple(result) false -> :failed end end def make_permutations( input_set, perm_size, condition, result ) when (length(input_set)+length(result) >= perm_size) do Enum.map( input_set, fn(input_el) -> make_permutations( input_set--[input_el], perm_size, condition, result++[input_el] ) end ) |> List.flatten |> Enum.filter(&(&1 != :failed)) end end 

In the function make_permutations / 4, we simply get all permutations of elements from the set input_set of length perm_size, which satisfy the condition condition.
And in the module in which we want to use the result of specific calculations, we simply write:

 @horse_ways Permutations.make_permutations([-1, -2, 1, 2], 2, fn(lst) -> Enum.reduce(lst, 0, &(abs(&1)+&2)) == 3 end ) 


@ - module attribute designation. The expression accepted by the module attribute is calculated when the module is compiled.
Now we are ready to write the missing code. Here everything is completely trivial, and the names of functions and arguments speak for themselves.

  defp generate_new_list_and_game_next(game_state = %GameState{current_pos: current_pos}, limit) do @horse_ways |> Enum.map( &(generate_new_position(current_pos, &1)) ) |> Enum.filter(&( can_go_here?(game_state, &1, limit) )) |> Enum.map(&( go_here(game_state, &1) )) |> game(limit) end defp generate_new_position( %Position{first: first, second: second}, {delta1, delta2} ) do %Position{first: (first+delta1), second: (second+delta2)} end defp can_go_here?( %GameState{current_pos: current_pos, path: path}, prompt = %Position{first: first, second: second}, limit ) do not(prompt in [current_pos | path]) and Enum.all?([first, second], &( (&1 <= limit) and (&1 > 0) )) end defp go_here( %GameState{current_pos: current_pos, path: path}, prompt ) do %GameState{current_pos: prompt, path: path++[current_pos]} end 

Full listing:
full listing
 defmodule Permutations do def make_permutations( input_set, perm_size, condition, result \\ []) def make_permutations( _input_set, perm_size, condition, result ) when (length(result) == perm_size) do case condition.(result) do true -> List.to_tuple(result) false -> :failed end end def make_permutations( input_set, perm_size, condition, result ) when (length(input_set)+length(result) >= perm_size) do Enum.map( input_set, fn(input_el) -> make_permutations( input_set--[input_el], perm_size, condition, result++[input_el] ) end ) |> List.flatten |> Enum.filter(&(&1 != :failed)) end end defmodule Horse.Solution do ######################### ### compile-time work ### ######################### @horse_ways Permutations.make_permutations([-1, -2, 1, 2], 2, fn(lst) -> Enum.reduce(lst, 0, &(abs(&1)+&2)) == 3 end ) # structs of data defmodule Position do defstruct first: 1, second: 1 end defmodule GameState do defstruct current_pos: %Position{}, path: [] end #################### ### runtime work ### #################### def init(limit) when (is_integer(limit) and (limit > 0)) do :random.seed(:erlang.now) [ %GameState{current_pos: %Position{ first: :random.uniform(limit), second: :random.uniform(limit)} |> inform_user_about_beginning } ] |> game(limit) |> show_game_results(limit) end defp inform_user_about_beginning info do IO.puts "Hello, user, we begin from #{inspect info}" info end defp game([], _limit) do [] end defp game(lst, limit) do case game_over?(lst, limit) do true -> lst false -> Enum.map(lst, &(generate_new_list_and_game_next(&1, limit))) |> List.flatten end end defp game_over?([%GameState{path: path}| _rest], limit) do length(path) == (limit*limit - 1) end defp generate_new_list_and_game_next(game_state = %GameState{current_pos: current_pos}, limit) do @horse_ways |> Enum.map( &(generate_new_position(current_pos, &1)) ) |> Enum.filter(&( can_go_here?(game_state, &1, limit) )) |> Enum.map(&( go_here(game_state, &1) )) |> game(limit) end defp generate_new_position( %Position{first: first, second: second}, {delta1, delta2} ) do %Position{first: (first+delta1), second: (second+delta2)} end defp can_go_here?( %GameState{current_pos: current_pos, path: path}, prompt = %Position{first: first, second: second}, limit ) do not(prompt in [current_pos | path]) and Enum.all?([first, second], &( (&1 <= limit) and (&1 > 0) )) end defp go_here( %GameState{current_pos: current_pos, path: path}, prompt ) do %GameState{current_pos: prompt, path: path++[current_pos]} end defp show_game_results([], limit) do IO.puts "FAIL. This is no any way to go through desk #{limit}x#{limit} from this position." end defp show_game_results([%GameState{current_pos: current_pos, path: path} | rest_way], limit) do """ SUCCESS! There are #{length(rest_way)+1} ways to go through desk #{limit}x#{limit} from this position. Here one of them: """ |> IO.puts Enum.each( path++[current_pos] , &(IO.puts "\t#{inspect &1}") ) end end 



Let's try to run. It should turn out something like this:

image

Or such, if not lucky with the initial conditions:

image

How quickly did you get the result for the 5x5 board? And 6x6? Not very fast. As the top shows us, Erlang VM loads only one core.

image

Time for parallelization of calculations.

Parallel optimization of problem solving



Creating a new process in Erlang VM is easy. The spawn and spawn_link functions launch a function in the new process that is taken as the argument and returns the PID of the child process. The process terminates after this function returns a value. Using spawn_link is generally more preferable, since in this case, if there was an exception in the child process, it will be forwarded to the parent process, and vice versa, which makes program execution more deterministic.

 spawn_link fn() -> IO.puts "hello from #{inspect self}" end # => #PID<0.80.0> spawn_link fn() -> 109 / 0 end # => ** (EXIT from #PID<0.76.0>) an exception was raised spawn fn() -> 109 / 0 end # => #PID<0.85.0> 


The obvious place to optimize is the Enum.map/2 function anywhere in our code — all you need to do is to process each element of the list in a separate process, and then compile a new list from the obtained values. Why is this not implemented directly in the core of the language?
The first possible reason is that in the general case no one guarantees that a pure function will be passed to Enum.map. Alas, sometimes it matters in what order to do the calculations. Such "dirty" functions, the result of which (in a global sense) depends not only on arguments, but must be avoided, but if it cannot be avoided, it can at least be localized.
The second possible reason is the limited number of processes. If we allow such a hypothetical parallel-running Enum.map to be somehow called recursively, then the number of processes will start to grow like an avalanche. Erlang virtual machine in this regard is a very stable thing, and the processes themselves are very cheap. But everything in this world has a limit.
However, right now we will write our own, parallel (and correctly), working Enum.map (although the purity of the functions transferred to Enum.map will remain on the conscience of the one who will use it).

Formally, processes in Erlang VM do not have shared data, they are completely isolated, and can only send messages to each other asynchronously. The message can be any Erlang term. Messages are sent by the send / 2 function, and are received using receive-expressions that are very similar to case expressions. The difference is that if a suitable clause was not found for the receive expression (i.e. there are no messages of the expected types in the mailbox), the exception is not thrown, but the expression “freezes” until a suitable message arrives (it is also possible to specify a certain timeout ).

 defmodule Some do def receiver do receive do "hello" -> IO.puts "hello from receiver" receiver "please, die" -> IO.puts "OK ... " end end end child = spawn_link &Some.receiver/0 # => #PID<0.182.0> send child, "hello" # =>   hello from receiver send child, "some message" # =>      send child, "hello" # =>   hello from receiver send child, "please, die" # =>   OK ... 


As you can see, everything is very simple. But there is one "but" about which we have already spoken.Erlang VM can support simultaneously thousands, tens and even hundreds of thousands of processes. But this number is all the same of course, and this must be taken into account. For such accounting, we need one for the entire Erlang VM process counter. It is arranged very simply - when accepting the corresponding messages from other processes, it should be able to magnify by 1, decrease by 1, or send its value to the process that requested this value. How to store the state of an abstract object in functional languages ​​where there are no shared memory and global variables? The correct answer is recursion.

  def parallel_enum_control_system number \\ 1 do receive do %{ from: sender, query: :get_number} -> send sender, number parallel_enum_control_system number :increment -> parallel_enum_control_system number+1 :decrement -> parallel_enum_control_system number-1 some -> IO.puts "WARNING! Unexpected message in parallel_enum_control_system: #{inspect some}" parallel_enum_control_system number end end 


But it is inconvenient to transfer the PID of the process counter to each process that wants to send a message to the process counter. Especially considering that ParallelEnum.map can (and in our case will) indirectly call itself recursively. We will register its PID under some name (pseudonym) and we will send messages using an alias. In the listing below you can see what happens when you call ParallelEnum.map - if the process under the name of the process counter is not registered, start it and register it. I use if instead of case in order to emphasize that the result that this expression returns in this case is not important, and the expression is executed solely for the sake of side-effect. Note also that instead of spawn_link, spawn is used here. This is not an accident.The fact is that initially we did not make any assumptions about where, when and by whom the ParallelEnum.map/2 function will be used. If somewhere in Erlang VM 2 ParallelEnum.map/2 functions are performed simultaneously, and one of them for some reason throws an exception, then the counter-process will fall, creating obvious prerequisites for unpredicted behavior and for “healthy” process, which is simply not fortunate enough at the moment to perform the function ParallelEnum.map/2. That is why the process counter should not be linked (linked) with any other process.

  def map lst, func, limit \\ 2 do if (:erlang.whereis(@controller) == :undefined ) do :erlang.register( @controller, spawn(ParallelEnum, :parallel_enum_control_system, [1]) ) end Enum.map(lst, &(try_make_child(&1, func, limit))) |> Enum.map( &collect_results/1 ) end 

After the if-expression, in the body of the ParallelEnum.map/2 function is the classic map-reduce (for some irony, the map-map is formally here, but do not rush to conclusions). So, according to Wikipedia, this pattern consists of two steps.

Step Map


At the Map step, we check the number of running processes related to ParallelEnum according to the counter process. If the specified limit for the number of processes is not exceeded, we create a child process, transfer to it our PID (so that it can return the results to us), as well as a function with arguments that the workflow must perform. If the limit is exceeded, the parent process executes the expression itself.

A few words about why we use the IntermediateResult structure. If there were no restrictions on the number of processes, we could simply compile a list of the PIDs of the child processes and then wait for the responses from them in turn. But due to the restriction, the parent process in some cases will have to do the work itself, and this structure will help us at the reduce step to understand whether to wait for the results from the child process.

  defmodule IntermediateResult do defstruct child_pid: nil, my_own_result: nil end defp try_make_child(arg, func, limit) do case (get_number_of_processes < limit) do false -> %IntermediateResult{child_pid: nil, my_own_result: func.(arg)} # in this case do work yourself haha true -> %IntermediateResult{child_pid: spawn_link(ParallelEnum, :worker, [self, func, arg]), my_own_result: nil} end end defp get_number_of_processes do send @controller, %{ from: self, query: :get_number } receive do num when is_integer(num) -> num end end 


The working process


There is absolutely nothing complicated here: right after launch, we send an increment signal to the counter, perform the function, send the results to the spawning process, and before completing, send the decrement signal to the counter.

  def worker(daddy, func, arg) do send @controller, :increment send daddy, %{ from: self, result: func.(arg)} send @controller, :decrement end 


Reduce step


Here, too, everything is pretty transparent. The IntermediateResult structure obtained after the map step unequivocally shows what needs to be done at the reduce step — take the ready result (in case the parent process did the work itself), or wait for the message with the result from the child process.

  defp collect_results( %IntermediateResult{child_pid: nil, my_own_result: result}) do result end defp collect_results( %IntermediateResult{child_pid: pid, my_own_result: nil}) do receive do %{ from: incoming_pid, result: result} when (incoming_pid == pid) -> result end end 


Now it remains to replace the Enum.map function with the ParallelEnum.map function, for example, in the game function, and it's done. The final version of the code can be found here .

Performance tests


So, the problem “Euler's horse” is solved and the solution is optimized. Now everything works quickly and loads all processor cores to its fullest.

image

Time for tests. During the discussion of parallel optimization, almost nothing was said about what the limitation of the number of processes for this task should be. We will slightly change our functions from the Horse.Solution module in order to be able to uniquely specify the required number of processes and the initial position of the horse (for definiteness). And let's test the execution time of the Horse.Solution.init function for a 5x5 board, starting position 1.1 and different values ​​of the maximum number of processes using this function:

  def test_of_performance do Enum.each( 1..25, fn(num) -> test_of_performance_process(num) end ) Enum.each( [50, 75, 100, 150, 200, 250, 500, 1000], fn(num) -> test_of_performance_process(num) end ) # to test this, add +P 100000000 flag when starting iex Enum.each( [10000, 50000, 100000, 500000, 1000000, 5000000, 10000000], fn(num) -> test_of_performance_process(num) end ) end defp test_of_performance_process(num) do {res, _} = :timer.tc( fn() -> init(5, num, %Position{}) end ) File.write "./test_results", "#{num} #{res}\n", [:append] end 


Measurements showed a minimum of time (maximum performance) on the number of working processes 24.

image

Why so much


After all, the kernels on our local machine are much smaller. Because Erlang VM processes are not OS processes in the usual sense of the word. It does not even flow. A vitreal machine fully encapsulates them in itself, and evenly distributes the load across the processor cores (if possible).

Why so little


Erlang VM processes cost very few resources, but still cost. As the number of processes increases, the total overhead also grows; moreover, evenly distributing the load on the processor cores at around 24 processes, the virtual machine cannot “even more evenly” distribute the load - here it rests on the physical limitations of the real processor.

In our particular case, it is still worth remembering about the process counter. When a parent process decides whether to create a child process or do the work itself, it requests from the counter process the number of processes already running. Here he needs not only to send a message to the counter, but also to wait for a response from him. Process - one counter. Work processes - hundreds of thousands. I think everything is clear.
Why productivity started to grow again after marking 1,000,000 processes is a mystery to me.

Conclusion


Erlang is a unique phenomenon in the world of multi-threaded programming. In 1987, when there wasn’t even Java, but I thought I could only dream about simple multithreaded programming in C ++ - the developers at Ericsson had already written fault-tolerant programs for use in telecommunications. The Erlang language and its virtual machine were created for purely utilitarian purposes - to solve specific telecom tasks, over the years of its existence, the language evolved on real tasks (unlike, for example, Haskell). In the process of development, it was “overgrown” with a multitude of well-functioning and documented libraries, the OTP (Open Telecom Platform) framework appeared. OTP for Erlang is more than, say Qt for C ++.Remember about the send function and receive-expression? In Erlang VM, there is actually only asynchronous interprocess communication. For large projects, writing such low-level code is not very convenient, OTP has tools that encapsulate low-level details and provide a very simple, convenient and intuitive interface - there is not only asynchronous exchange, but, for example, synchronous, error handling tools / falls and more. In this article I did not use OTP just because it is a topic for a separate large article. For stability and fault tolerance, Erlang VM has no equal - remember, when we tested the performance for a different number of processes, the last value was 10,000,000! At the same time, performance, of course, fell compared to 24, but in no way fatal way. Yes, arithmetic in Erlang VM is "tug",but as they say, if you want fast arithmetic, Fortran is in your hands. With this figure I just wanted to say that for Erlang VM you can write programs in which you need to simultaneously servereally a lot of processes. As an example, a web server or something similar comes to mind. In addition, the performance of Erlang VM is very good, in principle it is soft-realtime. In general, the Erlang / Elixir + OTP YaP can become your indispensable assistant for writing stable, distributed and very reliable programs.

UPD: Lisp version



A couple of days ago I decided to join simultaneously with Lisp and JVM, and rewrote this task in the Clojure language. Actually about the elegance / readability of the code here you can argue endlessly, but the fact that the speed of work here is also quite acceptable.



The code is here .
For those who want to still understand where Elixir actually grows legs - I advise you to get acquainted with some Lisp dialect, really a lot falls into place.

Source: https://habr.com/ru/post/233637/


All Articles