📜 ⬆️ ⬇️

Pure and deterministic functions

Translation of an article by Justin Etheridge (Justin Etheredge), in which the author explains the subtle difference between deterministic and pure functions.

Yesterday I was reading Matt Podwysocki’s blog (by the way, this blog is awesome, go and subscribe), and it has a post “Recursing into Recursion - Memoization” . Excellent post if you want to get acquainted with the memo. I already had a post about the generalized function of memoization some time ago, so we will not speak about memoization. What aroused my interest was at the end of Matt’s article:

Do not forget that the correct memoization requires that the function you write is clean. This means that the same results will be calculated and returned for the same input data.

')
My first thought was, “Hey, Matt, purity means that the function has no side effects, and determination means that the function always returns the same results for these parameter sets.” Then, like any good programmer, my second thought was, “Actually, maybe I'm wrong.” Therefore, I went and studied this question, and it is not surprising that I was mistaken.


Here is a description of the net function from wikipedia:
  1. The function always calculates the same result value for the same sets of arguments. The resulting value of the function cannot depend on any hidden information or state , which may change as the program runs, cannot depend on any external input from input / output devices .
  2. The result calculation should not cause any semantically observable side effects , such as mutation of mutable objects, or output to I / O devices.

But the definition of a deterministic algorithm by the National Institute of Standards and Technology (USA):

An algorithm whose behavior can be completely predicted from input data.

Therefore, pure functions are in fact a subset of deterministic functions. All pure functions are deterministic, but not all deterministic functions are pure. For example, we could have a function that takes a certain set of parameters and returns a deterministic result and also writes values ​​to disk or prints to the console. By definition, such a function will not be clean, although it will be deterministic.

For example:

Deterministic and pure:

public int Add(int val1, int val2) { return val1 + val2; } 


Deterministic but not pure:

 public int Add(int val1, int val2) { Console.WriteLine(val1 + " " + val2); return val1 + val2; } 


Deterministic functions also cannot use the external state, because it will affect their functioning. Most determinations of deterministic functions say that you can determine the result from input data.

As Matt noted in his post, for memoization it is very important that the function be deterministic. This is obvious: due to the fact that we have a lot of input data and then we cache a lot of results, we hope that the function will return the same results after each call with these input data, otherwise we will change the function behavior with our memoration.

Purity of functions is important for many reasons, but primarily because it allows us to more easily talk about the behavior of a function. If the function is clean and has no side effects, then we can more confidently talk about the performance of this function, since we will need to consider fewer variables. Also, you cannot effectively force a function with side effects to run in parallel without complex analysis of its behavior. A pure function should not depend on anything other than the values ​​passed to it. Such a function does not change any split state and therefore can be executed in parallel without using locks.

If you didn’t know what pure functions are, I hope you now have a good idea. I also hope that if you had any misconceptions about them, like mine, now everything has become clear.

It's really great to learn something new, but even better to know that your knowledge was wrong.

Source: https://habr.com/ru/post/149086/


All Articles