⬆️ ⬇️

Troika Hoare

I have been using the Hoare logic for more than 15 years in programming and find this approach very useful and I want to share my experience. Naturally, it is not necessary to “shoot a gun at the sparrows”, but when writing fairly complex algorithms or nontrivial pieces of code, applying Hoare’s logic will save you time and allow you to add elements of some “industrial” standard during programming.



Hoar's logic contains a set of axioms for the basic constructions of an imperative programming language: empty operator, assignment operator, compound operator, branch operator, and cycle. There are many formulas in them and so that they do not cause rejection, I will first ask a small puzzle.



Binary exponentiation algorithm



Consider the recursive version. We assume that the problem has a solution if a , n are not equal to zero at the same time and n> = 0 .

I don’t like pseudo-code (I’ll probably write a separate topic why), so I’ll give an example of a Python solution:



def power(a, n): assert n > 0 or (a != 0 or n != 0) if n == 0: return 1 else: a2 = power(a, n/2) if n & 1: return a*a2*a2 else: return a2*a2 


And in C / C ++ :

')

 int power(int a, int n) { assert(n > 0 || (a != 0 || n != 0)); if (n == 0) return 1; else { int a2 = power(a, n/2); if (n & 1) return a*a2*a2; else return a2*a2; } } 


Now try writing a non-recursive version. In case of luck, try to explain to someone how it works.



The main idea of ​​Hoare is to give for each construction an imperative language pre and a post condition written in the form of a logical formula. Therefore, there is a triple in the name - a precondition , a language construction, a post - condition .





A properly working program can be written in many ways, and it will also be effective in a large number of cases. This arbitrariness and precisely it complicates programming. For this style is introduced. But this is not enough. For many programs (for example, related indirectly to human life), it is necessary to prove their correctness. It turned out that the proof of correctness makes the program more expensive by an order of magnitude (about 10 times).



Hoare actually suggested:

Let's use arbitrariness when writing programs and write them so that it is easier to prove their correctness. As a result, the program is easier to write, and the proof of correctness is immediately obtained.

These are my words.



About the proof of the correctness of the algorithm. There is a set of tools in mathematics that allows one to prove theorems. I will call well-known: the method of mathematical induction, by contradiction. So, Hoar's logic is a tool for proving the correctness of programs (more correctly partial correctness, but these are nuances), which in principle can be used to prove the correctness of the algorithm if it is written in the basic constructions of imperative programming for which there are Troika Hoare.



Full version
I included the pre and post conditions in the comments. Before a cycle, I define it through : cycle invariant and bounding function, and then I simply refer to them. Python :



 # -*- coding: utf-8 -*- #      def power(a, n): """  a   n """ assert n > 0 or (a != 0 or n != 0) r, a0, n0 = 1, a, n # r == 1 and a0 == a and n0 == n and n >= 0 # _: a0**n0 == r*a**n # _(n): n while n > 0: # n > 0 and # _(n) > 0 and # _ if n & 1: r *= a #(n -  and a0**n0*a == r*a**n) or # _ n >>= 1 # _(n) > _(n/2) and # a0**n0 == r*a**(2*n) a *= a # _ # n == 0 and # _ and # _(n) == 0 # a0**n0 == r return r if __name__ == '__main__': for i in range(14): print "2**%d" % i ,"=", power(2, i) print "2**%d" % i ,"=", power(-2, i) 


C / C ++ :

 #include <cassert> #include <cstdlib> #include <iostream> //      int power(int a, int n) { //  a   n assert(n > 0 || (a != 0 || n != 0)); int r = 1, a0 = a, n0 = n; /* r == 1 and a0 == a and n0 == n and n >= 0 _: a0**n0 == r*a**n _(n): n */ while(n > 0) { /* n > 0 and _(n) > 0 and _ */ if (n & 1) r *= a; /* (n -  and a0**n0*a == r*a**n) or _ */ n >>= 1; /* _(n) > _(n/2) and a0**n0 == r*a**(2*n) */ a *= a; // _ } /* n == 0 and _ and _(n) == 0 */ // a0**n0 == r return r; } int main(int argc, char *argv[]) { for(int i=0; i < 14; i++) { std::cout << "2**" << i << " = " << power(2, i) << std::endl; std::cout << "(-2)**" << i << " = " << power(-2, i) << std::endl; } return EXIT_SUCCESS; } 




Reasonable version
It is clear that it does not make sense to write in such detail with ordinary programming. Too cumbersome, the code is cluttered with comments and becomes almost unreadable. All these logical reasoning should be carried out in my head, leaving the most non-trivial in the form of comments. Then it will always be possible, when returning to the code, to quickly restore all the reasoning and most importantly. Another programmer who owns this approach, in the same way, reasoning and examining your code with the most important comments left, will be able to restore your reasoning and, for example, find an error in them or change the code for a new problem statement. And these are elements of an industrial approach.



The most important and non-trivial in the example with binary exponentiation was the cycle invariant. I would write as follows.



Python :



 # -*- coding: utf-8 -*- #      def power(a, n): """  a   n """ assert n > 0 or (a != 0 or n != 0) r = 1 # _: a0**n0 == r*a**n while n > 0: if n & 1: r *= a n >>= 1 a *= a return r if __name__ == '__main__': for i in range(14): print "2**%d" % i ,"=", power(2, i) print "2**%d" % i ,"=", power(-2, i) 


C / C ++ :



 #include <cassert> #include <cstdlib> #include <iostream> //      int power(int a, int n) { //  a   n assert(n > 0 || (a != 0 || n != 0)); int r = 1; // _: a0**n0 == r*a**n while(n > 0) { if (n & 1) r *= a; n >>= 1; a *= a; } return r; } int main(int argc, char *argv[]) { for(int i=0; i < 14; i++) { std::cout << "2**" << i << " = " << power(2, i) << std::endl; std::cout << "(-2)**" << i << " = " << power(-2, i) << std::endl; } return EXIT_SUCCESS; } 


a0, n0 are initial values. Inside the cycle body r, a, n are changed in such a way that after the cycle body is executed, the relation (cycle invariant) a0 ** n0 == r * a ** n is true. At the end of the cycle, a0 ** n0 == r * a ** n is true and n == 0 means a0 ** n0 == r and r contains the answer.



Writing statements in the comments is certainly a matter of agreement, but I see no reason to write somewhere else. It is better to write a statement in the comments directly in the code to which they are directly related and sometimes, if possible, it is better to replace them with assertions.



Binary search



Try yourself when solving a very similar binary search problem. Here a is a linearly ordered array, v is the element number we are looking for, l is the lower bound with which we are looking for (it comes in), r is the upper limit up to which we are looking for (it is not included). For simplicity, we assume that the first call to bSearch does not go beyond the bounds of array a . Returns -1 if the search could not be found, or the number of the element found> = 0.



Python
 def bSearch(a, v, l, r): if l >= r: return -1 else: m = (l + r)/2 assert l <= m < r if v > a[m]: return bSearch(a, v, m + 1, r) elif v < a[m]: return bSearch(a, v, l, m) else: assert v == a[m] return m 




C / C ++
 int bSearch(int *a, int v, int l, int r) { if (l >= r) return -1; else { int m=(l + r)/2; assert(l <= m && m < r); if (v > a[m]) return bSearch(a, v, m + 1, r); else if (v < a[m]) return bSearch(a, v, l, m); else { assert(v == a[m]); return m; } } } 




Write a non-recursive version with Hoare triples.



Consider a more complex case with nested loops on the example of different sorting algorithms.



Pyramidal sorting



The pyramid sorting algorithm for an array of n- elements array consists of two main steps:



  1. We build array elements in the form of a sorting tree:

    ∀ i, 0 ≤ i ∧ (2i + 1) <n ∧ array [i] ≥ array [2i + 1],

    ∀ i, 0 ≤ i ∧ (2i + 2) <n ∧ array [i] ≥ array [2i + 2] .
  2. We will remove elements from the root one at a time and rebuild the tree. Those. in the first step, exchange array [0] and array [n-1] and convert array [0], ..., array [n-2] into the sorting tree. Then we rearrange array [0] and array [n-2] and transform array [0], ..., array [n-3] into the sorting tree. The process continues until one item remains in the sorting tree. Then array [0], ..., array [n-1] is an ordered sequence.


This is the original article of the author of the algorithm.

JWJ Williams. Algorithm 232 — Heapsort, 1964, Communications of the ACM 7 (6): 347–348.

This wiki description



Python
 # -*- coding: utf-8 -*- # C a    def heapsort(a): """ C a     """ k, n = 1, len(a) # k == 1 and # _1: a[(i - 1)/2] >= a[i]  i=1..k # _1(k, n): n - k while k < n: # k < n # and _1 # and _1(k, n) > 0 leaf = k k += 1 # k <= n # and _1  i == leaf # and _1(k, n) > _1(k+1, n) # _2: _1  i == leaf # _2(leaf): (_1  0)  leaf-1 while leaf > 0: # leaf > 0 # and _2 # and _2(leaf) > 0 root = (leaf - 1)/2 if a[root] >= a[leaf]: # _2 # and _1 # and _(child) == 0 break else: a[root], a[leaf] = a[leaf], a[root] leaf = root # _2(leaf) > _2((leaf - 1)/2) # and _2 # _2 # and _2(child) == 0 # _1 # k == n # and _1 # and _1(k, n) == 0 # _3: a[(i - 1)/2] >= a[i]  i=1..k # and a[j-1] <= a[j]  j=k..n # and a[m] <= a[l]  m=0..k, l=k..n # _3(k): k - 1 while k > 0: # k > 0 # and _3 # and _3(k) > 0 # and a[0] >= a[i]  i=0..k k -= 1 a[k], a[0] = a[0], a[k] # _3  i=0 # and _3(k) > _3(k-1) root = 0 # _4: _3  i=root # _4(root, k): (_3  0)  (k-(2*root+1)) while 2*root + 1 < k: # 2*root + 1 < k # and _4 # and _4(root, k) > 0 leaf = 2*root + 1 # _4 # and leaf == 2*root + 1 if (leaf + 1) < k and a[leaf] < a[leaf + 1]: leaf += 1 # _4 # and (2*((leaf-1)/2)+2 == k # or a[leaf] == max(a[2*((leaf-1)/2)+1], a[2*((leaf-1)/2)+2])) if a[root] >= a[leaf]: # _4 # and _3 # and _4(root, k) == 0 break else: a[root], a[leaf] = a[leaf], a[root] root = leaf # _4 # and _4(root, k) > _4(2*root + 1, k) # _4 # and _4(root, k) == 0 # _3 # k == 0 # and _3 # and _3(k) == 0 # a[j-1] <= a[j]  j=0..n if __name__ == '__main__': import random a = map(lambda x: random.randint(-100, 100), range(20)) print a heapsort(a) print a 




C / C ++
 #include <cassert> #include <cstdlib> #include <ctime> #include <iostream> // C a    void sortHeap(int *a, int n) { // C a     assert(n >= 0); int k = 1; /* k == 1 and _1: a[(i - 1)/2] >= a[i]  i=1..k _1(k, n): n - k */ while(k < n) { /* k < n and _1 and _1(k, n) > 0 */ int leaf = k; ++k; /* k <= n and _1  i == leaf and _1(k, n) > _1(k+1, n) */ /* _2: _1  i == leaf _2(leaf): (_1  0)  leaf-1 */ while(leaf > 0) { /* leaf > 0 and _2 and _2(leaf) > 0 */ int root = (leaf - 1)/2; if (a[root] >= a[leaf]) /* _2 and _1 and _(child) == 0 */ break; else { int tmp = a[root]; a[root] = a[leaf]; a[leaf] = tmp; leaf = root; /* _2(leaf) > _2((leaf - 1)/2) and _2*/ } /* _2 and _2(child) == 0 */ } // _1 } /* k == n and _1 and _1(k, n) == 0 */ /* _3: a[(i - 1)/2] >= a[i]  i=1..k and a[j-1] <= a[j]  j=k..n and a[m] <= a[l]  m=0..k, l=k..n _3(k): k - 1 */ while(k > 0) { /* k > 0 and _3 and _3(k) > 0 and a[0] >= a[i]  i=0..k */ --k; int tmp = a[k]; a[k] = a[0]; a[0] = tmp; /* _3  i=0 and _3(k) > _3(k-1) */ int root = 0; /* _4: _3  i=root _4(root, k): (_3  0)  (k-(2*root+1)) */ while(2*root + 1 < k) { /* 2*root + 1 < k and _4 and _4(root, k) > 0 */ int leaf = 2*root + 1; /* _4 and leaf == 2*root + 1 */ if ((leaf + 1) < k and a[leaf] < a[leaf + 1]) leaf += 1; /* _4 and (2*((leaf-1)/2)+2 == k or a[leaf] == max(a[2*((leaf-1)/2)+1], a[2*((leaf-1)/2)+2])) */ if (a[root] >= a[leaf]) /* _4 and _3 and _4(root, k) == 0 */ break; else { int tmp = a[root]; a[root] = a[leaf]; a[leaf] = tmp; root = leaf; /* _4 and _4(root, k) > _4(2*root + 1, k) */ } /* _4 and _4(root, k) == 0 */ } // _3 } /* k == 0 and _3 and _3(k) == 0 */ // a[j-1] <= a[j]  j=0..n } int main(int argc, char *argv[]) { srand((unsigned)time(NULL)); const int n = 20; int a[n]; for(int i=0; i < 20; i++) a[i] = rand()/(RAND_MAX/200) - 100; for(int i=0; i < 20; i++) std::cout << a[i] << ' '; std::cout << std::endl; sortHeap(a, n); for(int i=0; i < 20; i++) std::cout << a[i] << ' '; std::cout << std::endl; return EXIT_SUCCESS; } 




At last



Try yourself when solving a very similar Shell sorting problem.



An array is given of n elements. The array is divided into subarrays with a step k0

{a [0], a [k0], a [2k0], ...}, {a [1], a [1 + k0], a [1 + 2k0], ...}, ..., {a [k0-1 ], a [2k0-1], a [3k0-1], ...}

and if the neighboring elements in the subarray violate the order, then they change places. Then the procedure is repeated with step k1 , etc.

The last step should be equal to 1 .



As steps you can take, for example, the following sequence

steps [16] = [1391376, 463792, 198768, 86961, 33936, 13776, 4592, 1968, 861, 336, 112, 48, 21, 7, 3, 1].



This is the original article of the author of the algorithm.

Donald Lewis Shell, A High-Speed ​​Sorting Procedure, CACM, 2 (7): 30-32, July 1959.



Literature



This is the main book in Russian

O.-J. Dahl, EW Dijkstra and CAR Hoare, Structured Programming. Academic Press, 1972. ISBN 0-12-200550-3. Translation: Dal U., Dijkstra E., Hoor K., Structured programming. M .: Mir, 1975.



This is the original article.

CAR Hoare. " An axiomatic basis for computer programming ". Communications of the ACM, 12 (10): 576–580,583 October 1969. DOI: 10.1145 / 363235.363259



S.A. Abramov Elements of programming. - M .: Science, 1982. P. 85-94.



Wikipedia Logic Hoare .



From lectures VMiK MSU on "Programming Technologies".

Source: https://habr.com/ru/post/268013/



All Articles