Recently on Habré there was a good
article about the calculation of the N-th number of fibonacci for O (log N) arithmetic operations. The sensible question that popped up in the comments was: "why can this be useful in practice." In itself, the calculation of the N-th number of fibonacci may not be very interesting, however, the matrix approach used in the article can be used in practice for a much wider range of tasks.
In this article, we will discuss how to write an interpreter that can perform simple operations (assignment, addition, subtraction, and truncated multiplication) on a limited number of variables with nested loops with an arbitrary number of iterations in a fraction of a second (of course, if the intermediate values in the calculations remain in reasonable limits). For example, this is the code given as input to the interpreter:
loop 1000000000 loop 1000000000 loop 1000000000 a += 1 b += a end end end end
')
Immediately displays a = 10000000000000000000000000, b = 5000000000000000000000000000000000000000000000000, despite the fact that if the program were performed naively, the interpreter would have to perform an octillion operations.
I believe that the reader has an idea of what a matrix is and what a product of matrices is. In this article, we will use exclusively square matrices and rely on a very important property of multiplying square matrices — associativity.
For simplicity, we will limit our interpreter to four variables - A, B, C and D. To represent the state of the interpreter at a given moment we will use a vector of size five, the first four elements of which will contain the values of four variables, respectively, and the last will be equal to one throughout the interpreter’s work .
(A, B, C, D, 1)
At the beginning of the interpreter, we will set the values of all variables to zero.
(0, 0, 0, 0, 1)
Suppose that the first operation in the program code contains the string
A += 5
The effect of this command is that the value of variable A will increase by five, while the values of the other three variables will not change. This can be represented as the following matrix:
1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 5 0 0 0 1
If you look at it, you can see that it is almost identical to the identity matrix (which, as you know, when multiplying any vector by it, does not change its value), except for the last element in the first column, which is five. If we recall how the vector is multiplied by the matrix, we can understand that the values of all elements except the first one will not change, while the value of the first element will be equal to
v[0] * 1 + v[4] * 5
Since v [0] contains the current value in the variable A, and v [4] is always equal to one,
v[0] * 1 + v[4] * 5 = A + 5
If the vector of the current state is multiplied by this matrix, the resulting vector will correspond to the state in which A is five more, as required.
If the matrix is changed a little, removing one in the first element of the first row:
0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 5 0 0 0 1
As before, the values of all elements except the first one will not change, while the first element will become equal to v [4] * 5, or just five. Multiplying the current state vector by such a matrix is equivalent to executing the command
A = 5
Let's look at this matrix:
1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1
Its only difference from the unit matrix is the second element in the fourth row, which is equal to one. It is obvious that multiplying the current state vector by this matrix will not change the values in the first and last three elements, while the value of the second element will change to
v[1] * 1 + v[3] * 1
Since v [1] contains the current value of the variable B, and v [3] contains the current value of the variable D, multiplying the state vector by such a matrix is equivalent to executing the command B + = D
Similarly, arguing, it can be understood that multiplying the state vector by the next matrix is equivalent to executing the command C * = 7
1 0 0 0 0 0 1 0 0 0 0 0 7 0 0 0 0 0 1 0 0 0 0 0 1
Let's move on to combining commands. Let the vector v set the current state, the matrix Ma corresponds to the command A + = 5, and the matrix Mm correspond to the command A * = 7. Then, to get the vector r for the state after executing these two commands, you must first multiply v by Ma, and then Mm:
r = v * Ma * Mm
As expected, multiplying the initial state vector by these two matrices leads to a state in which a = 35:
1 0 0 0 0 7 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 5 0 0 0 1 0 0 0 0 1 0 0 0 0 1 5 0 0 0 1 35 0 0 0 1
Consider another example - let, instead of multiplying by seven, we just want to add five to A seven times.
r = v * Ma * Ma * Ma * Ma * Ma * Ma * Ma
Here comes the associative multiplication property of matrices:
r = v * Ma * Ma * Ma * Ma * Ma * Ma * Ma = v * (Ma * Ma * Ma * Ma * Ma * Ma * Ma) = v * Ma ^ 7
This is an example of a simple cycle - instead of multiplying v by Ma seven times, it is enough to raise the matrix Ma to the seventh power, after which you can perform only one multiplication. If we wanted to perform the following two operations in a loop three times:
A += 5 B -= C
(Let the operation B - = C be represented by the matrix Mbc), it would look like this:
r = v * Ma * Mbc * Ma * Mbc * Ma * Mbc = v * ((Ma * Mbc) * (Ma * Mbc) * (Ma * Mbc)) = v * (Ma * Mbc) ^ 3
We calculate the matrix corresponding to the body of the cycle only once, after which we raise it to a power.
The examples considered are enough to start working on a simple language interpreter that supports assignment, addition, subtraction, multiplication (by a constant only) and cycles. To do this, we will learn to represent any such program as an N + 1 matrix by N + 1, where N is the number of variables with which the program operates, after which we simply multiply the vector with the initial state by this matrix.
The rules for presenting a program as a matrix are very simple:
1. Each individual command is represented as a matrix, which differs from the single one element (or two for the assignment operation). Examples of such matrices are discussed above in this article.
2. Several consecutive teams are presented in the form of a matrix equal to the product of the matrix representation of each individual team.
3. The cycle is represented as a matrix representing the cycle body, raised to the power of the number of loop iterations.
If we have an identity function that returns a single matrix:
def identity(): return [[1 if i == j else 0 for j in range(REGS + 1)] for i in range(REGS + 1)]
That function that builds the matrix for the command r1 + = r2 (where r1 and r2 are variables) may look like this:
def addreg(r1, r2): ret = identity() ret[r2][r1] = 1 return ret
And for the command r + = val (r is a variable, val is a constant) like this:
def addval(r, val): ret = identity() ret[REGS][r] = val return ret
The functions for constructing matrices of other commands look similar - a unit matrix is obtained, in which one element is replaced.
The interpreter without cycles is now written very simply - let the mat matrix correspond to the code already read. At the beginning it is equal to the identity matrix, because an empty program does not change the state. Then we read the instructions one by one, divide them into three elements (left operand, operator, right operand), and depending on the operator, multiply the matrix corresponding to the whole program by the matrix corresponding to the current command:
def doit(): mat = identity() while True: line = sys.stdin.readline().lower() tokens = line.split() if tokens[0] == 'loop': # elif tokens[0] == 'end': return mat else: r1 = reg_names.index(tokens[0]) try: r2 = reg_names.index(tokens[2]) except: r2 = -1 if tokens[1] == '+=': if r2 == -1: cur = addval(r1, long(tokens[2])) else: cur = addreg(r1, r2) elif tokens[1] == '-=': .... mat = matmul(mat, cur)
It remains the case for small - add support for cycles. The loop raises the matrix of the loop body to the extent of the number of iterations of the loop. Exponentiation, as is known, requires only O (log N) operations, where N is the degree to which the matrix is raised. The exponentiation algorithm is very simple:
1. If the degree is zero, return the identity matrix.
2. If the degree is even, let be 2N, then we can recursively calculate M ^ N, and then return the square of the resulting matrix.
3. If the degree is odd, let 2N + 1, then it is sufficient to recursively calculate the value M ^ 2N, and return the resulting matrix multiplied by M.
Since every two iterations the degree is reduced by two, the complexity of this algorithm is logarithmic.
def matpow(m, p): if p == 0: return identity() elif p % 2 == 0: tmp = matpow(m, p / 2) return matmul(tmp, tmp) else: return matmul(m, matpow(m, p - 1))
In the interpreter, it now remains to add one line:
... if tokens[0] == 'loop': cur = matpow(doit(), long(tokens[1])) ...
And the interpreter is ready.
An example interpreter is available on
github . All code takes less than 100 lines.
For the speed test, you can return to the already mentioned fibonacci numbers. For example, the following code:
A = 1 B = 1 loop 100 C = A C += B A = B B = C end end
Calculate the 101st and 102nd fibonacci numbers:
A = 573147844013817084101, B = 927372692193078999176
Replacing 100 with 1,000,000 will calculate the million first and million second numbers in four seconds. Running such a program in the forehead would take much more, because the program has to operate on many thousands of digits. If you write a code that does not have to operate with large numbers, for example, the code for calculating the sum of an arithmetic progression given at the beginning of the article, the number of iterations may go beyond the reasonable, but the code will be executed in a fraction of a second
loop 1000000000000000000000000000000000000000000000 loop 1000000000000000000000000000000000000000000000 loop 1000000000000000000000000000000000000000000000 a += 1 b += a end end end end
In practice, this approach can be used, for example, in optimizing compilers, which can thus fold cycles with a large number of iterations, operating on a small number of variables.