Introduction
Typically, this material is given with an abundance of formulas and is designed more for mathematicians. I will try to paint it most easily on simple numerical examples in terms of the application of this method in microelectronics at the hardware level. In numerical examples, for clarity, the value p = 11 will be used.
Formulation of the problem
Suppose that we need to perform the multiplication of the following form:
res = (a*b) mod p
, where
0 <= a < p
0 <= b < p
p
is a prime number.
mod p
is the operation of finding the modulo residue.
And it should be done at a low level, where there is no such multiplication operation and the operation of taking the remainder of the division or they are implemented quite difficult (for example, in an electronic device).
Simple solutions
- The first thing that comes to mind: multiply, then divide and take the remainder. This approach has the right to exist, but it is extremely costly in the number of operations and rather difficult to implement.
- The second thing you can think of is to implement this operation by a two-dimensional multiplication table of size
p
by p
. What makes sense if p
small, but as p grows, the cost of storing the table quadratically increases (Fig. 1).
Figure 1. The multiplication table modulo p for p = 11.The multiplier based on the index method
However, there is a method that requires one (or for the convenience of two) tables of dimension
p
. The method is based on the replacement of multiplication by addition. And it can be schematically illustrated by the following figure (fig. 2):
Figure 2. Index multiplication.')
We explain why this is possible. The index representation of a number is based on the concept of a primitive root modulo a simple
p
[1]. The primitive root w is an integer, the construction of which to a power of
0, 1, 2, …, (p-2)
gives non-repeating deductions modulo
p
. A primitive root always exists for any simple
p
(proved by Gauss in 1801). In this case, each integer
q
from the interval
(0; p)
can be associated with a number
i
such that:
q = (w
i ) mod p
. And thus get the following match:
(a*b) mod p <-> w^((ia+ib) mod (p-1))
[2].
Consider an example for the module
p = 11
. The primitive root
w
for this modulus value is 2. As it is easy to make sure that
w
raised to a power of 0, 1, ... 9, it gives non-repetitive results:
- (2 0 )
mod
11 = 1 mod
11 = 1 - (2 1 )
mod
11 = 2 mod
11 = 2 - (2 2 )
mod
11 = 4 mod
11 = 4 - (2 3 )
mod
11 = 8 mod
11 = 8 - (2 4 )
mod
11 = 16 mod
11 = 5 - (2 5 )
mod
11 = 32 mod
11 = 10 - (2 6 )
mod
11 = 64 mod
11 = 9 - (2 7 )
mod
11 = 128 mod
11 = 7 - (2 8 )
mod
11 = 256 mod
11 = 3 - (2 9 )
mod
11 = 512 mod
11 = 6
To get the conversion table between the normal
{q}
and index
{i}
representation, you must sort the resulting value pairs in ascending order. Thus, the direct conversion table for the
p
= 11 module will look as follows:
q | one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten |
i | 0 | one | eight | 2 | four | 9 | 7 | 3 | 6 | five |
And the inverse transformation table for the
p
= 11 module will look like this:
i | 0 | one | 2 | 3 | four | five | 6 | 7 | eight | 9 |
q | one | 2 | four | eight | five | ten | 9 | 7 | 3 | 6 |
Find the value of the expression (3 * 5)
mod
11. The numbers 3 and 5 have the corresponding indices 8 and 4 (see table 1). Summing these indices modulo (11-1) = 10, we get the result (8 + 4)
mod
10 = 12
mod
10 = 2. From table 2 we find that the inverse transformation for index 2 gives the final result equal to 4.
The structural diagram of the index multiplier modulo m = 11 for the considered example can be viewed in the following figure (Fig. 3):
Figure 3. Scheme of the index multiplier for p = 11.Zero values for inputs
If you look closely at the tables, you can see that there are no zero values for the input data. This is due to the fact that
w
i ! = 0 for no value of
i
. This case is processed separately (or the concept of a singularity is introduced with special rules for its processing). If 0 appears on one of the multiplier inputs, then the output will also be 0, which follows directly from the multiplication rules.
Paralleling multiplier
It turns out multiplication can be done even faster. If the number
(p-1)
can be divided into mutually simple factors
p-1 = m1*m2*…*mr
, then the addition operation can be divided into
r
addition operations of a smaller dimension. In this case, the index is converted into a vector of length
r
, according to the following formula: (
i mod m1, i mod m2, …, i mod mr
). And summation is performed independently for each element of the vector.
Consider this by example. For
p
= 11, the value of
p
-1 = 10 and it can be split into mutually simple factors uniquely: 10 = 2 * 5 (
m1
= 2,
m2
= 5). In this case, table 1 may be listed as follows:
q | one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten |
i | 0 | one | eight | 2 | four | 9 | 7 | 3 | 6 | five |
(i mod 2, i mod 5) | (0, 0) | (eleven) | (0, 3) | (0, 2) | (0, 4) | (14) | (12) | (13) | (0, 1) | (ten) |
And the reverse conversion table, respectively, as follows:
(i mod 2, i mod 5) | (0, 0) | (0, 1) | (0, 2) | (0, 3) | (0, 4) | (ten) | (eleven) | (12) | (13) | (14) |
q | one | 9 | four | 3 | five | ten | 2 | 7 | eight | 6 |
We will find, as in the previous example, the value (3 * 5)
mod
11. First, we look for the corresponding vectors in the table: 3 -> (0, 3), 5 -> (0, 4). Then we add elementwise ((0 + 0)
mod
2, (3 + 4)
mod
5) = (0, 2). From the inverse transformation table, we find the answer: (0, 2) -> 4. The parallel multiplier circuit is shown below (Fig. 4):
Figure 4. Scheme of parallel index multiplier for p = 11.How to search for a primitive root?
If you honestly did not ask this question. I use brute force starting from 2 to p. Or, you can use a ready-made sequence:
oeis.org/A046145 . If there is a more effective method, write in the comments.
How to design a modulo adder (p-1)?
Due to the peculiarities of the input data of the modulo adder
(p-1)
, namely, that both inputs come in a number less than
p-1
, which means their sum is less than
2*(p-1)
. From this it follows that you can use any of the standard adders, the output of which is adjusted according to the following algorithm: if the value is greater than or equal to
(p-1)
, then subtract from the result
(p-1)
, otherwise leave it unchanged.
Verilog generator
At leisure, I wrote an online Verilog generator to implement an index multiplier modulo. A drawing of his work is also drawn there.
Verilog for multiplication modulo 11Verilog for multiplication modulo 31Verilog generator for any number up to 1000Literature
[1]
en.wikipedia.org/wiki/%D0%9F%D0%B5%D1%80%D0%B2%D0%BE%D0%BE%D0%B1%D1%80%D0B0%D0% B7% D0% BD% D1% 8B% D0% B9_% D0% BA% D0% BE% D1% 80% D0% B5% D0% BD% D1% 8C_% 28% D1% 82% D0% B5% D0% BE% D1% 80% D0% B8% D1% 8F_% D1% 87% D0% B8% D1% 81% D0% B5% D0% BB% 29[2]
www.researchgate.net/publication/224735018_A_fast_RNS_Galois_field_multiplierFrom the author
If you have any additions to the article I will be glad to see them in the comments. And another article turned out to be poor in reference with the details, if anyone has something to throw. =)