Search for periodic security elements of the RF Passport using the Fourier transform: part two

Many documents contain security features such as holograms, watermarks, guilloches, etc. In the process of scanning such documents there is a problem - the security elements interfere with the recognition systems (OCR). In developing Smart PassportReader, we conducted a study aimed at finding and eliminating similar security elements from document images.

In our previous article on this topic, we talked about the first half of the search problem solution — detection, i.e. determining the presence of periodic elements in the image. Today we will tell how to find the immediate position of the periodic elements in the image, provided that the detection was successful: we are sure that the elements in the image are present. The second part depends heavily on the first, so it is strongly recommended that you first get acquainted with the first, if you have not already done so.
')
Like last time, the Fourier transform will be used for this.

Image model

Like last time, many arguments will be shown for the one-dimensional case, which is then easily transferred to the two-dimensional one.

Recall that $I (x)$ - original image length $N$ consisting of two others: background image $h (x)$ and images $g (x)$ containing a periodic pattern. Picture $g (x)$ in turn, we presented a single instance of the template convolution $f (x)$ with Dirac's crest $c (x)$ :

$I (x) = h (x) + g (x) = h (x) + f (x) \ ast c (x)$

Since we know in advance the structure of the periodic pattern (in the case of the RF passport, the size and period of the holograms), to restore the position of the elements, it is enough to know only the horizontal and vertical shift of the entire template, which is the same as the shift of one of the “eagles”.

Discrete Fourier transform of a shifted periodic pattern

Shift signal pattern $g (x)$ equivalent to impulse shift $c (x)$ which leads to a modified form $\ mathcal {F} c (x)$ . The spectrum of the Fourier transform of the shifted signal changes only in phase, leaving the amplitude unchanged, thereby allowing the periodic patterns to be detected independently of their initial shift. As an important consequence, all the useful phase information is concentrated in the same DFT positions (frequencies) as the amplitudes of the pulsed signal.

Let's pretend that $c (x)$ shifted to the right by $S$ , then:

$\ mathcal {F} _k c (x - S) = \ mathcal {F} _k c (x) \ cdot e ^ {i \ Phi_k},$

$\ Phi_k = \ Phi \ cdot k, \ Phi = - \ frac {2 \ pi} {N} S$

Since the phase angle $\ mathcal {F} _k c (x)$ for unbiased $c (x)$ equals zero for offset $c (x-s)$ he becomes equal $\ Phi_k$ . To obtain a phase shift $\ mathcal {F} _k g (x)$ we need to add each member $\ Phi_k$ to the corresponding phase angle in $\ mathcal {F} _k f (x)$ for each $k$ . Note that any phase angle is always taken modulo $2 \ pi$ and is limited by the interval $[- \ pi; \ pi)$ .

Phase shift on $2 \ pi$ in the frequency domain represents the shift $N$ in the time domain but $c (x)$ has a period $T = \ frac {N} {M}$ , so we can avoid redundancy by considering only ${s = S \ text {mod} T}$ thus ensuring $0 \ le s & lt; T$ . For convenience, let $\ phi$ Is a phase shift in which $2 \ pi$ corresponds to the time shift by $s$ and also let $\ phi_m$ Is the phase angle on $m$ Amplitude peak:

$\ phi_m = \ phi \ cdot m, \ phi = - \ frac {2 \ pi} {T} s = \ Phi \ cdot M$

The figure shows an example of the DFT phase of a shifted pulsed signal with values ${N = 256}$ period $T = 32$ and shift $s = 7$ . For example, $\ phi_0 = 0$ and $\ phi_6 = (- \ frac {2 \ pi} {32} \ cdot 7) \ cdot 6 \ approx -1.963$ .

And this is how the amplitude and phase of the DFT looks like for a Gaussian grid like a chessboard:

- DFT ->

For a two-dimensional image, the phase shift $\ phi$ now consists of two components $(\ phi_x, \ phi_y)$ . We introduce a coordinate grid for the DFT of a chess-like system of peaks shown in the second and third figures above, such that the central peak has coordinates $(0, 0)$ peaks right above $(20)$ , $(0, 2)$ and so on; the nearest diagonal peak has coordinates $(eleven)$ . In general, the peak $(i, j)$ present if and only if $(i + j)$ - even.

Applying again the theorem on the DFT of a shifted signal, one can extend the shift equation and show that the phase angle $\ phi_ {i, j}$ on $(i, j)$ The peak for a two-dimensional pulse signal is:

$\ phi_ {i, j} = i \ cdot \ phi_x + j \ cdot \ phi_y$

As for the one-dimensional case, to obtain a shifted phase $\ mathcal {F} _ {i, j} g (x)$ need to add $\ phi_ {i, j}$ to the appropriate phase $\ mathcal {F} _ {i, j} f (x)$ .

Search for a periodic template

To determine the exact location of the periodic pattern, it is sufficient to estimate its phase shift. $\ phi = (\ phi_x, \ phi_y)$ since its pixel periodic structure is known in advance. Pixel shift $s = (s_x, s_y)$ , subsequently, it is not difficult to recover from the information on the phase shift.

Cutting, scaling and image preprocessing

In the first article, we described the different stages of preprocessing the original image of a passport. First, an area containing a 2x2 grid (similar to a chessboard) and containing 2 patterns horizontally and vertically is cut out of the image. This can be done (with an error of less than 5%), since the physical dimensions of the document, the size and period of the periodic patterns are fixed and known in advance. The position of the region can be set by combining the physical coordinates of the document with its coordinates in the image.

Next, a significant compression and smoothing of the image is performed to suppress the background and to level the differences between the periodic patterns. The figure below shows an image of a cut-out region of a Russian passport, the results of its preprocessing and the amplitude of its DFT.

- processing ->

- DFT ->

As expected, the DFT amplitude has noticeable peaks with a distribution similar to a chessboard.

Spectrum preprocessing

According to the image model, the original image consists of three independent signals: the background image $h (x)$ and single sample periodic pattern $f (x)$ that rolls up with a shifted pulse signal $c (x)$ to get a periodic pattern $g (x)$ :

$I (x) = h (x) + f (x) * c (x)$

After performing the DFT on $I (x)$ , we have:

$\ label {eq: immodel_dft_base} \ mathcal {F} I (x) = \ mathcal {F} h (x) + \ mathcal {F} f (x) \ cdot \ mathcal {F} c (x)$

To obtain a phase shift $\ phi$ stored in $c (x)$ from $\ mathcal {F} I (x)$ , which we calculated for this image, we need to gradually suppress the other components of the equation.

Background Spectrum Suppression

If the background $h (x)$ after preprocessing is fairly homogeneous on the documents being processed, simple calculation of the averaged spectrum is possible $\ mathcal {F} h (x)$ on images of documents on which there is no desired periodic pattern, with its subsequent subtraction from $\ mathcal {F} I (x)$ .

However, the background image $h (x)$ in our model, in fact, it is not a background in terms of the structure of a Russian passport: it contains personal data, which, by definition, differ on the set to be processed.

We have previously mentioned the fact that the DFT positions (frequencies) without peaks do not contain useful information regarding periodic patterns, since they represent the background. Assuming that $\ mathcal {F} h (x)$ is smooth, we can interpolate its contribution to each peak $\ mathcal {F} I (x, y)$ based on values adjacent to peak positions. Average value $\ overline {\ mathcal {F} I (x ', y')}$ by $(x, y)$ on the nearest neighbors peak $\ {\ mathcal {F} I (x ', y') \}$ 3x3 window is an acceptable estimate $\ mathcal {F} h (x, y)$ to subtract from $\ mathcal {F} I (x, y)$ To clear out the background:

$\ mathcal {F} I (x, y) \ leftarrow \ mathcal {F} I (x, y) - \ overline {\ mathcal {F} I (x ', y')}$

The following figure contains a compressed image and an image obtained as a result of the inverse DFT after subtracting the interpolated background spectrum and resetting the DFT at all off-peak frequencies.

- DFT, subtraction of the background spectrum, inverse DFT ---->

As expected, only the periodic pattern remained on a plain monotone background, and instances of the periodic pattern became more similar to each other. Note that the background is not black because the DFT value on $(0, 0)$ The value averaged over the image has not been reset.

Suppressing the spectrum of pattern instances

After removing the background spectrum, we have left $\ mathcal {F} f (x) \ cdot \ mathcal {F} c (x)$ . When multiplying two complex numbers, their phases are added. Then, to reach the phase angle $\ mathcal {F} c (i, j)$ at peak $(i, j)$ corresponding to the phase angle of the spectrum of a single instance of a periodic pattern $\ mathcal {F} f (i, j)$ must be deducted.

The problem is that the spectrum of the periodic pattern is usually unknown. One of the possible solutions is the experimental calculation of the phase of the presented pattern of the pattern and its subsequent subtraction. In our experiments, we used a different path, assuming that the phase contribution $\ mathcal {F} f (i, j)$ constant means to resultant shift $\ phi$ one more constant shift is added. This shift constant can be estimated experimentally as a systematic error on a test dataset.

Phase Shift Recovery

At the end of the spectrum preprocessing, we have the phase angle of the pulsed signal $c (x)$ which presumably contains all the necessary information about the phase shift ${\ phi = (\ phi_x, \ phi_y) ^ T}$ .
We can build a system of equations (each equation corresponds to $(i, j)$ peak) modulo $2 \ pi$ :

$\ begin {cases} \ dots \\ i \ cdot \ phi_x + j \ cdot \ phi_y = \ phi_ {i, j} \ pmod {2 \ pi} \\ \ dots \\ \ end {cases}$

Or, in matrix form:

$A \ phi = b \ pmod {2 \ pi}$

The left part of the system contains the “ideal” phase values corresponding to the peaks, and the right part contains the actual value of the DFT phase for this image at the peak positions.

This system is redefined, because she has $n$ equations (by the number of peaks considered) and only 2 variables: $\ phi_x$ and $\ phi_y$ . Overdetermined systems of equations can be approximately solved by the least squares (OLS) method by finding a solution that minimizes the sum of squared residuals. However, this system is a modular system. $2 \ pi$ therefore, MNCs cannot be directly applied, but we can calculate the initial solution and correct it with a sequential solution of the system.

Calculation of the initial solution

Let's look at two equations for peaks numbered $(20)$ and $(0, 2)$ :

$\ begin {cases} 2 \ cdot \ phi_x + 0 \ cdot \ phi_y = \ phi_ {2, 0} \ pmod {2 \ pi} \\ 0 \ cdot \ phi_x + 2 \ cdot \ phi_y = \ phi_ {0, 2} \ pmod {2 \ pi} \ end {cases}$

This system has 4 valid solutions: $(\ frac {\ phi_ {2, 0}} {2}, \ frac {\ phi_ {0, 2}} {2})$ , $(\ frac {\ phi_ {2, 0}} {2} + \ pi, \ frac {\ phi_ {0, 2}} {2})$ , $(\ frac {\ phi_ {2, 0}} {2}, \ frac {\ phi_ {0, 2}} {2} + \ pi)$ and $(\ frac {\ phi_ {2, 0}} {2} + \ pi, \ frac {\ phi_ {0, 2}} {2} + \ pi)$ . Because of the chessboard-like structure of peak patterns in our case, and also because $2 \ pi$ represents a shift by a unit period, the first with the fourth solution are identical, as well as the second with the third. Two potential solutions remain: $\ phi ^ * _ 1 = (\ frac {\ phi_ {2, 0}} {2}, \ frac {\ phi_ {0, 2}} {2}) ^ T$ and $\ phi ^ * _ 2 = (\ frac {\ phi_ {2, 0}} {2} + \ pi, \ frac {\ phi_ {0, 2}} {2}) ^ T$ .

To choose the right solution, we can add to the system the two remaining equations for the peaks $(eleven)$ and $(eleven)$ by making the system redefined again. Then, the value of the residual $\ delta_k$ calculated for $k$ th decision $\ phi ^ * _ k$ with all $n = 4$ equations:

$\ delta_k = \ frac {1} {n} \ sum_ {i, j} {\ left (\ begin {pmatrix} i & amp; j \ end {pmatrix} \ cdot \ phi ^ * _ k - \ phi_ {i, j } \ right) ^ 2} = \ frac {1} {n} | A \ phi ^ * _ k-b | ^ 2$

Note that intermediate calculations involving phase angles (difference calculation) are performed in $[- \ pi; \ pi)$ .

Finally, the solution $\ phi ^ * _ k$ the lowest value $\ delta_k$ selected as $\ phi ^ *$ Because the other solution will have much larger residual values. Great importance $\ delta_k$ for $\ phi ^ * _ k$ may mean false peak detection.

Solution adjustment

The initial solution found above could be good if the remaining spectrum contained only information about $c (x)$ and nothing more. Unfortunately, it is distorted by spectral components from the background and from samples of periodic patterns that were not completely removed during the preprocessing.

At this stage, the selected solution $\ phi ^ *$ is still true, but can be refined using phase information at different peaks, and not just for the first two. Subtract $A \ phi ^ *$ (using $n = 4$ equations) from both sides of the system of equations, thus moving the solution space along $\ phi ^ *$ and getting the following overdetermined system of equations for the corrective solution $\ theta$ :

$\ label {eq: correction_phase} A \ theta = b - A \ phi ^ *$

Assuming that, in the first approximation of the solution, we were no more mistaken than $\ pi$ , we can not take the system modulo $2 \ pi$ because after shifting the system, every equation will be inside $2 \ pi$ , which means that we can approximately solve the system using the least squares method. After deciding $\ theta ^ *$ found, it should be added to the previously found initial solution $\ phi ^ *$ as a correction value.

So far, we have considered equations for the first 4 peaks. $(i, j)$ having $(| i | + | j |)$ an amount of 2 (and $i \ ge 0$ because of the symmetry of the DFT), but there are other peaks with sums equal to 4, 6, and so on. The reason why they were not added earlier is that when you increase $(| i | + | j |)$ the number of suitable system solutions is also increasing, and the closer they become to each other modulo $2 \ pi$ .

To use the information about the phase shift contained in the remaining peaks, we can repeat the above process of adjusting the initial solution, replacing $\ phi ^ *$ the adjusted solution obtained for an amount equal to 4 and using all equations having an amount equal to 6 and so on, until the required accuracy is achieved. Assuming that the error decreases with increasing iteration, each equation will be within $2 \ pi$ That allows you to use OLS to solve the system.

Restoring the location of periodic patterns

Now having a phase shift $\ phi = (\ phi_x, \ phi_y)$ , we can easily convert it back to pixel shift $s = (s_x, s_y)$ . Knowing the period and size of the patterns, we can restore the position of each grid pattern. The size and position of the excised region that we used in the process are also known, so we can get the position of the patterns on the original document image.

The following picture, which you have already seen at the very beginning of the article, presents an example of finding a holographic periodic pattern on a Russian passport.

A box with a circle inside indicates the result found after solving the system of equations. The other boxes indicate the extrapolated positions of the other patterns.

Experimental Results

We used a data set of 484 images of scanned Russian passports with holographic content. For each image in the set, we prepared the true value of the pixel shift of its periodic pattern. The table below contains the experimental results for this data set with the average value of the bias residual as a measure of accuracy.
The error is given as a percentage of the side of a single pattern. The preprocessing consisted of compression to size 56x58 and morphological closure operation; when solving the system for obtaining the phase shift were considered $n = 4$ peak equations.

Background suppression	Adjustment with OLS	x error	y mistake	General error
Missing	Missing	2.13	6.40	6.75
Present	Missing	2.09	6.04	6.40
Missing	Present	2.19	5.05	5.50
Present	Present	2.20	4.77	5.25

The following figure shows the distribution $x$ and $y$ errors for the worst (left) and best (right) method. Black cells represent error by $x$ , and white - by $y$ .

In these histograms, the Y axis indicates how many percent of the images (out of 484) have an error close to the corresponding value along the X axis (as a percentage of the size of the side of the periodic pattern). It can be seen that the distribution $y$ errors for a better method are less scattered than for a worse one.

Conclusion

We have proposed a method for searching for periodic patterns on document images, which uses a discrete Fourier transform and showed an average search error in $6 \%$ (from the side of the periodic element) on a set of 484 images.
We will describe the details of our application of this method in one of the following articles.

Source: https://habr.com/ru/post/266917/

All Articles