Everything written below has a little background. Here we decided to focus on the technical details. So, our inquisitive minds were given the following task:
- there is a piece of paper that needs to be folded in a certain way;
- for the folding hints, it is desirable to use augmented reality and overlay the hints on top of the captured images.
It was decided to place special barcodes on the sheet as markers and to return information about their content and position in the image to the drawing module, while that module itself stores the information about when and how to use this data.
One of the unspoken but implied conditions was the use of “our own” recognition module and “our own” barcodes (no, reinventing the wheel was not required; it could be assembled from parts already available, as long as it rolled). The barcodes we developed are based on Reed-Solomon coding and probably deserve a separate article.
Here we will talk about the module that recognizes these mysterious signs.
The very first step is obtaining the input data. It can come from several sources: a static image, a video file, or a webcam stream. In any case, augmented reality boils down to processing a still image (for video this is a single frame; video additionally makes it possible to take into account data collected from previous frames, to add “inertia” to the data when the picture is unstable, and so on). Without loss of generality, let us consider working with static pictures (in other words, photos):


The photos differ in brightness (this will matter at one of the stages), but they have one thing in common: a sheet with barcodes on it. The barcodes were designed specifically for this task and have a number of features that the recognition algorithm also takes into account (we will talk about them later).
First of all, the image needs to be converted to grayscale. Color carries no information for us, so we remove it:
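The conversion can be sketched roughly as follows. The article does not say which formula was used; the Rec. 601 luma weights below are an assumption, and the function name `to_grayscale` and the list-of-rows image layout are chosen only for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples (0..255) into rows of gray levels (0..255)."""
    gray = []
    for row in rgb_image:
        # Assumed weights (Rec. 601 luma); a plain average (r + g + b) / 3
        # would serve the same purpose here.
        gray.append([round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row])
    return gray
```

Because only contrast matters in the later steps, the exact weights are unlikely to be critical.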


On the resulting images we look for the boundaries of objects. To do this, for each pixel we take the maximum difference among pairs of surrounding values:

As a result, the value is close to zero in “monotonous” areas (if the image is noisy, the value differs slightly from zero), but as soon as a pixel contrasts with its surroundings, the value rises sharply. The images obtained after this operation clearly show which information becomes insignificant for us (recall that we have black-and-white barcodes printed on white paper, so they always have high contrast):
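One possible reading of this filter is sketched below. It assumes that the "pairs" are the four pairs of opposite neighbours (left/right, top/bottom and the two diagonals); the original formula is not shown in the text, so this pairing, the border clamping and the name `edge_map` are assumptions.

```python
def edge_map(gray):
    """For each pixel, return the largest absolute difference between opposite neighbours."""
    h, w = len(gray), len(gray[0])
    # Assumed pairs of opposite neighbours: horizontal, vertical, two diagonals.
    pairs = [((0, -1), (0, 1)), ((-1, 0), (1, 0)),
             ((-1, -1), (1, 1)), ((-1, 1), (1, -1))]

    def at(y, x):
        # Clamp coordinates so that border pixels reuse the nearest valid pixel.
        return gray[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[max(abs(at(y + ay, x + ax) - at(y + by, x + bx))
                 for (ay, ax), (by, bx) in pairs)
             for x in range(w)]
            for y in range(h)]
```

On a flat area every difference is zero, while a pixel next to a sharp brightness step gets a value close to the height of that step.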


On the resulting images, it would seem, we can already search for the boundaries of objects. But the borders here are not a single color; they contain several shades of gray. Therefore, we first perform threshold binarization. Remember that the brightness of the images was going to matter? This is the stage where it does. If we use a fixed binarization threshold, the filter will work incorrectly (not the way we would like) on very light and very dark photos. Therefore, we use Otsu's method (Otsu thresholding), which chooses the binarization threshold based on the image itself. This way the borders that matter to us are not erased:
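A minimal sketch of Otsu's method, assuming an 8-bit image stored as rows of integers: the threshold is the gray level that maximizes the between-class variance of the two pixel groups it separates. The function names are illustrative, not taken from the original module.

```python
def otsu_threshold(gray):
    """Return the threshold t (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray):
    """Map pixels above the Otsu threshold to 1 and the rest to 0."""
    t = otsu_threshold(gray)
    return [[1 if v > t else 0 for v in row] for row in gray]
```

Since the threshold is derived from the histogram, the same code adapts to both dark and bright photos without a hand-tuned constant.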


Now the image contains only black and white pixels, so we can start searching for objects.
There are a great many border extraction algorithms. We will use one of the simplest, the “sequential scanning algorithm”. We go through the image line by line, skipping background pixels and merging adjacent “significant” pixels (for each pixel, at the moment it is analyzed, the left, top-left, top and top-right neighbors matter). After reindexing, merging borders and other necessary manipulations, which are described in more detail in the relevant articles, we finally get a list of connected regions. All that remains is to choose the ones that are barcodes.
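The scan described above resembles the classic two-pass connected-component labelling. The sketch below assumes 8-connectivity (implied by the four neighbours listed) and uses a union-find table for the "reindexing" step; it is not the original module's code, and `label_components` is a name chosen here.

```python
def label_components(binary):
    """Two-pass labelling of 8-connected regions of non-zero pixels.

    Returns a list of regions, each a list of (y, x) coordinates.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # union-find table; index 0 is reserved for the background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # First pass: assign provisional labels and record which labels touch.
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            roots = []
            # Already visited neighbours: left, top-left, top, top-right.
            for dy, dx in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                ny, nx = y + dy, x + dx
                if ny >= 0 and 0 <= nx < w and labels[ny][nx]:
                    roots.append(find(labels[ny][nx]))
            if not roots:
                parent.append(len(parent))  # open a new label
                labels[y][x] = len(parent) - 1
            else:
                root = min(roots)
                labels[y][x] = root
                for r in roots:
                    parent[r] = root  # merge labels that turned out to touch

    # Second pass: replace provisional labels by their final representative.
    regions = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                regions.setdefault(find(labels[y][x]), []).append((y, x))
    return list(regions.values())
```

Each returned region can then be passed to the barcode checks, for example by computing its bounding box from the coordinate list.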
To do this, we carry out several checks (working with the grayscale image again):
- We discard the found objects that are smaller than a size we specify: since our barcodes contain 8 × 8 information cells, we take a minimum barcode image size of, for example, 16 by 16 pixels.
- We analyze the boundary of each object. Approximating it with a set of line segments, we keep only those objects whose boundary consists of 4 segments (the barcodes are square; because the sheet is not perpendicular to the camera they may be distorted, but they will still have 4 sides).
- One of the features of our barcodes is a continuous black border, so we keep only those objects whose inner border is darker than the outer one.
- We restore the square shape of the remaining objects. To do this, we build a matrix that maps the original (distorted) image onto a square image, resolving conflicts (missing or surplus pixels) by interpolation. For the example images, we obtain the following barcode images:


- The Otsu filter is applied again to the resulting images:


- We binarize the resulting images and convert them into a numerical matrix. The barcodes consist of 64 cells (8 × 8), so we divide each image into 64 square areas. For each cell we compute its “exact” value, black or white (if the “accuracy” is insufficient, for example 40% of the pixels are one color and 60% the other, the image is considered unsuitable for further analysis). From the results we build an 8 × 8 matrix filled with 1s and 0s.
- We analyze the resulting matrices: we look for the corner markers, rotate the matrix according to our notion of top and bottom (the barcode images above are already rotated this way), extract the useful information from it and decode it (the information is encoded with a Reed-Solomon code). At this stage, as you may have guessed, the areas that passed all the previous checks but are not actually barcodes are also eliminated. The information that remains can now be used.
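The cell sampling and rotation steps from the list above can be sketched as follows. The cut-off `min_confidence=0.7` is a hypothetical value (the text only says that a 40/60 split is not good enough), the mapping of bright pixels to 1 is an assumption, and `is_upright` stands in for the corner-marker check, whose layout the article does not describe.

```python
def cells_to_matrix(binary, cells=8, min_confidence=0.7):
    """Turn a square 0/1 image into a cells x cells matrix.

    Returns None if any cell is too ambiguous. The image side must be at
    least `cells` pixels; min_confidence is an assumed cut-off.
    """
    size = len(binary)
    matrix = []
    for cy in range(cells):
        y0, y1 = cy * size // cells, (cy + 1) * size // cells
        row = []
        for cx in range(cells):
            x0, x1 = cx * size // cells, (cx + 1) * size // cells
            pixels = [binary[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            ones = sum(pixels) / len(pixels)  # share of 1-pixels in the cell
            if max(ones, 1 - ones) < min_confidence:
                return None  # e.g. a 40/60 split: reject the whole candidate
            row.append(1 if ones >= 0.5 else 0)
        matrix.append(row)
    return matrix


def rotate_cw(matrix):
    """Rotate a square matrix by 90 degrees clockwise."""
    return [list(row) for row in zip(*matrix[::-1])]


def upright(matrix, is_upright):
    """Try all four orientations; is_upright is a caller-supplied (hypothetical) marker check."""
    for _ in range(4):
        if is_upright(matrix):
            return matrix
        matrix = rotate_cw(matrix)
    return None  # no orientation matches: not one of our barcodes
```

A candidate that yields `None` at either step is dropped, which matches the idea that false positives are still being filtered out this late in the pipeline.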