
Augmenting reality: an overview


Author: Igor Litvinenko, Senior Mobile Developer.

Everyone has probably heard of VR headsets, which create the effect of presence in a virtual world. Today, however, I would like to talk not about virtual reality but about augmented reality. It is important to distinguish the two. In a virtual reality headset the entire image is generated, so that reality is completely artificial. Augmented reality, by contrast, does not create a completely artificial world; it supplements a video stream of the real world with virtual objects and data. The virtual and the real world are thus combined.

Core Augmented Reality Technologies



How is augmented reality created? To augment a real object, you first need to detect it in the video stream. This is the most important part: once the object has been detected, it is not difficult to draw something over it or otherwise supplement it. There are different ways to detect the necessary objects; augmented reality markers are the main tool for this. The following sequence lists the main detection approaches in roughly evolutionary order.


Marker Picture



The simplest augmented reality marker is easily recognized by its thick black frame. Such an object is very easy to detect in a video stream.





Markerless



Despite the name, the markerless approach still uses a marker. It simply looks like an ordinary picture rather than like a marker.





Marker combination



This technology lets us take into account the simple shapes of three-dimensional objects: a cube, a cylinder, and so on. Here we can create a configuration that helps to understand what kind of object is in front of us, so an object of a certain shape with a certain color combination can serve as a marker (for example, we made an application that identifies a drug by its packaging and label). We also made an application that recognizes wine brands: the library could find labels from different angles, which does not work with markerless or simple marker technology because the label on a bottle undergoes a nonlinear transformation.

Frame marker



Let's say you are holding a conference. You have a logo that you hang on the walls to show people where to go. There is only one logo, so all the images are the same, yet you need to identify each picture uniquely. How can you do it? With a frame marker. In a frame marker, the ID of the image is encoded in the frame.





Location Based Augmented Reality



If you walk around a city and receive information about the buildings you see, reality is most likely being augmented based on your location.



In this case, there is no image recognition task. The technology relies on the GPS receiver, compass and accelerometer built into the mobile device. Thanks to them, we know where we are and in which direction we are looking. To augment reality, you just need to respond correctly to the readings of the device's sensors. This task is not very difficult, and there are plenty of libraries that cope well with it.
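To make the sensor-based approach concrete, here is a minimal sketch in Python. It assumes we already have the user's GPS position, a compass heading, and the coordinates of a point of interest. The function names, the 60-degree field of view and the 1080-pixel screen width are illustrative assumptions, not part of any particular library.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def screen_x(heading_deg, target_bearing_deg, fov_deg=60.0, screen_width=1080):
    """Horizontal pixel at which to draw a label for the target,
    or None if the target is outside the camera's field of view.
    fov_deg and screen_width are assumed values for illustration."""
    # Signed angle between where we look and where the target is, in [-180, 180).
    diff = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(diff) > fov_deg / 2:
        return None
    # Map [-fov/2, +fov/2] linearly onto [0, screen_width].
    return round((diff / fov_deg + 0.5) * screen_width)
```

A building due east of the user has a bearing of 90 degrees; if the compass also reads 90, the label lands in the middle of the screen. A real application would additionally smooth the noisy sensor readings and use the accelerometer to handle device tilt.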

Real Augmented Reality



There are no markers here. Instead, we determine on the fly the 3D shapes and characteristics of whatever objects fall into the camera lens. We need to know the depth of an object to turn the 2D image into 3D. For this you can use, for example, the SLAM (simultaneous localization and mapping) algorithm, which searches for characteristic points on surrounding objects. So far, all of this works very slowly on mobile devices. Sony is currently actively introducing this kind of augmented reality in conjunction with the PlayStation.

Keyshare - keyshare.org



And now I’ll tell you how we at DataArt wrote our own augmented reality engine and why we did it.

A Swiss startup decided to offer a new system for increasing sales, built on augmented reality technology, and we developed this system for them. Here is how it works.

We have a patented augmented reality marker in the form of a key image that can be placed, for example, in a magazine next to a product description. White dots of different sizes inside the key uniquely identify the content. A server accepts the code read from the key and returns various data about the product to the user: its 3D model and so on.



To develop such a key, we tried all the most popular libraries, but could not find one whose markers would work with every combination of dots. Marker-based augmented reality relies on key points. Our marker, however, is black and white, and its key points are concentrated in exactly the places that change from key to key. In the end, we decided to write everything from scratch.

We used the MSER search algorithm, which simply finds regions. After all, we know for sure that there is a black key and that there is a white cross inside it. So we first find a large black area, and inside it we find a white area. Then we crop the picture and check the aspect ratio, which should be 2:1. Next, we analyze the shape. Using the cross as a reference, we can find the beginning of the key sequence. As for the points, they are always in the same places, so finding them is not difficult either. As a result, we got an algorithm that searches for a marker by its shape. This is, of course, not a universal solution, but it suits our task perfectly.
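The steps described above (a large black area, a white area inside it, a 2:1 aspect ratio) can be sketched roughly as follows. This is not the DataArt engine and it does not use MSER: as a simplifying assumption it runs a plain flood fill over an already binarized image, given as a list of strings in which "#" is black and "." is white. The names `components`, `bbox` and `find_key`, and the tolerance value, are hypothetical.

```python
from collections import deque

def components(grid, value):
    """Yield 4-connected regions of cells equal to `value` as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or (r, c) in seen:
                continue
            queue, comp = deque([(r, c)]), []
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            yield comp

def bbox(comp):
    """Bounding box of a region as (top, left, bottom, right), inclusive."""
    ys = [y for y, _ in comp]
    xs = [x for _, x in comp]
    return min(ys), min(xs), max(ys), max(xs)

def find_key(grid, ratio=2.0, tolerance=0.25):
    """Return the bounding box of a candidate key, or None if nothing fits."""
    # Step 1: take the largest black region.
    black = max(components(grid, "#"), key=len, default=None)
    if black is None:
        return None
    top, left, bottom, right = bbox(black)
    width, height = right - left + 1, bottom - top + 1
    # Step 2: the cropped region must be roughly twice as wide as it is tall.
    if abs(width / height - ratio) > tolerance:
        return None
    # Step 3: look for a white region fully enclosed by the black one.
    crop = [row[left:right + 1] for row in grid[top:bottom + 1]]
    for white in components(crop, "."):
        t, l, b, r = bbox(white)
        if t > 0 and l > 0 and b < height - 1 and r < width - 1:
            return top, left, bottom, right
    return None

IMAGE = [
    "..............",
    ".##########...",
    ".###.######...",
    ".##...#####...",
    ".###.######...",
    ".##########...",
    "..............",
]
print(find_key(IMAGE))
```

On the toy image, the black rectangle with a small white cross inside is accepted, while a black square of the wrong proportions would be rejected. Reading the dots at fixed positions relative to the cross would be the next step and is left out of this sketch.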

On the iPhone 5S we achieved more than 25 FPS, which was quite difficult. First, as with any such algorithm, we downscale the picture: recognition works much better on a smaller, lower-quality image. Then we implemented a prediction algorithm: once the key has been found, we assume that it cannot move by more than a certain number of pixels between frames, so we crop the picture to that neighborhood. After that, we analyze the dynamics: if the user turns the phone to the left, the key moves to the right. This is a probabilistic algorithm: if we do not find what we are looking for right away, we start processing a larger area. We also have an excellent model rendering algorithm, written from scratch.
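The idea of searching a small predicted window first and widening it on a miss can be sketched like this. Everything here is an assumption for illustration: boxes are `(x, y, width, height)` tuples, `detect` is a hypothetical callback that runs the marker detector on a window and returns the found box or `None`, and doubling the padding is just one possible widening strategy.

```python
def search_window(last_box, frame_size, max_shift, velocity=(0, 0)):
    """Predict the region to search next: shift the last known box by the
    expected motion, pad it by max_shift pixels, and clamp to the frame."""
    x, y, w, h = last_box
    vx, vy = velocity
    frame_w, frame_h = frame_size
    left = max(0, x + vx - max_shift)
    top = max(0, y + vy - max_shift)
    right = min(frame_w, x + vx + w + max_shift)
    bottom = min(frame_h, y + vy + h + max_shift)
    return left, top, right - left, bottom - top

def track(detect, frame_size, last_box, max_shift=20, velocity=(0, 0)):
    """Try the small predicted window first; on a miss, double the padding
    until the window covers the whole frame."""
    frame_w, frame_h = frame_size
    pad = max(1, max_shift)  # avoid a zero padding that could never grow
    while True:
        window = search_window(last_box, frame_size, pad, velocity)
        found = detect(window)
        if found is not None:
            return found
        if window == (0, 0, frame_w, frame_h):
            return None  # searched the full frame, the key is not visible
        pad *= 2
```

The `velocity` argument is where the motion analysis would plug in: if the phone turns left, the expected shift of the key is to the right, so the window is moved accordingly before it is padded. Most frames then need only one cheap detection pass over a small crop.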

What else do we have? The key has three rows of 13 points each. This means that 469 combinations are possible. Since the picture is already somewhat blurred at distances of more than a meter, we wrote a probabilistic decoding algorithm with error correction and use it together with a self-correcting key. This lets us reliably detect four erroneous symbols, which is enough. We also have an optimized detection algorithm and an algorithm that tracks the key and predicts its next position.
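The article does not describe the actual code used in the key, so the following is only a generic illustration of probabilistic decoding with error correction. It assumes each dot yields a "soft" reading between 0 and 1 (how likely it is to be a large dot) and picks the valid codeword closest to those readings. The seven-bit codebook and the names `decode` and `CODEBOOK` are invented for the example.

```python
def decode(readings, codebook):
    """Pick the codeword that best explains the soft readings.

    readings: floats in [0, 1], the estimated probability that each dot is "large".
    codebook: dict mapping an ID to a tuple of 0/1 bits of the same length.
    Returns (best_id, cost), where cost is the expected number of mismatches.
    """
    def cost(bits):
        return sum(abs(bit - p) for bit, p in zip(bits, readings))

    best = min(codebook, key=lambda key_id: cost(codebook[key_id]))
    return best, cost(codebook[best])

# A toy codebook: any two codewords differ in at least four positions,
# so a single misread dot cannot turn one valid key into another.
CODEBOOK = {
    "A": (0, 0, 0, 0, 0, 0, 0),
    "B": (1, 1, 1, 0, 1, 0, 0),
    "C": (0, 0, 1, 1, 0, 1, 1),
}

# Blurred readings of key "B": the third dot leans the wrong way (0.4).
blurred = [0.9, 0.8, 0.4, 0.1, 0.7, 0.2, 0.1]
best_id, best_cost = decode(blurred, CODEBOOK)
print(best_id)
```

Thresholding each dot separately would read the third dot as small and produce a bit pattern that matches no key at all. Comparing against whole codewords still recovers "B", which is the point of combining soft readings with a code that keeps valid keys far apart.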

Although such a key is somewhat similar to a QR code, there are fundamental differences. You cannot attach augmented reality to a QR code, because its appearance changes with its content. In other words, you cannot use it as a marker: you cannot place a 3D model on it or determine its angle of rotation. In addition, our key is very easy to recognize.

Football Clubs Recognizer



We also developed an application that helps users follow their favorite football clubs. It augments the image of a football club's logo: when you point the camera at the logo, the application shows data about the club.

Source: https://habr.com/ru/post/391055/

