On the verge of augmented reality: what developers should prepare for (part 1 of 3)

Annotation

Here are the materials from my talk of the same name, which I gave at the ADD-2010 conference last fall.

After a brief look at history, the talk examines the current ¹⁾ state of augmented reality technology:

use of sensors: accelerometers, gyroscopes, magnetic compasses, GPS
(various mobile phones, Wii, Sony Move, XSens)
use of markers: infrared or in the visible range
(Sony Move, OptiTrack, Vicon)
use of structured illumination
(for example, Microsoft Surface)
Z-cameras, which provide an image plus the distance to the object
(Microsoft Kinect, its predecessors and analogues)
markerless motion capture
(OrganicMotion, iPi Soft Desktop Motion Capture)
reconstruction of 3D surfaces from a set of images
(stereo cameras, 3D scanners, reconstruction of a 3D scene from a moving camera)

Behind each of these technologies stand particular algorithms and areas of computer science. To build successful applications of this kind you need, if not a thorough understanding of them, then at least a working familiarity. For example, the talk covers the processing of gyroscope and accelerometer data in reasonable detail. Despite its apparent simplicity, even this has its difficulties and calls for some not entirely trivial mathematics. The remaining algorithms are treated superficially and in greatly simplified form.

Materials

Transcript

The video was transcribed by belonesox. ²⁾

WTF?

Hello, my name is Andrey, and I want to tell you a little about augmented reality. The talk is a bit of a mix: partly popular, partly with some serious elements, but on the whole lightweight.

Augmented reality, or AR, is quite a buzzword these days. You hear it everywhere: “Aaah, augmented reality, augmented reality”, and the question arises: “What is it?”

The most banal explanation is the Terminator: he looks at our world with his own eyes, and on top of the plain picture of that world some additional, very informative data is overlaid: “Subject Unknown”... Hmm, that is roughly the phase augmented reality is in at the moment. So far, this is often the kind of “useful” information that gets added.

So this is when you want to add some extra information to the world around you.

We are all still under the impression of the recent football championship. Augmented reality is already used on television: for example, when they suddenly draw something like the offside line, or show where the ball flew. And hockey is downright impossible to watch: the puck gets highlighted to keep you chained to the TV.

There are in fact much more interesting applications, such as the development of navigation systems: everyone is tired of staring at a map and trying to work out where and how to turn.

It is better to show an image of the street and draw the route directly on top of it, so that you can easily match what you see with what the computer is telling you.

So far, of course, it all works on top of Street View, which is already quite good, although it is not a live picture from your camera.

Here is a video example: a child brings a box up, the camera films the box, and on the screen the child sees, on top of the box, a picture of the model he is being invited to buy and assemble. Quite an interesting application.

If you google augmented reality, you will find pictures like these: there is some kind of marker, and on top of it some kind of three-dimensional image, often a rather pointless demonstration. Unlike the box with the construction set, it is not clear what use this is in real life, but right now this is mostly what everyone is playing with.

This is the approach where you have a camera, it captures an image, you analyze that image in some clever way, and based on the result you render something on top of it.

There is a different approach.

There are now a lot of mobile devices, and they are being stuffed not only with cameras, which everyone has got used to and come to terms with, but also with interesting additional things. For example, a magnetic compass and an accelerometer.

In fact, all these things help determine one important thing: how your device sits in space. Where it is located and how it is oriented.
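As a rough illustration of how a single sensor contributes to orientation, here is a minimal sketch that estimates pitch and roll from one accelerometer reading. It assumes the device is at rest, so the accelerometer measures only the reaction to gravity, and it assumes an axis convention (x to the right, y toward the top of the screen, z out of the screen) that is not taken from the talk. The function name is hypothetical. Heading cannot be recovered this way; that is what the compass is for.

```python
import math

def tilt_from_accelerometer(ax, ay, az):
    """Estimate (pitch, roll) in degrees from one accelerometer sample.

    Assumptions (not from the talk): the device is at rest, and a device
    lying flat with the screen up reads roughly (0, 0, +9.81) m/s^2.
    """
    # Roll: rotation about the x axis, from the y and z components.
    roll = math.atan2(ay, az)
    # Pitch: rotation about the y axis, from x against the y-z magnitude.
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return math.degrees(pitch), math.degrees(roll)

if __name__ == "__main__":
    # A device tilted slightly; the numbers are made up for illustration.
    pitch, roll = tilt_from_accelerometer(1.2, 0.4, 9.7)
    print(f"pitch={pitch:.1f} deg, roll={roll:.1f} deg")
```

As soon as the device moves, the reading also contains motion acceleration, so this estimate is only trustworthy when the device is held still.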

On top of that, you have a video camera and a screen on which all of this can be shown. So you look at some object through the camera. Because you can calculate exactly where your phone is pointing, and if you know the GPS coordinates of the object, you can work out which object it is and overlay some information on the picture, saying that this is such-and-such a monument or such-and-such a ruin. This is quite fashionable now: there are applications for mobile phones based on Google Maps and the like, they eat up traffic, and the mobile operators are happy.
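The geometry behind this kind of overlay can be sketched in a few lines. The example below is an illustrative assumption, not code from any of the applications mentioned: it computes the compass bearing from the phone's GPS position to an object using the standard great-circle initial bearing formula, then maps that bearing to a horizontal screen coordinate given the phone's heading. The function names, the field of view, and the linear angle-to-pixel mapping are all simplifications chosen for the sketch.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees
    clockwise from north, in the range [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def screen_x(device_heading, target_bearing, fov_deg, screen_width):
    """Horizontal pixel position of the target, or None if it lies
    outside the camera's horizontal field of view.

    Simplification: angle is mapped to pixels linearly, which is only
    approximately right for narrow fields of view.
    """
    # Signed angular offset in [-180, 180): negative means left of center.
    offset = (target_bearing - device_heading + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2:
        return None
    return (offset / fov_deg + 0.5) * screen_width

if __name__ == "__main__":
    # Hypothetical phone position, monument position, and compass heading.
    b = bearing_deg(55.7520, 37.6175, 55.7539, 37.6208)
    print(screen_x(device_heading=40.0, target_bearing=b,
                   fov_deg=60.0, screen_width=480))
```

A real application would also use pitch and distance to place the label vertically, and would have to cope with compass noise of several degrees.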

For example, some big-city mayor (“what does Luzhkov have to do with it, you might ask” ©) wants to see what some future hotel or other building will look like. He points his device at that spot in the city, and the building is rendered right there on top of the picture. That, for example, is something you could come up with and sell to the Moscow authorities for a lot of money.

This approach relies on a slightly different mechanism: you most likely do not need to know anything about the picture itself. The main thing is that it has arrived; what you analyze is how the camera is positioned and oriented relative to Mother Earth.

AR vs. VR

A reasonable question arises: how does this compare with virtual reality, which has already been around for a good twenty years?

If the “classic” of augmented reality is “The Terminator”, then the classic of virtual reality is the film “The Lawnmower Man”. It is many years old already; I remember it from my school days, and the rendering quality there is about at this level.

We all associate virtual reality with crazy-looking helmets, things like this one. But more seriously: this is an almost scientific conference, so we definitely need diagrams...

Virtual reality is mostly generated content, with a little taken from our world: gloves fitted onto the hands and so on, but the main emphasis is on generation. Augmented reality is the opposite: we take something from our real world and try to add something from the digital world to it.

But another difference is much more interesting. Say we have a computer: there are input devices and the processing of that input, and there are output devices and, accordingly, rendering and data preparation for them.

Virtual reality and all its innovations were mostly about rendering and output devices. In fact they are still going on; right now there is a surge of 3D. Augmented reality, as it turns out, is more about input: new input devices and the algorithms that process their data.

A LA HISTORY

Let's quickly run through a brief history.

All the dates here are when a technology reached the mass market. Many of these things were invented earlier, much earlier than they actually came into use.

The keyboard, the electric one that is, dates from the early seventies. The touchscreen was developed and put to use a little later, also in the early seventies; quite an ancient thing.

The next great innovation is, of course, the mouse. And the first to put it into mass production was not Apple but Xerox itself: in 1981 they released the Star mouse. Here it is on the left, and a modern one is on the right. That is, you can see that progress over this time has not gone very far.

Ten years passed, during which practically nothing happened: webcams appeared, and the mouse gained a scroll wheel that made our life on the Internet easier, which is a very significant innovation in input devices.

Ten more years passed, and now we are witnessing a boom. First, multitouch reached the consumer market with Apple's phones. Then in 2007 the WiiMote appeared, with an accelerometer stuck into it, and they tried to adapt it to games. It turned out that an accelerometer alone is not enough, so they made an add-on that plugs into the bottom and has a gyroscope built in (Wii MotionPlus).

Right now a lot of smartphones are coming out with all of this built in from the start: a gyroscope, an accelerometer, GPS, a video camera. Here, for example, is the latest Apple phone, but a bunch of Android-based phones are stuffed in exactly the same way.

And what is happening right now? Sony Move is coming out; you can already start buying it.

An interesting device. Some people think Sony Move is just a Wiimote of higher quality, but it has some fun, useful features that we will talk about. It is seriously impressive.

Here, strictly speaking, is the Sony Move presentation; let's watch a short excerpt. It is a PlayStation gadget, but even the developers themselves say they use it on a PC to move objects around in 3ds Max space.

This is a device equipped with an accelerometer, gyroscopes, and a magnetic compass, plus a glowing ball that lets its position in space be tracked. The result is very good tracking of your hand, which gives you a full-fledged user experience. Powerful stuff, though it looks simple.

And of course, the most sensational one is Microsoft Kinect. It is a sensor that measures depth, i.e. it gives not only the RGB image but also the distance to the object. It should come out in November (2010). This thing is also for games, but it works on a different principle.

Besides receiving the RGB image, it also receives a depth matrix for the scene, i.e. distances. Roughly speaking, it is a laser rangefinder, only based on other principles. Thanks to this, they managed to implement tracking of people.
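To show why a depth matrix makes it so much easier to pick a person out of a scene, here is a deliberately naive sketch: it keeps the pixels whose distance falls within a given range and finds their centroid. This is not how Kinect tracks people; it is only an assumed toy example of what per-pixel distance makes possible. The function names, the millimetre units, and the sample values are all hypothetical.

```python
def segment_by_depth(depth_mm, near_mm, far_mm):
    """Return a boolean mask that is True where the depth value
    lies within [near_mm, far_mm]."""
    return [[near_mm <= d <= far_mm for d in row] for row in depth_mm]

def mask_centroid(mask):
    """Return the (row, col) centroid of the True cells,
    or None if the mask has no True cells."""
    cells = [(r, c)
             for r, row in enumerate(mask)
             for c, hit in enumerate(row) if hit]
    if not cells:
        return None
    n = len(cells)
    return (sum(r for r, _ in cells) / n, sum(c for _, c in cells) / n)

if __name__ == "__main__":
    # Toy 3x4 depth frame: a wall at about 3 m with a nearer blob at about 1.2 m.
    frame = [
        [3000, 3000, 3000, 3000],
        [3000, 1200, 1250, 3000],
        [3000, 1180, 1220, 3000],
    ]
    mask = segment_by_depth(frame, near_mm=800, far_mm=2000)
    print(mask_centroid(mask))
```

With an ordinary RGB image the same separation would require background modelling or colour heuristics; with depth it reduces to a comparison per pixel.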

This is an interesting video, and quite informative, unlike the games they show. The games they show are completely primitive and awful: you have to splash some paint on a wall, when you could do the same with more primitive devices and algorithms ³⁾.

Strictly speaking, all this is happening now, right before our eyes. But the first true augmented reality, which again was developed earlier, came with the Apache helicopters: they went into mass production in 1984 and used a helmet with a very interesting feature. An augmented picture was projected onto one eye by means of a translucent display: some markings showing where you are aiming, some altitudes, a lot of other data. The helmet itself was equipped with gyroscopes, almost real ones, and they are still in there, and it all still flies. Thanks to this it was known where the pilot was looking, and the information was projected exactly into his field of view.

This is quite a hellish thing, because pilots complain that after using it their heads start to ache wildly, to split, and so on ⁴⁾. And you cannot complain, because you would be immediately discharged from the army as unfit for the job, while in America the army pays well. So they all put up with it; after a few months the headache goes away, but after a year or two it comes back just as wildly... One of the pilots decided to observe his own vision while using the helmet, and it turned out that his eyes were moving independently, like a chameleon's.

The thing is that the projection goes to one eye and the other eye does not see it; this makes the brain hurt, because it cannot reconcile the two pictures. After a while (the human animal is adaptable; people are not pigs, they will eat anything) they adapt and the headache seems to pass, but it all leads to consequences like these. So when I see that someone is now trying to bring such a device to the consumer market: do not buy it. I think they will fail, because it brings headaches and nothing good... Well, unless it is just for a little while, and you really want to feel like a pilot...

How does it work?

Let's move on to the most substantive part of the talk: how it actually works and how all of this can be used. Let me remind you of these two pictures; there are two different principles:

Image analysis.
Knowing the position of the camera and where it is looking. Here the image itself does not really matter to us: the main thing is to show it to the user and draw something over it.

So you can put it this way: either we use the camera as the main input device, or we use dedicated sensors.

As for what to do with the camera and how to analyze its image, there is a whole explosion of different approaches and, more broadly, research directions.

As for sensors, everything is much simpler and their set is fairly stable, so let's talk about them first.

To be continued

Notes

¹⁾ In the six months since the talk, some things have changed. The relevant notes are given in the appropriate places in the transcript.
²⁾ The transcript is given here in a somewhat abridged form. The full version is available at the link.
³⁾ The author's moderate skepticism turned out to be unjustified. Kinect entered the Guinness Book of Records as the fastest-selling gadget. And children, and even adults, are very fond of the simple games that come with it.
⁴⁾ See, for example, the account of how Apache helicopter pilots adapt to the dual-view helmet (the book by the well-known helicopter pilot Ed Macy, partly available for free online).

Source: https://habr.com/ru/post/118123/
