
How we did the multitouch table

Hi Habr.
Being involved in computer vision, I became interested in natural interfaces and talked with people who design touch tables for bars. At some point I had the idea to build my own: cheap and cheerful, but, most importantly, working — the point was to be able to test and experiment quickly. Then my friend Alexander Zhedelev, a music producer at the Russian Drama Theater of Estonia, suggested making a new kind of musical instrument for a performance at the Tallinn Music Week festival. There was little time, so we got started.

In general, there are several approaches to building this kind of table. I will describe three of them.
The general principle of such systems is as follows. The image produced by the projector is displayed on a sheet of glass behind which there is a diffusing coating, for which you can even use baking paper. A touch reflects part of the radiation back down, so you can put a camera under the glass and detect touches. In the figure below you can also see that, in addition to the projector, the hand is illuminated from below by infrared sources (this is one of the ways to build such systems).
[Diagram: projector image on the glass, IR illumination from below, and a camera under the table]
Since we need to capture only the touch gestures, not the picture from the projector, we need a way to keep one from interfering with the other. For this, cameras configured to capture only the infrared image are used. An ordinary camera sensor responds to both the visible and the infrared parts of the spectrum. In the video below you can see how the camera of a Sony Xperia phone responds to the infrared beam of a Sharp IR sensor: the eye does not see this beam, while to us the operating sensor simply looks dark.

Typically, a camera has a filter that cuts off infrared and passes only the visible spectrum. For our purposes we need to do the opposite: take the lens apart, remove the filter that passes only visible light, and install one that passes only infrared, since that is the part of the spectrum we will use to recognize gestures. Almost any webcam will do. I took the good old PS3 Eye, because it gives the best result for the price. Removing the filter is not difficult either. Like this.
To cut off visible light, reassemble the camera and put a filter that passes only infrared in front of the lens. It can be bought at a radio parts store and looks like dark red glass. Now we have a camera that sees IR radiation. You can test it right away: the image from the projector should look like a uniform gray background.
Now about the approaches themselves.

The first approach is expensive but quite reliable: mount a chain of infrared LEDs along the sides of the table frame, right under the glass, so that they illuminate the hands. Pros: a very clear image of the finger reflections, thanks to the proximity of the emitting LEDs. Cons: extra effort to mount the diodes and wire their power supply. As a rule, this scheme is used in commercial products.
[Diagram: IR LEDs mounted along the frame under the glass]

The second approach is to place additional infrared sources under the glass, next to the projector, as in the figure at the very beginning. I tried this approach and found several flaws. The radiation must be well diffused, otherwise some areas are illuminated more than others, and there is the problem of glare. I tried a bunch of IR LEDs, but they gave very strong directional radiation, which produced glare. No diffusing filters helped: the glass partially reflected the flux, and the camera saw a permanently present bright spot. All in all, it is an unfortunate approach, and I would not recommend it; uniform lighting is hard to achieve this way.
The third approach, the one I used, is the simplest and most effective. In fact, we do not need separate illumination at all: the projector lamp emits a fairly wide spectrum, including infrared. It is enough to look at the projection through our modified camera, and we see a uniform gray background. Bingo.

The frame itself was assembled from wood, with plexiglass mounted on top. The camera sat directly under the glass.
This is the AudioKinetica plan. Below you can see the table layout drawn with a marker.
[Photo: the table layout drawn with a marker]

And here is the table itself.


To recognize the blobs (the reflections that the camera sees), I used Community Core Vision (CCV). This is a ready-made solution that handles the recognition and transmits the results via the TUIO protocol. TUIO is an open framework that defines a protocol and API for building multitouch interfaces. In principle, if there had been more time, I would have written my own blob detector on OpenCV, since there is nothing particularly difficult about it for this task. The spots are clearly visible, and in OpenCV it would look like this: grab a frame, remove the noise, build a binary image, run it through the Canny algorithm, find the contours, and then translate their coordinates into TUIO objects according to the specification. A calibration module would also be needed to map the coordinates. CCV subtracts a background image from the current frame; there is also an adaptive mode that takes slow background changes into account. In OpenCV this can be implemented using the codebook method and connected components.
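For illustration, here is a minimal sketch of that pipeline in Python with OpenCV — not the code we actually ran (CCV did all of this, plus calibration and background adaptation, out of the box); the camera index, threshold values and the normalization step are assumptions:

```python
import cv2

cap = cv2.VideoCapture(0)                              # the IR-modified camera
ok, frame = cap.read()
background = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # reference background frame

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, background)               # background subtraction
    blur = cv2.GaussianBlur(diff, (5, 5), 0)           # remove noise
    _, binary = cv2.threshold(blur, 40, 255, cv2.THRESH_BINARY)   # binary map of bright spots
    edges = cv2.Canny(binary, 50, 150)                 # edge map
    # OpenCV 4.x: findContours returns (contours, hierarchy)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    h, w = gray.shape
    for c in contours:
        x, y, cw, ch = cv2.boundingRect(c)
        # Normalize the blob center to 0..1, the range TUIO cursors use;
        # a real calibration step would also correct camera/projector misalignment.
        cx, cy = (x + cw / 2) / w, (y + ch / 2) / h
        # ...here the coordinates would be packed into TUIO messages...
```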
Now we have a system that emits TUIO objects, and we can use anything that accepts them, or write a client of our own. In Java, for example, this is quite easy to do; there are many examples on the net.
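Since TUIO rides on top of OSC (UDP port 3333 by default), even a plain OSC listener already shows the cursor messages. Here is a hedged sketch in Python with the python-osc package (the article mentions Java; the idea is the same, and the handler logic here is only an illustration):

```python
# Requires: pip install python-osc
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

def on_2dcur(address, *args):
    # CCV sends "alive", "set" and "fseq" messages on the /tuio/2Dcur profile.
    # A "set" message carries: session id, x, y, velocity and acceleration values.
    if args and args[0] == "set":
        session_id, x, y = args[1], args[2], args[3]
        print(f"cursor {session_id}: x={x:.3f} y={y:.3f}")

dispatcher = Dispatcher()
dispatcher.map("/tuio/2Dcur", on_2dcur)

server = BlockingOSCUDPServer(("0.0.0.0", 3333), dispatcher)
server.serve_forever()
```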

CCV settings.
[Screenshot: CCV settings panel]

Next, since the table was meant to control sound synthesis, we used the TUIO module for Ableton, which lets you bind gestures to sound-generation parameters, instruments, and so on. After that, Alexander Zhedelev worked on setting the key, combining it with other recordings, and generally experimented as he pleased. The video at the end shows roughly how it all works.


And here is the edited version. Best watched with headphones.


As you can see in the video, we were already playing around: we put paper on the glass and drew on it, and got a drawing that plays.

One observation: you do not actually have to touch the glass. Even at some distance from it a reflection occurs, and the camera picks it up. There is room to play with the settings, and as a result you can build an air-touch screen.
There is one drawback related to the TUIO protocol: its objects need to be relayed. That is, if I want to run beautiful visuals and sound synthesis in parallel, I need a repeater, because once the interface module consumes a TUIO object, the Flash visuals, say, can no longer see it.
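Such a repeater can be sketched as an OSC proxy that re-sends every incoming message to several consumers. A minimal Python sketch with python-osc; the consumer addresses and ports are made up for the example:

```python
# Requires: pip install python-osc
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer
from pythonosc.udp_client import SimpleUDPClient

consumers = [
    SimpleUDPClient("127.0.0.1", 3334),  # e.g. the sound-synthesis client
    SimpleUDPClient("127.0.0.1", 3335),  # e.g. the visuals client
]

def forward(address, *args):
    # python-osc unpacks incoming OSC bundles and calls this once per message,
    # so the TUIO alive/set/fseq stream is forwarded message by message.
    for client in consumers:
        client.send_message(address, list(args))

dispatcher = Dispatcher()
dispatcher.set_default_handler(forward)  # catch all TUIO profiles (/tuio/2Dcur, ...)

BlockingOSCUDPServer(("0.0.0.0", 3333), dispatcher).serve_forever()
```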

I want to thank everyone who participated, especially Alexander Zhedelev, Sergey Dragunov and Krista Koester.
Audiokinetica

Source: https://habr.com/ru/post/249399/

