
How we animate presentations



Our company develops systems for video recording, broadcasting, and webinars. The project began in an ordinary apartment with the filming of the educational project Skill Up. It was a non-profit project, so everything had to be done in-house: the speaker often had to prepare, shoot, and edit the video on their own. This beginning largely determined the direction of development and the features of our system.

After a brief search, we decided to assemble a glass board and write on it with markers.


And it was a good decision. The picture looked natural: teachers are used to writing on a blackboard, and while writing, the speaker looks either at the viewer or at what they are writing.

But people write slowly, draw even more slowly, and draw beautifully slower still. So we decided to add ready-made pictures, and we tried several approaches:

  1. We showed pictures and text on a teleprompter. The speaker constantly looks into the camera, and the objects are simply overlaid during editing.
  2. We displayed translucent images on top of the video and projected them onto the wall in front of the speaker. The speaker then looks at the elements they are talking about, but the direction of their gaze is quite different from what the viewer expects.
  3. We displayed translucent images on top of the video, but placed labels on the glass where the elements should appear. Now the lecturer looks in the right direction, but labels have to be placed for every new object, and the lecturer has to remember what each label stands for.
  4. The speaker waved their hands over the place where an object was expected to appear, and the objects were then substituted during editing. This option turned out to be extremely laborious, both in post-processing and during filming.

Then we conducted a market analysis, which showed that most people use either a "talking head" alongside a presentation or expensive methods (for example, chroma key shooting). So we started developing our own software product. We wanted the speaker to simply download our program, create a presentation in it, shoot in the studio, then trim the beginning and end of the good takes with built-in Windows or YouTube tools and get the finished video. Later, by removing the bottlenecks, we were able to broadcast live without losing quality.

Let's look at how everything works:


The video camera films the lecturer. Video and sound go to a local server, where the presentation is overlaid. The resulting video stream is saved to disk and sent to the broadcast server to which viewers connect. The same video stream is shown on a screen located in front of the speaker, so the speaker sees themselves and the elements that surround them.
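To make the data flow concrete, here is a minimal sketch of such a capture-overlay-record loop using OpenCV. The capture device, file names, and the overlay function are illustrative assumptions, and the real server also pushes the composited stream to the broadcast server, which is omitted here.

```python
# Sketch of the capture -> overlay -> record/preview loop.
# Assumptions: the camera is available as an OpenCV device, the slide is a
# pre-rendered RGBA image, and recording goes to a local file. Sending the
# composited stream to the broadcast server is not shown.
import cv2
import numpy as np

def overlay_slide(frame, slide_rgba):
    """Alpha-blend a pre-rendered slide (RGBA) over the camera frame."""
    slide_rgba = cv2.resize(slide_rgba, (frame.shape[1], frame.shape[0]))
    alpha = slide_rgba[:, :, 3:4].astype(np.float32) / 255.0
    blended = frame.astype(np.float32) * (1 - alpha) + \
              slide_rgba[:, :, :3].astype(np.float32) * alpha
    return blended.astype(np.uint8)

cam = cv2.VideoCapture(0)                               # main camera or capture card
slide = cv2.imread("slide.png", cv2.IMREAD_UNCHANGED)   # illustrative RGBA slide
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
recorder = cv2.VideoWriter("lecture.mp4", fourcc, 30.0, (1920, 1080))

while True:
    ok, frame = cam.read()
    if not ok:
        break
    frame = cv2.resize(frame, (1920, 1080))
    composited = overlay_slide(frame, slide)
    recorder.write(composited)                  # saved to disk
    cv2.imshow("speaker monitor", composited)   # what the speaker sees in front of them
    if cv2.waitKey(1) == 27:                    # Esc to stop
        break

cam.release()
recorder.release()
cv2.destroyAllWindows()
```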

The picture projected onto the screens is distorted so that the viewer gets the illusion that the speaker is looking at elements located on a plane in front of them. The image on the screens looks like this:



When the speaker starts to move, the video on the screens must be distorted so that their eyes still appear to look at the objects on the slide. Below you can see how the image sent to the projectors is distorted as a person moves from left to right.





You can see that the images on the left and right projectors are distorted in different ways. This is because the projectors are located at different angles and at different distances from the screens.
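This pre-distortion can be modeled as a perspective (homography) warp whose target quadrilateral depends both on the projector's position and on the tracked position of the speaker. Below is a minimal sketch assuming OpenCV; the corner coordinates and the way the quad follows the head are purely illustrative, not our studio's real calibration.

```python
# Sketch: pre-distort the composited frame for one projector with a
# perspective warp. Each projector gets its own destination quad (they sit
# at different angles and distances), and the quad is adjusted as the
# tracked head position changes. All coordinates are illustrative.
import cv2
import numpy as np

def warp_for_projector(frame, dst_quad):
    """dst_quad: four (x, y) corners, in projector pixels, where the
    frame's corners should land for this particular projector."""
    h, w = frame.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    H = cv2.getPerspectiveTransform(src, np.float32(dst_quad))
    return cv2.warpPerspective(frame, H, (w, h))

def quad_for_head(base_quad, head_x, strength=200.0):
    """Toy model: shift the top edge of the quad as the speaker
    (head_x in [0, 1], left to right) moves across the studio."""
    shift = strength * (head_x - 0.5)
    tl, tr, br, bl = base_quad
    return [(tl[0] + shift, tl[1]), (tr[0] + shift, tr[1]), br, bl]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)              # composited frame
left_base  = [(80, 40), (1880, 10), (1920, 1080), (0, 1040)]   # illustrative geometry
right_base = [(40, 10), (1840, 40), (1920, 1040), (0, 1080)]   # illustrative geometry

left_out  = warp_for_projector(frame, quad_for_head(left_base,  head_x=0.3))
right_out = warp_for_projector(frame, quad_for_head(right_base, head_x=0.3))
```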

The system can work with screens of various shapes. For example, we have a studio in which the projection goes directly onto the walls, which is convenient for small rooms:



Now the speaker can look at an object, point to elements with their hands, and write with a marker during the shoot. But we wanted more ways to interact, and our choice fell on Kinect v2. Now we have the coordinates of the head and hands and information about three gestures: "fist", "palm" and "lasso". Next, the coordinates from the Kinect had to be converted into the coordinates of the main camera, and here OpenCV and the Habr article habrahabr.ru/post/272629 came to the rescue. Many thanks to the author for the sensible article. We would like to share one piece of experience: use a large chessboard. We now use a 6x9-cell board of A0 size, which increases the accuracy of the conversion and significantly speeds up the convergence of the algorithm.
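For readers who want to reproduce the mapping, here is a simplified 2D sketch of the idea, assuming the board is seen simultaneously by the Kinect color camera and the main camera. Note that a 6x9-cell board corresponds to 8x5 inner corners, which is what the OpenCV detector expects. The file names and the homography-based mapping are a simplification of the approach in the linked article, not our production code.

```python
# Sketch: map points from the Kinect color image to the main camera image
# by detecting the same chessboard in both views and fitting a homography.
# Variable names and the capture setup are illustrative.
import cv2
import numpy as np

PATTERN = (8, 5)  # inner corners of a 6x9-cell board (one less than cells per side)

def find_corners(gray):
    ok, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not ok:
        return None
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
    return cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)

# Simultaneous grayscale snapshots of the same (large, A0-sized) board.
kinect_frame = cv2.imread("kinect_view.png", cv2.IMREAD_GRAYSCALE)   # illustrative
camera_frame = cv2.imread("camera_view.png", cv2.IMREAD_GRAYSCALE)   # illustrative

kinect_pts = find_corners(kinect_frame)
camera_pts = find_corners(camera_frame)

if kinect_pts is not None and camera_pts is not None:
    # Homography that maps Kinect image coordinates to main-camera coordinates.
    H, _ = cv2.findHomography(kinect_pts, camera_pts, cv2.RANSAC)

    def kinect_to_camera(pt, H=H):
        """Convert one (x, y) point from Kinect image space to camera space."""
        src = np.float32([[pt]])                       # shape (1, 1, 2)
        return cv2.perspectiveTransform(src, H)[0, 0]  # (x', y')
```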

First, simple features were added: drawing and erasing by hand, moving objects, hiding them or making them visible. Then active elements were created: a "map" that can be panned and zoomed, a "text" block with vertical scrolling, and a "browser" that can be used almost like an ordinary touch interface.
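As a rough illustration of how such gestures can drive the drawing tools, here is a minimal sketch. It assumes the hand position has already been converted into slide coordinates, and the SlideCanvas class and its behaviour are illustrative, not the actual product logic.

```python
# Sketch: dispatching Kinect hand gestures to slide interactions.
# Gesture names ("fist", "palm", "lasso") follow the ones listed above;
# the canvas behaviour here is an assumption for illustration.
class SlideCanvas:
    def __init__(self):
        self.strokes = []          # list of drawn polylines
        self.current = None        # stroke being drawn right now

    def handle(self, gesture, hand_xy):
        if gesture == "fist":                     # fist: draw while moving
            if self.current is None:
                self.current = [hand_xy]
                self.strokes.append(self.current)
            else:
                self.current.append(hand_xy)
        elif gesture == "palm":                   # open palm: stop drawing
            self.current = None
        elif gesture == "lasso":                  # lasso: erase strokes near the hand
            self.current = None
            self.strokes = [s for s in self.strokes
                            if not self._near(s, hand_xy)]

    @staticmethod
    def _near(stroke, hand_xy, radius=40):
        hx, hy = hand_xy
        return any((x - hx) ** 2 + (y - hy) ** 2 < radius ** 2 for x, y in stroke)

canvas = SlideCanvas()
canvas.handle("fist", (100, 200))   # start a stroke
canvas.handle("fist", (110, 210))   # extend it
canvas.handle("palm", (110, 210))   # finish it
```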



And here is what we got:


And you can add a glass board:


Before active elements appeared, users created presentations in PowerPoint and imported them into Jalinga Studio using a built-in mechanism. But these changes required us to create a fully fledged built-in presentation editor.



An important feature of the system is the ability to manage the shooting process independently. Using a joystick and hand gestures, the speaker controls every stage of shooting or broadcasting.



The speaker has plenty of control over the shooting process: starting and stopping video recording, marking successful takes and moving on to the next scene, drawing, erasing, interacting with active elements, managing feedback, and so on.
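To give an idea of the control flow, here is a small sketch of the states the speaker moves through during a shoot. The command names and the ShootingSession class are assumptions based on the description above, not the real product API.

```python
# Sketch of the speaker-side control loop: start/stop recording,
# mark a successful take, move to the next scene. Command names and
# the session interface are illustrative.
class ShootingSession:
    def __init__(self, scenes):
        self.scenes = scenes
        self.scene_index = 0
        self.recording = False
        self.good_takes = []       # scenes the speaker marked as successful

    def command(self, cmd):
        if cmd == "start" and not self.recording:
            self.recording = True
            print(f"recording scene {self.scenes[self.scene_index]!r}")
        elif cmd == "stop" and self.recording:
            self.recording = False
        elif cmd == "mark_take":
            self.good_takes.append(self.scenes[self.scene_index])
        elif cmd == "next_scene":
            self.recording = False
            self.scene_index = min(self.scene_index + 1, len(self.scenes) - 1)

session = ShootingSession(["intro", "demo", "summary"])
for cmd in ["start", "mark_take", "next_scene", "start", "stop"]:
    session.command(cmd)
print(session.good_takes)          # -> ['intro']
```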

We will discuss how the software side is arranged in more detail in the following articles. For now, we will only mention that the system does not require a lot of resources, and a computer with the following configuration can serve as the server: i5-4460, DDR3 8 GB (2x4 GB), GeForce GTX 750 Ti, AVerMedia Live Gamer HD.

So, in general terms, we have tried to bring the presentations everyone is used to back to life. In the near future, we plan to focus on new ways of interacting with the audience during broadcasts and webinars. For now, we will be glad to see your comments and questions.

Source: https://habr.com/ru/post/317082/

