
Researchers recover sound from the vibrations of objects in video



Sound consists of oscillations that propagate through the medium surrounding their source. These waves reach nearby objects and make them vibrate. A group of researchers at the Massachusetts Institute of Technology has managed to partially reconstruct the original sound, with some distortion, from those vibrations as captured on video.

Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham Mysore, Frédo Durand, and William Freeman filmed common objects that vibrate easily, such as a foil bag of chips, the leaves of a houseplant, the sides of a tissue box, or a glass of water, with a camera recording several thousand frames per second. Such a camera is hard to come by in everyday life, but a second technique of theirs showed that sound can also be recovered from ordinary footage shot at 60 frames per second.
The recovered sound has a relatively high signal-to-noise ratio, good enough to pick out individual words. Human speech can be made out, if only roughly, and the recordings can even be fed to music recognition services.
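
The article does not say how the signal-to-noise ratio was measured. As a rough illustration only, here is one common definition in Python, assuming the reference and recovered signals are already time-aligned and on the same scale (the function name `snr_db` is hypothetical):

```python
import math

def snr_db(reference, recovered):
    """Signal-to-noise ratio in dB, treating (recovered - reference) as noise.

    Assumes both sequences are time-aligned and equally scaled; this is an
    illustrative definition, not necessarily the one the researchers used.
    """
    if len(reference) != len(recovered):
        raise ValueError("signals must have the same length")
    signal_power = sum(x * x for x in reference)
    noise_power = sum((y - x) ** 2 for x, y in zip(reference, recovered))
    if noise_power == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(signal_power / noise_power)

# Toy example: a square-ish wave with a constant 0.1 error on every sample.
reference = [1.0, -1.0, 1.0, -1.0]
recovered = [1.1, -0.9, 1.1, -0.9]
ratio = snr_db(reference, recovered)  # about 20 dB
```

A higher value means the recovered signal deviates less from the reference.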

At 0:45 in the video above, and on the project page, you can hear the original sound and the recovered one. The researchers used “Mary Had a Little Lamb”, a song familiar to anyone interested in the history of sound recording. The high-frequency vibrations in the video are invisible to the naked eye: they amount to less than one hundredth of a pixel.
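
To get a feel for how a shift of less than a hundredth of a pixel can be measured at all, here is a deliberately simplified Python sketch. It uses a textbook gradient-based (least-squares) shift estimate on a synthetic one-dimensional row of pixels; this is a swapped-in illustration, not the researchers' algorithm, which the article does not describe. The function name and test pattern are hypothetical.

```python
import math

def estimate_shift(before, after):
    """Estimate a small 1-D shift (in pixels) between two intensity rows.

    Uses the linearisation after[i] - before[i] ~= -shift * gradient[i]
    and solves for the shift by least squares over all interior pixels.
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(before) - 1):
        gradient = (before[i + 1] - before[i - 1]) / 2.0  # central difference
        change = after[i] - before[i]
        num += change * gradient
        den += gradient * gradient
    if den == 0.0:
        raise ValueError("row has no texture to track")
    return -num / den

# Synthetic, noise-free texture shifted by 0.005 pixels (assumed values).
true_shift = 0.005
period = 16.0
before = [math.sin(2 * math.pi * x / period) for x in range(256)]
after = [math.sin(2 * math.pi * (x - true_shift) / period) for x in range(256)]
estimate = estimate_shift(before, after)  # close to 0.005
```

Pooling many pixels is what makes such a small shift measurable: no single pixel changes visibly, but the combined estimate is stable. Real footage adds sensor noise, which this noise-free sketch ignores.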

Then, at 1:50, the video compares human speech recorded by a cell phone microphone with the recovered version. This time the camera stood some distance away from the bag of chips vibrating under the sound waves, with a pane of glass in between, which made the task harder. The researchers again used the song that Thomas Edison first recorded on the phonograph.

At 2:35 the video shows that music recognition services can identify the recovered recordings; in particular, Queen's “Under Pressure” was recognized.

The results above were obtained with cameras shooting thousands of frames per second. However, the researchers also showed that artifacts of ordinary consumer video cameras, in particular the rolling shutter, can sometimes be exploited to recover sound at frequencies much higher than the frame rate of the original video.

The results of this modified technique appear at around 3:35: the researchers recovered frequencies more than five times higher than the video frame rate. The same MIDI file with the melody of the children's song was used.
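
Why a rolling shutter helps can be sketched with a little arithmetic. With a rolling shutter, the rows of a frame are captured one after another rather than all at once, so each row is a separate moment in time. The Python sketch below compares the frequency limit set by the frame rate with the rate at which rows are captured. All camera parameters (row count, line delay) are assumed values for illustration, and the function names are hypothetical; the article does not give the researchers' actual method or figures.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable by uniform samples at this rate."""
    return sample_rate_hz / 2.0

def row_capture_times(frame_rate_hz, rows, line_delay_s, n_frames):
    """Capture start time (seconds) of every row over n_frames frames."""
    frame_period = 1.0 / frame_rate_hz
    if rows * line_delay_s > frame_period:
        raise ValueError("rows cannot all be read within one frame period")
    return [frame * frame_period + row * line_delay_s
            for frame in range(n_frames)
            for row in range(rows)]

# Hypothetical consumer camera: 60 fps, 1080 rows, readout taking half
# of each frame period (all three values are assumptions).
frame_rate = 60.0
rows = 1080
line_delay = 1.0 / (frame_rate * rows * 2)

times = row_capture_times(frame_rate, rows, line_delay, n_frames=2)
frame_limit = nyquist_hz(frame_rate)  # limit if each frame is one sample
row_rate = 1.0 / line_delay           # rows captured per second during readout
```

The row rate is only an upper bound on timing resolution. Rows are not sampled during the gap between frames, and each row sees a different part of the object, which is one reason the approach works only sometimes and yields far less than the row rate in practice.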

More information and audio recordings are available on the project page. The researchers promise to publish the project code in the near future.

Source: https://habr.com/ru/post/232245/

