Programmers from the University of East Anglia have developed a computer system that can recognize speech from video of a speaker's lips. The system could help in investigating crimes and other incidents, especially when the audio track is missing, damaged, or unusable because of background noise.
Video recordings made in entertainment venues, in cars, or in the cockpits of airplanes and other vehicles often have no audio track, or the audio contains too much noise. In difficult cases, investigators bring in lip-reading specialists, usually deaf people or people who work with the deaf.
Reading lips is a harder task than recognizing speech from audio, both for a person and for a computer. Some sounds, especially vowels, are easy to recognize as they are pronounced. Others (for example k, z, x) are not "visible" from the outside at all. An experienced lip reader fills the gaps by choosing words that fit and completing sentences from their meaning.
“For the time being, we are still just learning the science of recognizing visual speech and what is needed to create a reliable recognition system,” says Helen Bear, one of the creators of the system.
“Reading lips is one of the most difficult tasks for artificial intelligence, so it’s great to make progress in this area in something as complex as teaching a machine to recognize the appearance and shape of a person’s lips,” explains Professor Richard Harvey, who took part in creating the system.