Adding synchronized transcription text to HTML5 video

The modern web is hard to imagine without video, but the speech in a video is not always easy to follow, for example for users with hearing impairments or for people who have trouble understanding spoken language by ear. In such cases HTML5 helps keep the content accessible: it lets you attach subtitles with a transcription, that is, a text record of the speech, to media files.

The transcription is stored in a VTT document, which is attached to the video with a <track> element inside the <video> tag.

<video class="span12 readable" poster="avas.jpg" controls tabindex="0" title="">
  <source src="avas.m4v" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"' />
  <source src="avas.ogg" type='application/ogg' />
  <source src="avas.webm" type='video/webm' />
  <track src="avas-transcript.vtt" label=" " kind="subtitles" srclang="ru" default />
</video>

A VTT file is a specially formatted text document. It consists of numbered cues, each with a start and end time and the text itself. It is recommended that the transcription include not only the speech but also the name of the speaker. The text in a VTT file can be formatted with a few basic tags, such as <i> or <b>.
The document begins with the WEBVTT declaration:

WEBVTT

1
00:00:02.000 --> 00:00:07.000
<i>  :</i>   ?

2
00:00:09.000 --> 00:00:11.000
<i>  :</i> .

3
00:00:13.000 --> 00:00:18.000
<i>:</i>    ?

4
00:00:20.000 --> 00:00:21.000
<i>:</i> .

5
00:00:22.000 --> 00:00:27.000
<i>:</i>   ,  ?

6
00:00:29.000 --> 00:00:30.000
<i>:</i> !

7
00:00:31.000 --> 00:00:34.000
<i>:</i>   ,  ?!

8
00:00:37.000 --> 00:00:38.000
<i>:</i> !!!
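Writing such a file by hand gets tedious for longer videos, so the cue layout can also be generated from a script. The following is a minimal sketch in Python, not part of the original article: the function names `format_timestamp` and `build_vtt` and the sample cues are hypothetical, and the sketch assumes the cue text contains no characters that need escaping in WebVTT (such as `&` or `<`).

```python
def format_timestamp(seconds):
    """Render a number of seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_vtt(cues):
    """Build a WebVTT document from (start, end, speaker, text) tuples."""
    blocks = ["WEBVTT"]
    for number, (start, end, speaker, text) in enumerate(cues, start=1):
        if end <= start:
            raise ValueError(f"cue {number} must end after it starts")
        blocks.append(
            f"{number}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"<i>{speaker}:</i> {text}"
        )
    # A blank line separates the header and each cue from the next one.
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample data: (start seconds, end seconds, speaker, speech).
cues = [
    (2.0, 7.0, "Speaker 1", "First line of speech?"),
    (9.0, 11.0, "Speaker 2", "Reply."),
]
vtt_text = build_vtt(cues)
print(vtt_text)
```

The result can be saved as a UTF-8 text file, for example under the name used in the `src` attribute of the `<track>` element.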

Besides VTT, you can also use TTML (Timed Text Markup Language), an XML-based format that is also supported by Flash and some other common technologies. When migrating from Flash to HTML5 this may be the best option, because an existing transcription document can be reused.

<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="ru">
  <body>
    <div>
      <p begin="00:00:02.00" end="00:00:07.00">   :   ? </p>
      <p begin="00:00:09.00" end="00:00:11.00">   : . </p>
      <!--   . -->
    </div>
  </body>
</tt>
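If an existing TTML document has to be turned into VTT instead, the mapping between the two formats is fairly direct: each `<p>` element becomes one cue. The sketch below illustrates this with Python's standard `xml.etree.ElementTree` module. It is only an illustration under stated assumptions: the function names and the sample document are hypothetical, and only clock times of the form hh:mm:ss with an optional fraction are handled (frame-based or offset times such as `2s` are not).

```python
import xml.etree.ElementTree as ET
from html import escape

TTML_NS = "{http://www.w3.org/ns/ttml}"


def ttml_time_to_vtt(value):
    """Convert an hh:mm:ss[.fraction] clock time to WebVTT's hh:mm:ss.ttt."""
    hours, minutes, seconds = value.split(":")
    total_ms = round((int(hours) * 3600 + int(minutes) * 60 + float(seconds)) * 1000)
    hours_out, rest = divmod(total_ms, 3_600_000)
    minutes_out, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours_out:02d}:{minutes_out:02d}:{secs:02d}.{millis:03d}"


def ttml_to_vtt(ttml_source):
    """Turn every <p begin=... end=...> element into a numbered WebVTT cue."""
    root = ET.fromstring(ttml_source)
    blocks = ["WEBVTT"]
    for number, p in enumerate(root.iter(f"{TTML_NS}p"), start=1):
        # Collapse whitespace and re-escape characters special to WebVTT.
        text = escape(" ".join("".join(p.itertext()).split()), quote=False)
        begin = ttml_time_to_vtt(p.get("begin"))
        end = ttml_time_to_vtt(p.get("end"))
        blocks.append(f"{number}\n{begin} --> {end}\n{text}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample document with placeholder speech.
sample = """<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <p begin="00:00:02.00" end="00:00:07.00">Speaker 1: First line of speech?</p>
      <p begin="00:00:09.00" end="00:00:11.00">Speaker 2: Reply.</p>
    </div>
  </body>
</tt>"""

converted = ttml_to_vtt(sample)
print(converted)
```

Inline styling and nested elements inside `<p>` are flattened to plain text here; a real migration would need to decide how to map them.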

So the workflow is: transcribe the audio track of the video, prepare a transcription file with timing markup, and attach it inside the <video> tag with a <track> element.
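Before attaching a prepared file, it can be worth running a quick sanity check on it. The sketch below is a hypothetical helper, not a full WebVTT validator: it only checks that the file starts with WEBVTT, that every cue ends after it starts, and that cues appear in order of start time. It assumes timestamps are always written in the full hh:mm:ss.ttt form, as in the example above.

```python
import re

# Matches a timing line such as "00:00:02.000 --> 00:00:07.000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert timestamp fields (as strings) to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def check_vtt(text):
    """Return a list of human-readable problems found in a VTT document."""
    problems = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        problems.append("file does not start with WEBVTT")
    previous_start = -1
    for line_number, line in enumerate(lines, start=1):
        match = TIMING.match(line)
        if not match:
            continue  # not a timing line: cue number, cue text, or blank
        fields = match.groups()
        start, end = to_ms(*fields[:4]), to_ms(*fields[4:])
        if end <= start:
            problems.append(f"line {line_number}: cue ends before it starts")
        if start < previous_start:
            problems.append(f"line {line_number}: cue starts earlier than the previous cue")
        previous_start = start
    return problems


# Hypothetical sample input.
good = "WEBVTT\n\n1\n00:00:02.000 --> 00:00:07.000\nHello\n"
bad = "1\n00:00:09.000 --> 00:00:05.000\nOops\n"
print(check_vtt(good))
print(check_vtt(bad))
```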

Transcription can be outsourced, especially when there is a large amount of material to process. This may be relevant, for example, for government sites, which are required to make all content accessible to users with any kind of disability, including text versions of voice recordings for deaf visitors.

Source: https://habr.com/ru/post/187046/
