This article reflects the author's personal experience and makes no claim to scientific accuracy. Corrections and additions are welcome. For those who do not want to read further, the short answer to the main question is: MJPEG.
Introduction
Clients often use archaic ways of handing over project materials, and it is not uncommon to receive a 5-minute film compressed down to 20 MB as an email attachment. A file meant only for preview then becomes the working material, which brings a number of non-obvious problems. The main ones are low image detail (caused by excessive compression) and the use of video codecs that are not intended for audio editing.
Low detail, pixelation and general blur hamper the sound engineer from the very start, when the plot and the visual elements of the film that can be given sound are being assessed. This leads to aesthetic mismatches: for example, an object designed as plastic ends up sounding like metal or glass.
Poor picture quality also makes it hard for the sound engineer to pinpoint where dynamic visual events begin and end, which leads to loose synchronization with the audio. Most often, though, sound that lags behind or runs ahead of the picture is caused by the properties of the video codec used during audio editing, as discussed below.
General information
But first, a few words about what a video file actually is. In short, it is a container: a metafile that holds several streams of data. Streams include audio, video, images, subtitles, menus, chapter information, metadata, tags, etc. A container can hold several streams of the same type at once (for example, 2 video tracks, 3 audio tracks, subtitles in several languages), and each of them can be compressed with a different codec. It is worth recalling a few terms:
- mux - packing multiple streams into one container
- demux - extracting streams from a container into separate files
- remux - replacing one or more streams in a container
All these operations are lossless, i.e. they do not alter the contents of the streams.
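As a rough illustration of the three terms, the sketch below builds ffmpeg argument lists for demux, mux and remux. The function names and file names are hypothetical and chosen only for this example; the flags used (`-map`, negative mapping `-map -0:a`, and `-c copy` for stream copy without re-encoding) are standard ffmpeg options. The lists are only constructed here, not executed.

```python
from typing import List


def demux_video(src: str, dst: str) -> List[str]:
    """Extract the first video stream into its own file (no re-encoding)."""
    return ["ffmpeg", "-i", src, "-map", "0:v:0", "-c", "copy", dst]


def mux(video: str, audio: str, dst: str) -> List[str]:
    """Pack one video file and one audio file into a single container."""
    return ["ffmpeg", "-i", video, "-i", audio,
            "-map", "0:v:0", "-map", "1:a:0", "-c", "copy", dst]


def remux_audio(src: str, new_audio: str, dst: str) -> List[str]:
    """Keep everything from src except its audio, and add new_audio instead."""
    return ["ffmpeg", "-i", src, "-i", new_audio,
            "-map", "0", "-map", "-0:a", "-map", "1:a", "-c", "copy", dst]


if __name__ == "__main__":
    # Hypothetical file names, printed as shell-style command lines.
    print(" ".join(demux_video("film.mkv", "video_only.mkv")))
    print(" ".join(mux("video_only.mkv", "mix.wav", "film_with_mix.mkv")))
    print(" ".join(remux_audio("film.mkv", "mix.wav", "film_new_audio.mkv")))
```

Because every command uses `-c copy`, the streams are copied as they are, which is what makes these operations lossless.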
AVI, FLV, MOV, MP4, MKV, OGG, TS and WebM are not video codecs but containers, so the file extension says almost nothing about the nature of the content. Video codecs are DivX, Xvid, H.264, MPEG, MJPEG, Theora, VP9 and so on, and they come in three types: lossy, lossless and intra-only. It is the codec that determines image quality and suitability for audio editing. Intra-only codecs are discussed below; the principles of the first two types are well described in this article.
In short, the codec divides the video stream into groups of frames (GOP, Group Of Pictures). To reduce file size, only the first frame of each GOP (the I-frame) is stored in full, while the rest (B- and P-frames) contain only information about changes in the picture. As a result, a typical GOP looks like this: IBBPBBPBBP. The more heavily the video is compressed, the higher the threshold a change must exceed to be recorded in B- and P-frames. The longer the GOP, the more problems there are when scrubbing (stuck frames, etc.), because the decoder has to rebuild a frame from the preceding frames of its GOP. Hence the conclusion: lossy and lossless codecs are conditionally suitable for audio editing only if the video was encoded with a small GOP size.
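To get a feel for why GOP length matters when scrubbing, here is a deliberately simplified model. It assumes every GOP starts with an I-frame and that showing a frame requires decoding everything from that I-frame up to the target; B-frame reordering and decoder caching are ignored. The function name `frames_to_decode` is made up for this sketch.

```python
def frames_to_decode(target: int, gop_length: int) -> int:
    """Frames that must be decoded to display frame `target` (0-based).

    Simplified model: decoding starts at the I-frame that opens the GOP
    containing the target and runs forward up to the target itself.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    if target < 0:
        raise ValueError("target must be non-negative")
    return target % gop_length + 1


if __name__ == "__main__":
    # Jump to frame 299 with different GOP lengths (1 means intra-only).
    for gop in (250, 12, 1):
        print(gop, frames_to_decode(299, gop))
```

In this model, jumping to frame 299 costs 50 decoded frames with a 250-frame GOP, 12 frames with a 12-frame GOP, and a single frame when every frame is an I-frame.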
Synchronization
Streams are synchronized by means of timestamps, which are generated by the codec during encoding and decoding. If an error occurs at that point, the codec skips the affected frames and assigns the timestamp of the problematic packet to the next intact one. As a result, the “broken” stream falls out of sync with the others. When such a file is converted again to a lossy or lossless format, the effect may get worse.
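The size of the resulting offset can be estimated with a small calculation. The sketch below takes the model described above at face value (each skipped frame hands its timestamp to the next intact frame, so everything after it is shown one frame duration early) and assumes a constant frame rate. The function name and the sample numbers are illustrative only.

```python
from typing import Iterable


def drift_seconds(frame_index: int, fps: float, broken: Iterable[int]) -> float:
    """Offset accumulated by the time the original frame `frame_index` is shown.

    Each broken frame before `frame_index` shifts the rest of the video
    stream earlier by one frame duration (1 / fps).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    skipped = sum(1 for b in set(broken) if b < frame_index)
    return skipped / fps


if __name__ == "__main__":
    broken_frames = [100, 101, 102, 500]  # hypothetical damaged frames
    print(drift_seconds(50, 25.0, broken_frames))
    print(drift_seconds(1000, 25.0, broken_frames))
```

With four damaged frames at 25 fps, the picture ends up about 0.16 s away from the audio by frame 1000, which is already noticeable on sharp sync points.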
Intra-frame
A distinctive feature of intra-frame codecs is that every frame in the stream is a key frame (I-frame); there are no dependent intermediate frames. One of the most popular codecs of this type is MJPEG (Motion JPEG). It converts the video into a sequence of independently compressed JPEG images.
Pros of MJPEG:
- fast conversion speed
- smooth rewind
- suitable for audio editing
Cons:
- file size can be quite large
You can convert any video file to MJPEG using the ffmpeg utility. The command will look something like this:
ffmpeg -i input.avi -c:v mjpeg -q:v 1 -c:a copy output.mov
To avoid typing this command every time, create a script like this (for Windows) and simply drag and drop video files onto it (several at once is fine):
for %%A in (%*) do ffmpeg -i "%%~A" -c:v mjpeg -q:v 1 -c:a copy "%%~nA_mjpeg.mov"
For convenience, a shortcut to this script can be placed in the SendTo folder (in the shortcut properties you will need to clear the “Start in” field).
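For those not on Windows, roughly the same batch conversion can be sketched in Python. This is only an approximation of the script above: the helper name `mjpeg_command` is made up, it assumes `ffmpeg` is on the PATH, and unlike the batch version it writes each result next to its source file rather than into the current directory.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def mjpeg_command(src: Path) -> List[str]:
    """Build the ffmpeg call that converts src to MJPEG video in a MOV container."""
    dst = src.with_name(src.stem + "_mjpeg.mov")  # film.avi -> film_mjpeg.mov
    return ["ffmpeg", "-i", str(src),
            "-c:v", "mjpeg", "-q:v", "1",  # same video settings as the command above
            "-c:a", "copy",                # audio is copied untouched
            str(dst)]


def convert_all(paths: List[str]) -> None:
    """Convert every given file in turn; stop on the first ffmpeg error."""
    for name in paths:
        subprocess.run(mjpeg_command(Path(name)), check=True)


if __name__ == "__main__":
    convert_all(sys.argv[1:])
```

Saved as, say, `to_mjpeg.py` (a hypothetical name), it would be called as `python to_mjpeg.py film1.avi film2.mp4`.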
Conclusion
Ask the client in advance to send the video in good quality (lossless, or lossy without heavy compression), then convert whatever video you receive to MJPEG and do the sound work against that file. When the sound is ready, remux the client's video, adding your audio track to the container. Some containers have limitations: for example, MP4 does not support PCM audio streams (WAV / AIFF), so in that case the sound will have to be converted to MP3 or ALAC. A detailed compatibility table is available on Wikipedia.
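The final step can be sketched as a small helper that picks the audio handling from the target container. It encodes only the one limitation mentioned above (PCM in MP4) and falls back to ALAC in that case; the function name, the `PCM_UNFRIENDLY` set and the file names are assumptions made for this example, and whether a given ffmpeg build accepts a particular codec in a particular container should be checked against the compatibility table.

```python
from pathlib import Path
from typing import List

# Assumption taken from the text above: MP4 cannot carry PCM audio.
PCM_UNFRIENDLY = {".mp4"}


def final_remux_command(client_video: str, mix_wav: str, dst: str) -> List[str]:
    """Build an ffmpeg call that puts the finished mix onto the client's video."""
    if Path(dst).suffix.lower() in PCM_UNFRIENDLY:
        audio_codec = "alac"  # lossless re-encode, since PCM cannot be copied in
    else:
        audio_codec = "copy"  # container accepts PCM, keep the WAV data as is
    return ["ffmpeg", "-i", client_video, "-i", mix_wav,
            "-map", "0:v", "-map", "1:a",  # video from the client, audio from the mix
            "-c:v", "copy", "-c:a", audio_codec, dst]


if __name__ == "__main__":
    print(" ".join(final_remux_command("client.mp4", "mix.wav", "final.mp4")))
    print(" ".join(final_remux_command("client.mov", "mix.wav", "final.mov")))
```

The video stream is always copied, so the client's picture is returned exactly as it was received; only the audio track changes.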