This is a continuation of the first article. Here I will talk more about the theory and practice of video encoding with Xvid and present an improved version of my matrix, with a better quality-to-size ratio. Details under the cut.
The Xvid codec recently turned 7 years old, and this principle of video compression has been around for 15 years, yet there is little useful information about encoding on the Russian-language Internet. For starters, you can read these:
video compression,
MPEG-4,
Xvid.
I’ll only briefly touch on the theory (to be honest, I don’t know much more myself). Before encoding, the picture is divided into 8x8 blocks. For each block, the average brightness and color values are found, and all pixels are expressed as a mathematical dependence on those averages. This is done with a discrete cosine transform, and the resulting matrix of values is then quantized using quantization matrices: the Intra matrix for key frames and the Inter matrix for all others. The upper left corner of each matrix is used to quantize values that differ little from the average, and the lower right corner is used for values that differ greatly from it. The larger a coefficient in the matrix, the more coarsely the brightness and color values are quantized. Coefficients lie in the range 8-255. Quantization does not end there: before final packing, all values of the block are also divided by the quantizer specified in the Xvid codec settings. This is the Target quantizer, highlighted in red in the settings screenshot that follows the screenshot of the matrix.
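To make the block pipeline above a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `dct_2d` and `quantize_block` are made up for this example, and the step formula (matrix entry times quantizer, divided by 16) is a simplified model, not Xvid's exact arithmetic.

```python
import math

N = 8  # block size described in the text


def dct_2d(block):
    """Naive 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * total
    return out


def quantize_block(coeffs, matrix, quantizer):
    """Divide each coefficient by its step and drop the fractional part.

    Assumption of this sketch: step = matrix entry * quantizer / 16.
    """
    return [[int(coeffs[v][u] / (matrix[v][u] * quantizer / 16.0))
             for u in range(N)]
            for v in range(N)]


def count_nonzero(levels):
    """Number of coefficients that survived quantization."""
    return sum(1 for row in levels for value in row if value != 0)
```

In this model, `out[0][0]` (the upper left corner) carries the block average, and the further a coefficient is from that corner, the finer the detail it describes. Passing a matrix filled with 255 instead of 8 to `quantize_block` leaves fewer non-zero values, which is the coarsening described above.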
Quantization is the coarsening of a signal. During encoding, the signal value is divided by a certain number, and during decoding it is multiplied by the same number. Since the fractional part is discarded, large quantizer values lose a lot of detail, and the 8x8 blocks become too noticeable. On the other hand, the file becomes very small. So the task of encoding is to reduce blockiness while keeping the file small, and for that, developing your own matrix is a must.
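The divide-then-multiply round trip can be tried on a few numbers. This is a toy sketch with made-up names (`quantize_value`, `restore_value`, `round_trip`), working on plain integers rather than real video data:

```python
def quantize_value(value, quantizer):
    """Encoder side: divide and discard the fractional part."""
    return int(value / quantizer)


def restore_value(level, quantizer):
    """Decoder side: multiply back by the same quantizer."""
    return level * quantizer


def round_trip(values, quantizer):
    """Encode and decode a list of values with one quantizer."""
    return [restore_value(quantize_value(v, quantizer), quantizer)
            for v in values]


samples = [1, 3, 7, 10, 13]
print(round_trip(samples, 2))  # [0, 2, 6, 10, 12]
print(round_trip(samples, 4))  # [0, 0, 4, 8, 12]
```

With quantizer 4, more of the small values collapse to zero and the rest land on a coarser grid, which is where the lost detail comes from.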
Types of frames. Simply put, frames are either key frames or non-key frames. Non-key frames depend on key frames and describe only the change in the picture (motion). The screenshot of the Xvid settings shows the default key frame interval of 300; these settings open with the lower “more” button in the main Xvid settings window. Since a non-key frame takes about 5 times less space than a key frame, it is non-key frames that we owe the small file size to. A value greater than 300 is usually not used, because this interval already covers 10-12 seconds of viewing. With this setting, key frames take up 5-7% of the file (not counting audio).
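The numbers in this paragraph are easy to check with a rough model. The sketch below assumes frame rates of 25 and 30 fps and treats every key frame as exactly 5 times larger than a non-key frame; both helper names are invented for the example.

```python
def interval_seconds(interval_frames, fps):
    """How many seconds of video one key frame interval covers."""
    return interval_frames / fps


def keyframe_share(avg_interval_frames, size_ratio):
    """Share of the video stream taken by key frames.

    Assumes one key frame per avg_interval_frames frames and that a key
    frame is size_ratio times larger than a non-key frame.
    """
    key_part = size_ratio
    non_key_part = avg_interval_frames - 1
    return key_part / (key_part + non_key_part)


print(interval_seconds(300, 30))  # 10.0
print(interval_seconds(300, 25))  # 12.0
print(round(keyframe_share(300, 5) * 100, 1))
print(round(keyframe_share(80, 5) * 100, 1))
```

In this simple model a key frame exactly every 300 frames gives a share below 2%, so a figure of 5-7% would correspond to a key frame roughly every 70-100 frames on average. That fits a codec that also inserts key frames at scene changes before the maximum interval is reached, but this reading is an assumption of the sketch, not a measurement.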
In the screenshots below, I will show how the quantization matrix coefficients and the quantizer affect picture quality.
For the experiment, I used the following Intra matrix:

As you can see, I set the first coefficients to the maximum to see where the greatest distortion would appear.
All pictures are shown at 300% scale. This is the picture before compression:

This is compression with quantizer 2:

This is compression with quantizer 4:

As you can see, quantizer 4 makes the blocks more visible, and the coefficients of 255 in the matrix coarsen most of the brightness and color values, so that each block smears into a square spot. If the blocks were round, it would look like bokeh, an out-of-focus background. The details of the picture are still visible because we left the remaining coefficients at the minimum. Quantizer 1 is not recommended because of peculiarities of the codec, and I do not go above 4 because it distorts the blocks too much.
Now about my matrix. Its new version looks like this:

It can be used in three modes. The settings are the same in all modes except for one parameter, the Target quantizer (highlighted in red).

I recommend setting the important options, highlighted in red, as in the screenshot (they are covered in detail in the first post).
1. High quality. Target quantizer 2. The quality is excellent, the file size is 80%. (100% is the file size obtained with the standard H263 matrix and the same settings; the sizes in all modes are measured against it.)
2. Compromise. Target quantizer 3. Quality is normal, size 54%.
3. Compressed. Target quantizer 4. Quality is satisfactory, size 40%.
As you can see, in the third mode the file is 2 times smaller than in the first (without audio), but we sacrifice quality significantly. The word “significantly” is very relative here, though: I compared samples opened side by side in separate VirtualDub windows, frame by frame at 200% and 300%. If you encode a 1280x720 movie in the third mode and simply watch it in a player, you may not notice the difference. The first mode is recommended for low-resolution videos or for videos that are especially dear to you. The second mode suits most movies, and the third suits high-resolution video or commercials, where the material itself matters more than the quality.
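To estimate what the three modes mean for a particular file, the percentages can be applied to a reference size. The 1400 MB reference below is an arbitrary example value, and `estimate_sizes` is just a helper name for this sketch; the percentages themselves are the ones listed above.

```python
# Mode name -> (Target quantizer, size relative to the H263 matrix encode)
MODES = {
    "High quality": (2, 0.80),
    "Compromise": (3, 0.54),
    "Compressed": (4, 0.40),
}


def estimate_sizes(reference_mb):
    """Estimated video size in MB for each mode, without audio."""
    return {name: round(reference_mb * share, 1)
            for name, (_quantizer, share) in MODES.items()}


for name, size_mb in estimate_sizes(1400).items():
    print(f"{name}: {size_mb} MB")
```

The ratio between the third and the first mode is 40 / 80 = 0.5, which is the "2 times smaller" mentioned above. Real results will vary with the source material.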
When choosing the third mode, do not forget about the audio: a 5.1 track at 384 kbps takes up a lot of space (usually about 300 MB for a film). To convert audio from any format to mp3, I recommend Format Factory. It is free and converts both video and audio. Also for reference: the minimum mp3 setting that still sounds acceptable is mono, 48 kbps, 44.1 kHz. Do not encode at 22 kHz, because it sounds much worse at any bitrate. Also, do not encode to 48 kHz if the source has a lower sampling rate or you do not know what it is. Changing the sampling rate usually leads to distortion. 44.1 kHz is the safest choice.
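The audio figures are simple arithmetic: bitrate times duration. A small sketch, assuming a film length of 104 minutes (picked only because it makes the numbers come out close to the 300 MB mentioned above) and counting 1 MB as 1,000,000 bytes; `audio_size_mb` is a name invented for the example.

```python
def audio_size_mb(bitrate_kbps, minutes):
    """Size of a constant-bitrate audio track in MB (1 MB = 10**6 bytes)."""
    total_bits = bitrate_kbps * 1000 * minutes * 60
    return total_bits / 8 / 1_000_000


print(round(audio_size_mb(384, 104), 2))  # 299.52
print(round(audio_size_mb(48, 104), 2))   # 37.44
```

So going from the 384 kbps track to 48 kbps mono mp3 frees up roughly 260 MB for a film of that length.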
Many will ask: why encode with a constant quantizer? Because in any other mode (two-pass, constant bitrate) you do not control the quantizer, and it is the quantizer that is primarily responsible for blockiness (see the experiment above). Also read about the advanced Xvid settings in the first post; without them, my matrix will not be effective.
Happy encoding!