The TV / PC video range problem

Hi, Habr!

I want to talk about some recent research of mine into the mismatch between TV and PC ranges during video compression and playback. The problem is fairly minor, but at the same time very widespread; because of it, I often blamed codecs for changing colors.


Example

Imagine you have just come back from a vacation in town N, where you took superb photos of event M with your undoubtedly excellent camera F. You copy the photos to your computer and you are itching to make a video collage and show all this beauty to your friends. No sooner said than done: you install your favorite video editor, correct the contrast and gamma, and do everything to make it look as good as possible. Now the project is ready, and a freshly baked video file sits in front of you. You play it to check and... something is wrong, as if the contrast and saturation had been reduced. Maybe another codec would help? You try another codec and get the same thing. Okay, the main thing is that everything looks fine on DVD. You burn the clip to a DVD and the colors really do fall into place... How come?

Prehistory

It all goes back to the days of analog TVs and that amazing thing, the kinescope (picture tube). The first picture tubes had no color and displayed only shades of brightness: the more intense the beam the electron gun directed at a given point, the brighter that point glowed. It is that simple. The signal was transmitted to the TV in the same way, that is, only the brightness component of the picture was broadcast.

But a black and white picture did not satisfy people for long; after all, our eyesight is in color, so color television appeared very soon. Color TVs also used a picture tube, but with red, green, and blue dots to convey color, just like in modern monitors. That raised a question: how do you transmit the signal while staying backward compatible with black and white TVs? Of course, it would have been possible to broadcast three color channels on a neighboring frequency in addition to brightness, but first, the frequency range allocated to a channel was rather narrow, and second, it is more complicated, because we would end up with 4 channels in total. So mathematicians and engineers came up with the idea of transmitting, in addition to the brightness channel, two "chroma" channels, blue and red, each obtained as the difference between the corresponding color channel and brightness. This turns RGB into the color space (Y, Cb = B - Y, Cr = R - Y). The entire color range can be computed from these three channels. Moreover, human vision is less sensitive to color than to brightness, which made it possible to halve the resolution of the color channels with almost no loss of image quality. That is, the luminance channel of a PAL frame has a resolution of 720x576, while the color for it is transmitted at 360x576 (so-called 4:2:2 chroma subsampling). But how do you convert brightness (Y) and the chroma channels (Cb, Cr) to RGB and back?

So in 1982 a standard for YCbCr <=> RGB conversion was established, called CCIR 601 (since 1992, BT.601). Based on the results of numerous experiments on human color perception, it defines brightness as a weighted sum of red, green, and blue with coefficients 77/256, 150/256, and 29/256, respectively.
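The weighted sum above can be sketched in a few lines of Python. This is only an illustration using the integer coefficients quoted in the text; the function name luma_bt601 is made up for this example and does not come from any library.

```python
def luma_bt601(r, g, b):
    """Approximate BT.601 brightness from 8-bit R, G, B.

    Uses the integer weights 77, 150 and 29 (they sum to 256),
    so the division by 256 can be done with a bit shift.
    """
    return (77 * r + 150 * g + 29 * b) >> 8


print(luma_bt601(255, 255, 255))  # white      -> 255
print(luma_bt601(0, 255, 0))      # pure green -> 149
print(luma_bt601(255, 0, 0))      # pure red   -> 76
print(luma_bt601(0, 0, 255))      # pure blue  -> 28
```

Green contributes the most to perceived brightness and blue the least, which is why a pure green pixel comes out much brighter than a pure blue one.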

The RGB => YCbCr conversion produces a reduced range for luminance (16-235) and for chroma (16-240). According to the standard, the values 0 and 255 can be used for synchronization, while the values 1-15 and 236-254 fall outside the nominal range and are displayed as black and white, respectively. Later these narrowed ranges were carried over to digital video, and as a result the narrow range became the standard for video. For high-definition video a different conversion standard, BT.709, was developed; it differs from BT.601 only in its coefficients.
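As a rough sketch of what this narrowing means numerically, the Python below scales full-range 8-bit values into the TV ranges. The function names are invented for this example, and the rounding is one reasonable choice rather than the behavior of any particular encoder.

```python
def full_to_tv_luma(y):
    """Map full-range luma 0-255 onto the TV range 16-235 (219 steps)."""
    return 16 + round(219 * y / 255)


def full_to_tv_chroma(c):
    """Map full-range chroma 0-255 (centered on 128) onto 16-240 (224 steps)."""
    return 128 + round(224 * (c - 128) / 255)


print(full_to_tv_luma(0), full_to_tv_luma(255))      # 16 235
print(full_to_tv_chroma(0), full_to_tv_chroma(255))  # 16 240
print(full_to_tv_chroma(128))                        # 128, neutral gray stays neutral
```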

So how should video be compressed and played correctly? A video file compressed in one of the luma-chroma schemes (which covers the vast majority of codecs; RGB is used only in uncompressed video) should be encoded in the narrow TV range of brightness (16-235). Since the monitor still takes RGB, the decoder must convert YCbCr to RGB with the full 0-255 range. This is an almost perfect scheme. Why only almost? Here is why:
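The expansion a decoder is expected to perform on the luminance channel can be sketched as follows. This is a simplified illustration for gray levels only; the name tv_to_full_luma is hypothetical, and a real decoder also has to handle chroma and the color matrix.

```python
def tv_to_full_luma(y):
    """Stretch TV-range luma (16-235) to full range (0-255).

    Values below 16 or above 235 would land outside 0-255,
    so they are clipped to black and white.
    """
    value = round((y - 16) * 255 / 219)
    return max(0, min(255, value))


print(tv_to_full_luma(16))   # 0   (TV black becomes PC black)
print(tv_to_full_luma(235))  # 255 (TV white becomes PC white)
print(tv_to_full_luma(126))  # 128 (mid gray)
print(tv_to_full_luma(5))    # 0   (below nominal black, clipped)
```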

Artificial compression of the range: out of 256 shades of gray we are left with 220 (instead of 8 bits we effectively get a little more than 7), and there is no objective reason to compress the range other than compatibility. We deliberately degrade the quality of the picture.
Each pixel travels from the file to a point on the monitor through a whole chain of filters (decoder, player, renderer, video driver). Many conversions can happen along this long chain, and quality is lost with every repeated narrowing of the range.
Some filters ignore the narrowing of the range, and some videos are mistakenly encoded in the full range instead of the narrow one; other filters then try to correct this, and the result is even more confusion.
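The first point is easy to check with a small experiment. The sketch below pushes all 256 gray levels through a narrowing and an expansion step and counts how many distinct shades survive; the helper names are made up for this example and the rounding is an assumption.

```python
import math


def full_to_tv(y):
    """Narrow full-range luma 0-255 to the TV range 16-235."""
    return 16 + round(219 * y / 255)


def tv_to_full(y):
    """Expand TV-range luma back to 0-255, clipping overshoot."""
    return max(0, min(255, round((y - 16) * 255 / 219)))


# Distinct levels after narrowing, and after a full round trip.
tv_levels = {full_to_tv(y) for y in range(256)}
round_trip = {tv_to_full(full_to_tv(y)) for y in range(256)}
effective_bits = math.log2(len(tv_levels))

print(len(tv_levels))            # 220
print(len(round_trip))           # 220
print(round(effective_bits, 2))  # 7.78
```

Even a single narrow-then-expand pass leaves only 220 of the 256 shades, so the gradient comes back with small gaps; every additional conversion in the filter chain can only make this worse.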

Experiment

I decided to run an experiment and test how different players and drivers behave when playing video encoded in the standard narrow range (16-235) and in the full range (0-255). For this I took a PNG image with a gray gradient from 0 to 255 and fed it through AviSynth to x264, the most popular modern encoder. I used three avs scripts. The first reads the picture and passes it on as is, in RGB (like uncompressed video):

rgb.avs

 ImageReader ("palette.png", end = 24)

In the second and third scripts, I converted to the YV12 color space in the full range, using the BT.601 and BT.709 matrices respectively:

pc601.avs

 ImageReader ("palette.png", end = 24)
 ConvertToYV12 (matrix = "PC.601")

pc709.avs

 ImageReader ("palette.png", end = 24)
 ConvertToYV12 (matrix = "PC.709")

Next, I encoded several variants. The point is that x264 has two parameters that can affect the result: --input-range [TV, PC] and --range [TV, PC]. In older x264 versions, the --fullrange option was responsible for this.

colortest.cmd

 x264.exe --preset veryslow --crf 1 --output rgb.mp4 rgb.avs
 x264.exe --preset veryslow --crf 1 --input-range TV --range TV --output rgb-tv-tv.mp4 rgb.avs
 x264.exe --preset veryslow --crf 1 --input-range TV --range PC --output rgb-tv-pc.mp4 rgb.avs
 x264.exe --preset veryslow --crf 1 --input-range PC --range TV --output rgb-pc-tv.mp4 rgb.avs
 x264.exe --preset veryslow --crf 1 --input-range PC --range PC --output rgb-pc-pc.mp4 rgb.avs

 x264.exe --preset veryslow --crf 1 --output pc601.mp4 pc601.avs
 x264.exe --preset veryslow --crf 1 --input-range TV --range TV --output pc601-tv-tv.mp4 pc601.avs
 x264.exe --preset veryslow --crf 1 --input-range TV --range PC --output pc601-tv-pc.mp4 pc601.avs
 x264.exe --preset veryslow --crf 1 --input-range PC --range TV --output pc601-pc-tv.mp4 pc601.avs
 x264.exe --preset veryslow --crf 1 --input-range PC --range PC --output pc601-pc-pc.mp4 pc601.avs

 x264.exe --preset veryslow --crf 1 --output pc709.mp4 pc709.avs
 x264.exe --preset veryslow --crf 1 --input-range TV --range TV --output pc709-tv-tv.mp4 pc709.avs
 x264.exe --preset veryslow --crf 1 --input-range TV --range PC --output pc709-tv-pc.mp4 pc709.avs
 x264.exe --preset veryslow --crf 1 --input-range PC --range TV --output pc709-pc-tv.mp4 pc709.avs
 x264.exe --preset veryslow --crf 1 --input-range PC --range PC --output pc709-pc-pc.mp4 pc709.avs

This gave me 15 files. Checking the histograms with AviSynth showed that when --range and --input-range are not specified, the video is encoded exactly as supplied; otherwise x264 itself converts the ranges. That is, only pc601.mp4 and pc709.mp4 produce a smooth histogram. Since these two standards differ only in their coefficients, there is no difference between them for our gray scale, so I will test only two files: rgb.mp4 and pc601.mp4 (narrow and full range, respectively).

I tested playback of these files on 4 computers, all running Windows 7 with the ffdshow and K-Lite Codec Pack codecs. The results are in the table:

table

Explanation of the table:
normal - correct display
in - incorrect display: the 16-235 range is not scaled to 0-255
out - incorrect display: the 0-255 range is scaled by mistake, so the ends of the range are clipped
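The three outcomes in the legend can be reproduced on a gray ramp with a short simulation. This is a sketch under simplifying assumptions (luminance only, one expansion step); shown_levels and the other names are invented for this example.

```python
def expand_tv_to_pc(y):
    """Stretch a TV-range level (16-235) to PC range (0-255), clipping overshoot."""
    return max(0, min(255, round((y - 16) * 255 / 219)))


def shown_levels(encoded_levels, player_expands):
    """Levels that reach the screen, depending on whether the player expands the range."""
    return [expand_tv_to_pc(y) if player_expands else y for y in encoded_levels]


tv_file = list(range(16, 236))  # gradient encoded in the narrow range
pc_file = list(range(0, 256))   # gradient encoded in the full range

normal = shown_levels(tv_file, player_expands=True)     # "normal"
case_in = shown_levels(tv_file, player_expands=False)   # "in"
case_out = shown_levels(pc_file, player_expands=True)   # "out"

print(min(normal), max(normal))                  # 0 255: full contrast
print(min(case_in), max(case_in))                # 16 235: washed-out picture
print(case_out.count(0), case_out.count(255))    # 17 21: crushed blacks and whites
```

In the "out" case the 17 darkest shades all collapse to black and the 21 brightest to white, which is the clipping seen at the ends of the gradient.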

Here are screenshots of the results in the MPC-HC player:

It is worth noting that almost every filter has its own range settings. In my case there is such a setting in the ffdshow video decoder, in the LAV and Haali renderers, and in the video card driver settings, and you can also force a range conversion in the player (there is a special shader for it). However, for some reason switching the range in the ffdshow video decoder did not affect the result. The driver setting does not affect the result everywhere; where it did, I noted it in the table (the video card settings row). In addition, with DXVA and CUDA hardware acceleration only the 16-235 range is treated as correct, not to mention TVs.

I also tested two video editors at the same time:

VirtualDub - leaves the range untouched as long as there is no conversion from YV12 / YUY2 to RGB or back; otherwise it converts the range. Filters work in RGB, and the preview in the program performs a TV -> PC conversion.
AviDemux - does not change the range (it simply does not work with RGB sources). Filters work in YUY2, and the preview in the program performs a TV -> PC conversion.

Results

The table makes it clear that we end up with complete confusion: how the video is displayed depends on many parameters, such as the codec, the player, the renderer, and the driver. The main thing is to understand what the actual problem is and, if possible, to correct it.

The correct (although not ideal) range for video compression is TV (16-235); otherwise, in most cases the video will be displayed incorrectly, with the black and white ends of the range clipped. From the point of view of digital video it would be more logical to store and display the full range without any conversions, but for now this is the standard, and by not complying with it we get incorrect display on most devices.

How do we deal with this? The first thing that comes to mind is to record the range used in the file metadata. This is in fact already implemented (there is a fullrange flag), but unfortunately the flag is often ignored. Therefore:

developers - make sure your software honors this flag, and make your players / codecs display video correctly when a range conversion is needed;
users - if you have a problem with colors, try adjusting the settings (in the codec / player / driver), or try another player;
video encoders - be aware that this problem exists and encode correctly (use the 16-235 range or at least set the fullrange flag).

download archive with experiment


Source: https://habr.com/ru/post/136318/
