I am sure that many programmers are familiar with this formula:

Y = 0.299·R + 0.587·G + 0.114·B
And anyone who has worked closely with graphics knows these numbers literally by heart, the way support techs once memorized Windows serial numbers. Sometimes the coefficients are rounded to two digits, sometimes refined to four, but the canonical form is exactly this one.
It calculates the relative luminance of a color (in some contexts called luma; not to be confused with lightness or brightness) and is widely used to convert a color RGB image to grayscale and in related tasks.
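In code it usually shows up as a one-liner like the following (shown purely for illustration; as explained below, you should not actually use it, and the function name is mine):

```js
// The ubiquitous legacy (Rec. 601) luma formula, reproduced here only
// to illustrate what this article is arguing against.
function luma601(r, g, b) {
  return 0.299 * r + 0.587 * g + 0.114 * b;
}
```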
The formula is replicated and cited in thousands of articles, forum threads, and StackOverflow answers... But the fact is that its only rightful place is on the scrap heap of history.
It should not be used. And yet it is.
But why not? And where did these coefficients come from?
A mini-excursion into history
There is an international organization that develops recommendations (de facto standards) for television and radio communications: the ITU.
The parameters we are interested in are spelled out in Recommendation ITU-R BT.601, adopted in 1982 (the link points to the current edition). At this point you may already be a little surprised: where are we, and where is 1982? But that is only the beginning.
The numbers migrated there from Recommendation ITU-R BT.470 of 1970 (the current revision is likewise available at the link).
And those, in turn, are the legacy of the YIQ color model, developed for the North American television system NTSC back in 1953! It has just about nothing to do with today's computers and gadgets.
Does this remind anyone of the story tracing the dimensions of space rockets back to the width of an ancient Roman horse's rear?
Modern colorimetric parameters began to crystallize in 1970 with the modernization of the PAL/SECAM systems. Around the same time the Americans came up with the SMPTE-C specification for their own phosphors, but NTSC did not switch to it until 1987. I do not know for certain, but I suspect that the very birth of the notorious Rec. 601 is explained precisely by this delay; by and large, its coefficients were already obsolete the moment they appeared.
Then, in 1990, the new Recommendation ITU-R BT.709 arrived, and in 1996 the sRGB standard was devised on its basis; it conquered the world and (in the consumer sector) reigns to this day. Alternatives to it exist, but they are in demand only in narrow, specialized fields. Twenty years have passed, no less. Isn't it finally time to get rid of these atavisms?
So what exactly is the problem?
One might think that those coefficients reflect certain fundamental properties of human vision and therefore never expire. That is not quite true: among other things, the coefficients are tied to the color reproduction technology.
Any RGB space (and YIQ is a transform over an RGB model) is defined by three basic parameters (a sketch with sRGB's actual values follows the list):
1. The chromaticity coordinates of the three primary colors (the primaries);
2. The chromaticity coordinates of the white point (reference white);
3. The gamma correction.
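For illustration, here is what those three parameters look like for sRGB; the values come from the sRGB standard (IEC 61966-2-1), while the object layout is just a sketch, not any particular library's API:

```js
// The three defining parameters of the sRGB space.
const sRGB = {
  // 1. Chromaticity (xy) coordinates of the primaries
  primaries: {
    r: { x: 0.64, y: 0.33 },
    g: { x: 0.30, y: 0.60 },
    b: { x: 0.15, y: 0.06 },
  },
  // 2. Chromaticity of the white point (illuminant D65)
  whitePoint: { x: 0.3127, y: 0.3290 },
  // 3. Gamma: a piecewise curve roughly equivalent to gamma 2.2
  gamma: "piecewise sRGB curve (~2.2)",
};
```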
Chromaticity coordinates are usually given in the CIE xyY system. Letter case matters here: lowercase xy are the coordinates on the chromaticity diagram (the well-known "horseshoe"), while capital Y is the luminance component of the CIE XYZ vector.
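As a reminder of how the two notations relate, here is the textbook xyY to XYZ conversion (a sketch; the function name is mine):

```js
// Convert CIE xyY to CIE XYZ using the standard relations
// X = xY/y, Z = (1 - x - y)Y/y.
function xyYtoXYZ({ x, y, Y }) {
  if (y === 0) return { X: 0, Y: 0, Z: 0 }; // degenerate case
  return { X: (x * Y) / y, Y, Z: ((1 - x - y) * Y) / y };
}
```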
Now let's look at the Y components of the NTSC primaries, alongside the sRGB values that will come up later (both rows are taken from the Bruce Lindbloom table cited below):

Space      Y(R)       Y(G)       Y(B)
NTSC RGB   0.298839   0.586811   0.114350
sRGB       0.212656   0.715158   0.072186
* The original spreadsheet, with many other spaces, is on Bruce Lindbloom's website.
Familiar digits, right? There is the answer to the question of where they came from.
And the problem is that the sRGB space used today differs significantly from that sixty-year-old system. It is not even that one of them is better or worse; they are simply different:
The triangle is wider and shifted to the side. The white point is different (by the way, illuminant C has long been deprecated in favor of the D-series illuminants in general and the most popular one, D65, in particular). The body of the color gamut is different. Accordingly, the results of luminance calculations will not match reality.
You may ask: why is the ancient NTSC gamut (almost the same as the Adobe RGB 1998 gamut!) so much larger than the modern sRGB one? I don't know. The kinescopes of that era obviously could not cover it. Perhaps they wanted to leave headroom for the future?
So what is correct?
The relative luminances of the primary colors in the sRGB space are given in the table above (the sRGB row), and those are the values to use. In practice they are usually rounded to four digits:

Y = 0.2126·R + 0.7152·G + 0.0722·B
The attentive reader will notice that the coefficient for R is rounded against the rules (downward), but this is not an error. The point is that the sum of all three numbers must equal exactly one, and "correct" rounding would introduce an error: 0.2126 + 0.7152 + 0.0722 = 1.0000 exactly. Pedants can take all six decimal places and not worry.
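As a minimal sketch (the function name is mine), the per-pixel computation is a one-liner:

```js
// Relative luminance with the sRGB coefficients. r, g, b are expected
// in the range 0..1. Note: this deliberately skips gamma decoding;
// see update 2 at the end of the article.
function luminance(r, g, b) {
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}
```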
This formula is enough for 99% of typical cases. It is the one used in the relevant W3C specifications (for example, the matrix filters in SVG).
If you need more precision, you will have to calculate L*, but that is a separate big topic.
A good StackOverflow answer provides starting points for further reading.
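For reference, converting relative luminance Y into lightness L* is itself short; here is a sketch using the constants from the CIE L*a*b* definition:

```js
// Convert relative luminance Y (0..1) into perceptual lightness L* (0..100)
// per the CIE L*a*b* definition.
function yToLstar(Y) {
  const eps = 216 / 24389;  // CIE epsilon, ~0.008856
  const kappa = 24389 / 27; // CIE kappa, ~903.3
  return Y <= eps ? Y * kappa : 116 * Math.cbrt(Y) - 16;
}
```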
If the image is in a different color space (Adobe RGB, ProPhoto RGB, etc.), the coefficients will be different as well; they can be found in the aforementioned Bruce Lindbloom table.
Why do I care?
As mentioned above, the formula has been replicated for many years across a myriad of sites, and those sit at the top of every search engine (for example). More serious sources often cite both formulas, but fail to draw a proper distinction between them, presenting them as equal alternatives. A typical example on StackOverflow:
Formula to determine the brightness of RGB color. The answers are quite detailed, but it is hard for someone not versed in the subject to make an informed choice.
To be fair, serious projects almost never suffer from such mistakes: their authors do bother to check the standards, and feedback from the audience works (although exceptions do happen). But an ordinary programmer who needs to solve the problem as quickly as possible types something like "rgb to grayscale" into a search engine and gets served, well, you know what. The formula keeps being found and copy-pasted to this day! Phenomenal vitality.
I spent about 20 minutes collecting these examples:
Note that alongside old projects, the list contains plenty of references to the latest and most modern technologies; that is, the code was written or copied quite recently.
And the trigger for writing this note was a slide from Vasilika Klimova's talk at HolyJS 2016, featuring that same prehistoric formula. Of course, the formula did not affect the main point of the talk, but it clearly demonstrates your chances of accidentally googling it up in 2016.
Summing up: if you see the sequence 299/587/114 in someone's current code, send its author a link to this article.
Update 1. The comments ask for examples. But that is not as simple as it seems.
If you take an arbitrary picture and convert it to black-and-white in both ways, it will prove nothing at all. The pictures will simply differ slightly, and the viewer will only be able to judge which option is subjectively more pleasing. But that is not the point! The point is which option is more correct, more accurate.
After a little thought, I sketched the following demo: codepen.io/dom1n1k/pen/LZWjbj
The script generates 2 × 100 random colors, selecting the components so that the luminance of every square is theoretically the same (Y = 0.5). That is, the whole field should be perceived as subjectively uniform as possible (uniform strictly in terms of brightness, leaving the differing hues aside).
On the left is the old "wrong" formula, on the right the new "correct" one. On the right the homogeneity is indeed noticeably higher. Although not perfect, of course: for greater accuracy one would have to calculate the perceptual lightness L*.
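The CodePen above has the actual source; a minimal sketch of the idea (variable and function names here are mine, not the demo's) could look like this:

```js
// Generate a random RGB color (components in 0..1) whose weighted
// luminance equals targetY. Pick R and G at random, solve for B,
// and retry whenever B lands outside the valid range.
function randomColorWithLuminance(targetY, [wr, wg, wb]) {
  for (;;) {
    const r = Math.random();
    const g = Math.random();
    const b = (targetY - wr * r - wg * g) / wb;
    if (b >= 0 && b <= 1) return [r, g, b];
  }
}

const OLD = [0.299, 0.587, 0.114];    // legacy Rec. 601 weights
const NEW = [0.2126, 0.7152, 0.0722]; // sRGB / Rec. 709 weights

// One square for the left half of the field, one for the right:
const leftColor = randomColorWithLuminance(0.5, OLD);
const rightColor = randomColorWithLuminance(0.5, NEW);
```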
Update 2. Another question has come up, about gamma. It has already been raised by at least three people, so I am adding it to the updates as well. The question is actually difficult and, in part, even philosophical (it deserves a separate article of its own).
Strictly speaking, yes: to convert a picture to black-and-white, the gamma must first be decoded. But in practice (in tasks not related to accurate colorimetry) this step is often omitted for the sake of simplicity and performance. For example, Photoshop takes gamma into account when converting to grayscale, but the CSS filter of the same name (MDN) does not.
From the point of view of the correctness of the result, the choice of weights and the gamma decoding are complementary things: both affect it. The only difference is that gamma requires additional computation, while the correct coefficients come for free.
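A sketch of the gamma-aware variant, using the standard sRGB transfer function (function names are mine):

```js
// Decode one sRGB-encoded channel (0..1) to linear light,
// per the piecewise sRGB transfer function.
function srgbToLinear(c) {
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// The "strict" relative luminance: decode gamma first, then apply
// the sRGB weights to the linearized values.
function luminanceLinear(r, g, b) {
  return 0.2126 * srgbToLinear(r)
       + 0.7152 * srgbToLinear(g)
       + 0.0722 * srgbToLinear(b);
}
```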
The second version of the demo, with gamma taken into account (the first one is still available): codepen.io/dom1n1k/pen/PzpEQX
The result is, of course, more accurate.