Yandex is able to suggest colors by their name and find ones close to them. Some time ago, this hint (inside we call such things “sorcerer”) had to be redone to fit the type of search results after their redesign. And we took advantage of this opportunity to work on it seriously, as it turned out that arranging colors linearly is a very nontrivial task.





In this post, I want to tell you what an interesting algorithmic problem that demanded immersion
in color theory , we had to solve almost all of Yandex to make the new sorcerer the way the team conceived it.
')

The color maker appeared in Yandex search results in 2008. When he was conceived, we wanted him to become not only a toy that would reveal the color of a frightened nymph’s hip to a person, but also help him learn the color by the formula, learn the color notation, translate the code from one color space to another. The old version of the witcher consisted of a drum, a color picker, an input and 234 named colors.
At first it seemed that it could be revived on a renewed issue in a couple of weeks. The first part of this task was to bring the appearance of the witcher to a new style of the search results page.
It quickly became clear that it would not be possible to assemble it on standard elements. In addition, a witcher drawn for the old search results page would not fit into the new version, and to make a three-dimensional drum, according to our designers, is no longer fashionable. Thus was born a flat design. The main thing that I wanted to support in the new version and what was not there before was the viewing of custom colors that had no name. In the old version, the sorcerer answered all requests with the nearest named flowers and could mislead people. Now, any color can be seen in the environment of the nearest named neighbors or in another brightness.

The chief search designer,
kovchiy, presented the team with a file with thousands of named colors from a personal archive. As a result, we had 1010 colors, the names and coordinates of which were partially fixed in standard color lists (such as HTML Color Names, X11 Color Names), and partly invented on their own. And when they began to think how to put them into a sorcerer, the most difficult things began. The team that worked on it decided to even arrange a competition for the best sorting of flowers in the internal social network of Yandex. It was there that I learned about the task and offered my solution, which eventually ended up in production. But first things first.
So, the task was to arrange a set of colors on the drum (on the line or ring) so that:
- transitions between adjacent colors were smooth, and perhaps less contrasted, so that the adjacent colors were similar;
- similar colors are grouped so that similar colors stand side by side.
It looked like the original set of colors without a given orderBut already these two criteria are:
- redundant - a grouping of similar colors increases the contrast, and a decrease in contrast separates the groups;
- insufficient - they allow many different solutions, with the equality of which, according to these criteria, some stand out among other less specific qualities, such as better visible meaningfulness of the location as a whole;
- vague - they indicate what the desired solution should look like, but they do not allow us to distinguish which of the good solutions is better.
As a result, both judges and participants relied more on taste than on matching criteria when comparing decisions.
Manually, without the help of algorithms, it will not be possible not only to find a beautiful order of even a smaller number of colors, but also to improve someone found. If it seems that some color is the place in another part of the drum, then rearranging, you find a sharp contrast: the color perception depends on the background, and the new neighbors of the displaced color, which look so similar to the background of nearby colors, suddenly find a
very different shade after moving .
If you select the arrangement of colors algorithmically, it is natural to see the optimization problem in the problem and the first step to solving it is to choose the optimization goal - a function that tells you how good or bad this or that alignment of colors is.
To construct this function, an auxiliary one, called
ΔE , is needed, which specifies the number of difference or degree of similarity between two colors. That is, if and only if the difference of one pair of close colors is equal to the difference of the other pair, it should appear to the observer that the colors in each of the pairs are equally distant from each other. ΔE is one of the fundamental problems of the science of color, but within certain limits it is well solved. Fortunately, the function of the difference between colors has long been studied in the science of color, and now, where necessary, the formula published by the international lighting committee together with the
CIE L * a * b * (LAB) color space in 1976 (in which the difference between colors are simply Euclidean distance between them) called CIE ΔE and refined in 1994 and 2000. Contestants who did not know about CIE ΔE, usually defined ΔE themselves as Euclidean distance in RGB and HSV.
Using ΔE, we can construct a function of assessing the quality of the arrangement of colors on the drum as the sum of the squares of the differences between adjacent colors on the drum. The decision between the participants begins to differ with the choice of the function of the quality of the decision. All good solutions used CIE ΔE; among themselves they differ in the choice of the function of the quality of the solution and the algorithm for its optimization.
Why is CIE ΔE somewhat different, how do they differ in meaning? The CIE LAB color space was created so that the colors are evenly spaced. It separates the brightness from the other two coordinates of the color, which it chooses so that a change in color by several units of brightness is perceived as the difference when changing other coordinates by the same units. But the mathematical simplicity of functions expressing the transformation of physical space (
CIE XYZ ) into LAB, apparently, was of significant importance. Subsequent experiments showed that the perceived color difference is not quite, although very close, consistent with the LAB. That is, LAB is not completely homogeneous. But the found corrections to CIE ΔE 1976 cannot be included in the new color space - that is, the space in which the corrected CIE ΔE would be the Euclidean distance, like the old CIE ΔE in CIE LAB, is impossible. Therefore, CIE LAB 1967 is still used.

After we have chosen the color space, we can continue to choose the most natural, simplest solution quality function - this is the sum of squares between adjacent arrangement colors. Optimization of this function leads to the smoothest transitions between neighboring colors (if you use CIE ΔE), and this solution is now used in the updated color charts. However, it does not specifically take into account that it may be better to arrange a group of neighboring colors in a row than to use its part, smoothly switch to other shades and return to the first one more time. In addition, it may be desirable to switch from color to color “in one direction”: so that after dark red, after a slightly lighter red, there is an even lighter, not darker shade. To take these wishes into account, participants added to the simple quality function a sum of squares of differences to more distant neighbors, or to the average color of all the nearest neighbors. This improved the appearance of the arrangement as a whole, but unfortunately, it often overly degraded the close view — the five-color set that the charmer shows at a time.
After selecting the quality function, you need to decide how to optimize it. If this function looks like the sum of squared differences between the colors next to it, the optimization task is almost equivalent to the optimization of
the traveling salesman’s (TSP) path visiting all the sites once and trying to minimize the total path. TSP is an NP-complex, but well-studied problem, for which a public access to an approximate solution has several excellent solvers, primarily
LKH and
Concorde . Using a specialized solver allows you to get a solution closer to the optimal and faster than is possible with the use of universal optimization methods.
Such universal methods are required if a more complex distribution quality function is chosen, taking into account the difference not only between the nearest neighbors.
Sorting option using TSP by Metric CIE94 by ZubchickTwo participants used the annealing method. This method is a consistent solution optimization. First, we take some arbitrary decision and then try to gradually improve it. At each step, we choose some simple transformation of the previous solution. For example, rearrange a pair of random colors or neighboring colors, or flip a segment of flowers, or put it in another place. After such a permutation, the quality of the previous and next solutions is compared in terms of the quality of the decisions.


Several options for sorting by annealing method from AOrazaevIf this were not an annealing method, but a gradient descent method, he would always move from a solution with a lower quality to a solution with a better quality. But such a method leads to local optima when it is impossible to improve the solution by some simple transformation, but if everything is radically changed, one can come to a better result. Therefore, the annealing method sometimes worsens the solution. Due to this, it is possible to get rid of local optima in the search for a global optimum. The solutions were not bad, but not good enough near.
Two of our other colleagues used genetic algorithms and either did not get a good result or used an unsuccessful quality function. Apart from the
XtremAlRaven version, where the independently received TSP solution was taken as a starting point, but which could not be significantly improved.
An attempt to improve the modified TSP in the CIE1994 metric from XtremAlRaven with a genetic algorithmSo far, I have said that the result can only be obtained algorithmically, but in addition to choosing a specific algorithm, some manual intervention was possible. First, some participants tried to group the colors around several selected points (say, around black, red, green) and optimize all zones separately, and then glue them together.
I tried to place the colors by hand-selected groups (so that similar saturated colors do not mix with each other), find the boundary colors so that they smoothly merge into each other, and sort each group between the boundary colors. This method allows you to find a solution that looks better from afar, because it turns out less than the same color spots in different places, but in the witcher, when only 5 colors are shown, it looks a little worse.
In fact, the task seemed difficult, because before the competition was announced, the sorcerer’s developers had a question of how well to arrange the colors for a month and a half. In the end, they themselves found a good solution, comparable in quality to those who later participated in the competition. In addition, if you do not know in advance some elements of color theory, it is difficult to understand how to solve this problem well. My decision was ready on the third day of the competition, most others in the first week.
Total sortHere the difficulty arose in the team that dealt with the sorcerer - they had to choose one of almost forty.