
Search by image: Google and beyond

First, a few general words about how image search can be organized.
Ideally, we would like a system that can analyze the contents of a picture, determine whether it shows a house, a lake, or a cat with kittens, remember various characteristics of the detected objects (color, size, relative placement), and then search over that information. Unfortunately, this is simply not feasible today: at a minimum, there is no method that can reliably pick out real-world objects in pictures.
Therefore, any system is forced to analyze less intelligent features, and these features come in several different types:

Edges
There are several edge-detection algorithms, and they work quite well and reliably; the main problem is that once the edges are extracted, it is still not clear what to do with them. You can compute the relative area covered by edges: ideally this would tell you whether an image is "busy" or "smooth", but in practice this criterion turns out to be weak. You can also look at the Fourier transform of the edge map, which can reveal whether the image contains pronounced periodic contours. Still, such information describes very poorly how the picture is perceived visually, so this class of features can serve only as a purely auxiliary one.
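To make those two measurements concrete, here is a minimal sketch, assuming OpenCV and NumPy are available; the file name and Canny thresholds are arbitrary placeholders, not values from the article:

```python
import cv2
import numpy as np

# Hypothetical input file; any photo will do.
img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Edge map: non-zero pixels mark detected edges (thresholds are guesses).
edges = cv2.Canny(img, 100, 200)

# Relative edge area: a crude "busy vs. smooth" score.
edge_density = np.count_nonzero(edges) / edges.size

# Magnitude spectrum of the edge map; strong off-center peaks
# would hint at pronounced periodic contours.
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(edges.astype(float))))

print(f"edge density: {edge_density:.3f}")
```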

Texture
A texture is an area of an image where there may be significant differences in brightness and color between adjacent points, but which is nevertheless perceived visually as a homogeneous region (grass or a water surface, for example). There are various methods that distinguish the boundaries between textures more or less well (some examples here: matlab.exponenta.ru/imageprocess/book2/55.php), and although they are quite computationally expensive, they can still be used in practice. The real question is what to do with this information. In fact, it yields roughly the same kind of features as edges do (an idea of how the image looks overall), only, in a sense, of higher quality. Accordingly, these features can also only be secondary, although they carry somewhat more weight.
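As an illustration of the idea (not of the methods linked above), here is a toy sketch that flags texture boundaries via jumps in a local-variance map; the window size and thresholds are arbitrary assumptions:

```python
import cv2
import numpy as np
from scipy.ndimage import uniform_filter

img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE).astype(float)

win = 15  # window size: an arbitrary choice
mean = uniform_filter(img, win)
mean_sq = uniform_filter(img * img, win)
local_var = mean_sq - mean * mean  # E[x^2] - (E[x])^2 per window

# Grass or water is noisy pixel-to-pixel but has fairly uniform local
# statistics, so edges of the variance map approximate texture boundaries.
var_u8 = cv2.normalize(local_var, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
texture_edges = cv2.Canny(var_u8, 50, 150)
```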

SIFT-like signature
The SIFT method (scale-invariant feature transform) selects a set of keypoints in the image (roughly speaking, at places where the second derivative of the image reaches local extrema, though strictly speaking it is a bit more involved) and uses the relative positions of those keypoints as the picture's characteristic. The method has several variations, primarily in how the keypoints are selected. This group of methods is very good at determining whether one image is a deformed copy of another; however, it is unsuitable for judging the similarity of two fundamentally different, albeit visually similar, images (two photos of the same kitten in different poses will have little in common under such methods). In a search system it can therefore only answer whether a modification of the query image exists in the database, and cannot find similar images in any other sense.
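A hedged sketch of how such a duplicate check might look with OpenCV's SIFT implementation (available in the main package since version 4.4); the file names and the 0.75 ratio threshold are assumptions:

```python
import cv2

img_a = cv2.imread("original.jpg", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("candidate.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_a, des_a = sift.detectAndCompute(img_a, None)
kp_b, des_b = sift.detectAndCompute(img_b, None)

# Lowe's ratio test: keep matches clearly better than the runner-up.
# Many survivors suggest img_b is a transformed copy of img_a.
matches = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} good matches out of {len(matches)}")
```

Note that, exactly as described above, this answers "is it a deformed copy?" rather than "does it look alike?".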
Color
Color is perhaps the most useful characteristic for analysis. First, it is a visually very important feature. Second, it is easy to walk through an image and find out exactly which colors occur there most often. Moreover, it turns out that most real images have no more than 6-7 dominant colors, and often only 3-4. This matters, because with such a small number of features, searching a database of images can be implemented even faster than searching a database of HTML pages.
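A minimal sketch of extracting such a compact color signature, using k-means clustering as one common approach (assuming OpenCV; k=5 reflects the 3-7 dominant colors observed above, and the file name is a placeholder):

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg")  # hypothetical file name
pixels = img.reshape(-1, 3).astype(np.float32)

k = 5
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, labels, centers = cv2.kmeans(pixels, k, None, criteria, 5,
                                cv2.KMEANS_RANDOM_CENTERS)

# Fraction of the image covered by each dominant color: this short
# (color, weight) list is the compact signature that makes indexing fast.
weights = np.bincount(labels.flatten(), minlength=k) / len(labels)
for c, w in zip(centers.astype(int), weights):
    print(f"BGR {tuple(c)}: {w:.1%}")
```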

Meta-information
And, of course, one should not forget that most images on the web do not exist in isolation: they are organized into photo albums with names and comments, or serve as illustrations for texts, or even come with detailed captions and ready-made tags. A full-fledged search system must squeeze the most out of this information in order to structure the image database thematically: since this cannot be done by analyzing the picture itself, one should get at it indirectly.
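A hypothetical sketch of harvesting that metadata from a page, assuming requests and BeautifulSoup; the URL and tag choices are illustrative, not a description of any real crawler:

```python
import requests
from bs4 import BeautifulSoup

html = requests.get("https://example.com/album").text  # placeholder URL
soup = BeautifulSoup(html, "html.parser")

for img in soup.find_all("img"):
    record = {
        "src": img.get("src"),
        "alt": img.get("alt", ""),      # alt text
        "title": img.get("title", ""),  # tooltip text
    }
    # <figure>/<figcaption> pairs often carry ready-made captions.
    fig = img.find_parent("figure")
    if fig and fig.figcaption:
        record["caption"] = fig.figcaption.get_text(strip=True)
    print(record)
```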

Now about Google
Analyzing Google's search itself, it is easy to find out that the first thing it tries to do is locate a similar image in its database of "popular images" (Wikipedia illustrations, for example, are included). Apparently some SIFT-like method is used: my attempts to confuse it with color correction gave no results (up to replacing blue with green, with the other colors shifted similarly), yet it failed to recognize an image with changed proportions, compressed horizontally by only 20%.
If the image is found in the popular database, Google pulls up its context and suggests similar images based on that context.

Things get most interesting when the image is not "recognized". In this case, Google offers a set of "visually similar" pictures, and it is easy to establish experimentally that the key similarity feature among them is precisely the set of colors present in the image, weighted by the area they occupy, perhaps supplemented by some information about texture or edges; but the key feature is definitely color.
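To see why color alone already gives a workable notion of "visually similar", here is a toy reconstruction of the idea (my guess at its spirit, certainly not Google's actual algorithm): compare coarse color histograms, which weight each color by the area it covers. File names are placeholders.

```python
import cv2

def color_signature(path, bins=8):
    img = cv2.imread(path)
    # Coarse 3-D histogram over B, G, R; after normalization it encodes
    # what fraction of the image each color bin occupies.
    hist = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
    return cv2.normalize(hist, None).flatten()

a = color_signature("query.jpg")
b = color_signature("candidate.jpg")

# Correlation near 1.0 means a very similar distribution of colors.
print(cv2.compareHist(a, b, cv2.HISTCMP_CORREL))
```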

Source: https://habr.com/ru/post/126136/

