
Licenzero: detecting porn by skin color

We continue describing the pornographic video classifier developed by Inventos. (Licenzero, which appears in the title, is not a separate company but a division of Inventos.)

The skin color detector is one of the detectors we use to classify video. It is not as complicated as the motion detector or the fragment detector; one might even call it quite simple. Initially we had a bunch of ideas about using skin color in video, but having tried the simplest approach to classification, we decided (perhaps temporarily) to stop there, because we were fully satisfied with the results.

Defining skin color


We had two tasks: decide which colors count as skin, and then classify video by skin color.

So let's start with skin color. First, we collected a few thousand pictures: both ordinary images and video frames, including pornographic ones, since that is the kind of video we are primarily interested in.

Then, using a simple homemade program:
[Figure: Skin selector]
we marked the areas with and without skin in the pictures. This gave us the coordinates (in RGB) of several million points, each labeled as skin or not skin.
Next came the problem of choosing a color model in which to treat the point coordinates. We chose between RGB, LAB, HSV and YCbCr. After several tests we settled on YCbCr, not least because we wanted to classify by color alone, which let us discard the brightness component Y.
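
To make the conversion concrete, here is a minimal sketch of going from RGB to the (Cb, Cr) pair while dropping Y. The article does not say which YCbCr variant was used; the coefficients below are the full-range BT.601 (JFIF) ones, which is an assumption, and the function name `rgb_to_cbcr` is made up for illustration.

```python
def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to (Cb, Cr), discarding the luma component Y.

    Assumption: full-range BT.601 coefficients as used in JFIF/JPEG.
    The article does not state which variant its authors used.
    """
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, round(v)))

    return clamp(cb), clamp(cr)


if __name__ == "__main__":
    # Any gray pixel has no chroma, so it lands at the center (128, 128).
    print(rgb_to_cbcr(128, 128, 128))
    print(rgb_to_cbcr(255, 0, 0))
```

Since gray, white and black all map to the same (Cb, Cr) point, brightness really does drop out of the classification, which is the property the paragraph above relies on.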

Here is how the points from our pictures are distributed along the Cb axis:
[Figure: Cb]
And along the Cr axis:
[Figure: Cr]
So skin pixels are visibly distinguishable by these two coordinates. Here is the probability (according to our data) that a point with coordinates Cb and Cr is skin:
[Figure: CbCr]
Blue is a probability of 0%, red is 100%. The plot tells us that if we choose, for example, 50% as the skin threshold, we can easily separate skin color (in Cb and Cr coordinates) from all other colors.
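
A probability map of this kind can be estimated by binning the labeled points and dividing skin counts by total counts in each bin. The sketch below is one possible way to do it, not the authors' code; the bin size, the function names and the sample layout `(cb, cr, is_skin)` are assumptions.

```python
from collections import Counter


def skin_probability_map(samples, bin_size=4):
    """Estimate P(skin) per (Cb, Cr) bin from labeled samples.

    samples: iterable of (cb, cr, is_skin) tuples (hypothetical layout).
    Returns a dict mapping (cb_bin, cr_bin) to the share of skin points.
    """
    skin = Counter()
    total = Counter()
    for cb, cr, is_skin in samples:
        key = (cb // bin_size, cr // bin_size)
        total[key] += 1
        if is_skin:
            skin[key] += 1
    return {key: skin[key] / n for key, n in total.items()}


def is_skin_by_map(prob_map, cb, cr, bin_size=4, threshold=0.5):
    """Apply a probability threshold; bins never seen count as non-skin."""
    key = (cb // bin_size, cr // bin_size)
    return prob_map.get(key, 0.0) >= threshold


if __name__ == "__main__":
    labeled = [(100, 150, True), (101, 151, True), (102, 150, False), (60, 60, False)]
    pmap = skin_probability_map(labeled)
    print(is_skin_by_map(pmap, 100, 150), is_skin_by_map(pmap, 60, 60))
```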

We decided not to use an SVM for classification, but simply to define the rectangular area that best classifies skin pixels. A kind of pseudo-SVM with four support vectors, so to speak. This is the rectangle we got:

The black line is our rectangle. The green line is the contour on which the probability that a point is skin equals 50% (red: 90%, blue: 10%). All points whose Cb and Cr coordinates fall inside the black rectangle are treated as skin pixels.
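
The rectangle test itself is just four comparisons. Here is a minimal sketch; the class name `SkinRect`, the helper `rect_accuracy` and especially the bounds in `EXAMPLE_RECT` are placeholders for illustration, since the article's actual rectangle is only shown in a figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkinRect:
    """Axis-aligned rectangle in the (Cb, Cr) plane."""
    cb_min: int
    cb_max: int
    cr_min: int
    cr_max: int

    def contains(self, cb, cr):
        # A point is skin if both coordinates fall inside the bounds.
        return (self.cb_min <= cb <= self.cb_max
                and self.cr_min <= cr <= self.cr_max)


def rect_accuracy(rect, samples):
    """Share of labeled (cb, cr, is_skin) samples the rectangle gets right."""
    samples = list(samples)
    hits = sum(1 for cb, cr, is_skin in samples
               if rect.contains(cb, cr) == is_skin)
    return hits / len(samples)


# Placeholder bounds, NOT the values from the article.
EXAMPLE_RECT = SkinRect(cb_min=77, cb_max=127, cr_min=133, cr_max=173)

if __name__ == "__main__":
    print(EXAMPLE_RECT.contains(100, 150))
    print(rect_accuracy(EXAMPLE_RECT, [(100, 150, True), (60, 60, False)]))
```

A function like `rect_accuracy` could be used to compare candidate rectangles on the labeled points and keep the best one, which is roughly what "best classifies" means here.
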
Here is an example of skin detection by our system:

We stopped our research at this rectangle: all of this is interesting, of course, but we needed to move on to classifying pornography by color.

Classification by skin color


So, we have decided what we will consider skin. Since we work with video data, we moved from the YCbCr color space to YUV (for digital video this is effectively the same thing; the Russian Wikipedia even redirects from the YCbCr page to the YUV page). Working with YUV is extremely convenient here. Not only do we avoid re-encoding the raw video, but we also get four times fewer (U, V) pairs than there are pixels in the frame (if the video is in yuv420p format, where chroma is halved along each axis), so it is savings all round.
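
For reference, a raw planar yuv420p frame stores a full-resolution Y plane followed by U and V planes of (W/2) x (H/2) samples each, so the chroma can be read directly without touching Y. The sketch below assumes such a raw frame is already available as `bytes`; the function names and the rectangle bounds in the demo are hypothetical.

```python
def split_yuv420p(frame, width, height):
    """Split one raw planar yuv420p frame into its Y, U and V planes."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    y_size = width * height
    c_size = (width // 2) * (height // 2)  # each chroma plane is a quarter of Y
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("unexpected frame size for yuv420p")
    y_plane = frame[:y_size]
    u_plane = frame[y_size:y_size + c_size]
    v_plane = frame[y_size + c_size:]
    return y_plane, u_plane, v_plane


def count_skin(u_plane, v_plane, rect):
    """Count (U, V) pairs inside rect = (u_min, u_max, v_min, v_max).

    Returns (skin_samples, total_samples) for the frame.
    """
    u_min, u_max, v_min, v_max = rect
    skin = sum(1 for u, v in zip(u_plane, v_plane)
               if u_min <= u <= u_max and v_min <= v <= v_max)
    return skin, len(u_plane)


if __name__ == "__main__":
    # A tiny 4x2 frame: 8 luma bytes, then 2 U bytes and 2 V bytes.
    frame = bytes([0] * 8 + [100, 10] + [150, 150])
    _, u, v = split_yuv420p(frame, 4, 2)
    print(count_skin(u, v, (77, 127, 133, 173)))  # placeholder bounds
```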

But what about classification? Classification turned out to be even easier than defining skin color. We wondered what would happen if we computed the share of skin color in porn and non-porn videos (that is, the number of skin pixels divided by the total number of pixels in the video). Here is the resulting picture:

These are histograms of the distribution of videos. The Y-axis is the number of videos, the X-axis is the share of skin pixels in the video. The dotted lines show the density curves under the assumption that both distributions are normal, but that is just for illustration.
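
Density curves like the dotted ones can be obtained by fitting a normal distribution to the skin shares of each class. Below is a small sketch using Python's `statistics.NormalDist`; the sample shares are invented purely for illustration and are not the article's data.

```python
from statistics import NormalDist

# Hypothetical skin shares per video; NOT measurements from the article.
porn_shares = [0.35, 0.42, 0.50, 0.47, 0.38]
other_shares = [0.05, 0.12, 0.08, 0.20, 0.10]

# Fit a normal distribution (mean and standard deviation) to each class.
porn_fit = NormalDist.from_samples(porn_shares)
other_fit = NormalDist.from_samples(other_shares)


def density_curve(dist, steps=50):
    """Sample the fitted density on an even grid over the share range 0..1."""
    return [(i / steps, dist.pdf(i / steps)) for i in range(steps + 1)]


if __name__ == "__main__":
    print(porn_fit.mean, other_fit.mean)
    # overlap() gives a rough measure of how separable the two classes are.
    print(porn_fit.overlap(other_fit))
```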

If we measure the share of skin pixels not in whole videos but in porn and non-porn fragments (we have such fragments; we cut them manually while building the motion detector), the results are even better.

Like the rest of our detectors, our skin color detector returns the probability that a fragment is pornographic. This probability is simply a function of the share of skin pixels in the fragment. The function looks approximately like this:

So, to classify a fragment of a video, we compute the share of skin pixels in it and map that share to a probability with this function.
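
Put together, the fragment-level step could be sketched as follows. The exact shape of the article's function is only shown in a figure, so the logistic curve here, along with its `midpoint` and `steepness` parameters and both function names, is purely an assumption standing in for it.

```python
import math


def skin_share(frame_counts):
    """Share of skin samples over a fragment.

    frame_counts: iterable of (skin_samples, total_samples), one per frame.
    """
    skin = total = 0
    for s, t in frame_counts:
        skin += s
        total += t
    return skin / total if total else 0.0


def porn_probability(share, midpoint=0.25, steepness=20.0):
    """Map a skin share to a probability with a logistic curve.

    Assumption: the logistic shape and both parameters are hypothetical
    placeholders, not the function used in the article.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (share - midpoint)))


if __name__ == "__main__":
    share = skin_share([(10, 100), (30, 100)])
    print(share, porn_probability(share))
```

In practice the curve would be fitted to the labeled fragments rather than chosen by hand.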

Source: https://habr.com/ru/post/117040/

