Do you remember the story about the development of video compression formats (this one)?
And how many of the codecs described there do you know personally? Which ones have you tried to write yourself? Which compression algorithms are the most effective?
These and other questions will NOT be covered in this article.
Part zero, getting acquainted
Good afternoon, dear community!
My name is Andrew and I work at Intel. In a series of articles I plan to talk about the approaches we use when testing video codecs in the Intel® Media SDK project.
Part one, introduction
Not so long ago my colleague, Dmitry Serkin, wrote an article about assessing the quality of video codecs (link to ISN). There he casually mentioned the validation process. The purpose of this article is to shed some light on that activity.
Intel® Media SDK allows programmers to use the multimedia capabilities of modern processors (Sandy Bridge, Ivy Bridge) to perform decoding (H.264 (AVC, MVC), MPEG2, VC1, JPEG/MJPEG), encoding (H.264 (AVC, MVC), MPEG2) and some preprocessing operations (for example, resizing or deinterlacing). But the most important thing is that all these actions can be combined into a transcoding chain (conversion of one format into another, without intermediate stages). This is what speeds up, say, converting Full HD video (1920x1080, High Definition) for a phone that supports no more than SD resolution (720x576, Standard Definition).
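Just to illustrate the idea (this is not Media SDK code, and the decode/resize/encode stages below are toy stand-ins), here is a tiny Python sketch of such a chain, where every frame goes straight from the decoder through resizing into the encoder without any intermediate files:

```python
import numpy as np

# Conceptual sketch of a transcoding chain: decode -> resize -> encode.
# The three stages are hypothetical stand-ins, NOT Intel Media SDK calls;
# the point is only that frames flow through the chain one by one,
# with no intermediate file between the stages.

def decode(bitstream):
    """Pretend decoder: yields raw 1920x1080 'frames' one at a time."""
    for _ in range(10):
        yield np.zeros((1080, 1920), dtype=np.uint8)

def resize(frames, width=720, height=576):
    """Pretend preprocessing: naive nearest-neighbour downscale to SD."""
    for frame in frames:
        ys = np.arange(height) * frame.shape[0] // height
        xs = np.arange(width) * frame.shape[1] // width
        yield frame[ys][:, xs]

def encode(frames):
    """Pretend encoder: here it just counts the frames it consumes."""
    n = 0
    for n, frame in enumerate(frames, 1):
        pass
    return n

frames_encoded = encode(resize(decode(b"...")))
print(frames_encoded)  # 10 frames went decode -> resize -> encode in one pass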
And now let's look at all this from a testing point of view: 5 decoders, 3 encoders, and the preprocessing operations, plus the ways of combining them. Don't forget that for every component the content also matters: the resolution of the video sequence, the parameters of the encoded stream (for decoders) and the parameters we want to use (for encoders). We get a lot of options to check. A huge set. This is where those days of validation on far-from-the-weakest machines that Dmitry mentioned come from.
Oh yes, one more thing. The library has two implementations: a “hardware” one, which uses the capabilities of the processor, and a “software” one, for cases when you don't have the right piece of hardware yet but already want to play with the SDK. So all the iterations we counted above can safely be multiplied by two. Naturally, the operating systems differ as well: Win 7 (x86 and x64), and even some that we will not talk about ;)
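To get a feeling for how quickly this grows, here is a little sketch; the lists of components and parameters are purely illustrative, not our real test plan:

```python
from itertools import product

# Illustrative dimensions of the test matrix (not the real validation plan).
decoders = ["AVC", "MVC", "MPEG2", "VC1", "MJPEG"]
encoders = ["AVC", "MVC", "MPEG2"]
vpp_ops = ["resize", "deinterlace", "none"]
implementations = ["hardware", "software"]
oses = ["Win7 x86", "Win7 x64"]
contents = ["SD", "HD", "FullHD"]   # stands in for stream parameters

# Transcoding chains alone: decoder x vpp x encoder x impl x OS x content.
chains = list(product(decoders, vpp_ops, encoders,
                      implementations, oses, contents))
print(len(chains))  # 5 * 3 * 3 * 2 * 2 * 3 = 540 combinations, and this is
                    # before encoder settings and stream parameters come in
```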
Fear has big eyes, as the saying goes, but the work has to be done. To begin with, let's bring some order into that chaos of possible test scenarios. First, we separate the SDK components from each other: there are decoders, there are encoders and there are preprocessing operations. Then we cover each part with the maximum number of meaningful tests. And then we reassemble everything into a single whole, the transcoding chain, and repeat the part of the tests where the interaction of the components matters (identifying such tests is a separate and very difficult task). Now let's go through the details. We'll start with decoders.
Decoders
Here everything is relatively simple: there is a certain set (a conformance set) of encoded video sequences that our implementation of a codec must decode correctly, since we declare support for it. The difficulty is that this set is not always large, while the variety of streams that can be found in the wild is much greater.
nothing prevents us from creating a multitude of test sequences ourselves, using whatever suitable encoder is at hand
Running a whole movie through it is not worth it, though.
mathematical induction in action
Although making sure that the decoder can run not just for 3-10 seconds but for 1.5-3 hours also makes sense, of course.
But we check more than just the codecs' compliance with the standard. We have a whole SDK! It allows us to decode into system memory as well as into D3D surfaces, for example. And various asynchronous modes are supported.
in this case it is a “distant synonym” of multithreading: several processing tasks can be executed simultaneously (one decodes, another preprocesses, and a third is already encoding)
There are also a few more modes related to the details of the library implementation. So here, again, we get many, many options.
After decoding we have a decompressed video sequence whose quality it would be nice to evaluate. But how?
The good old PSNR (Peak Signal to Noise Ratio, wiki)? The computationally heavier SSIM (Structural Similarity, wiki)?
A lyrical digression about metrics: it is very difficult to perform a subjective analysis of every decoded sequence: eyes get tired and blurry, different eyes see differently, and where would we find that many eyes anyway? That is why objective metrics are used, i.e. mathematics is brought in. But even though these metrics carry such a proud name, there are cases when the metric says one thing and the eyes see something completely different.
As for PSNR and SSIM, these are the most common (read: simple and convenient) quality-assessment metrics; they let you estimate the degree of similarity of two decoded images.
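For illustration, here is a minimal PSNR computation for two 8-bit frames (say, Y planes), assuming numpy is at hand:

```python
import numpy as np

def psnr(reference, decoded, max_value=255.0):
    """Peak Signal to Noise Ratio between two 8-bit frames."""
    ref = reference.astype(np.float64)
    dec = decoded.astype(np.float64)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return float("inf")          # bit-exact match
    return 10.0 * np.log10(max_value ** 2 / mse)

# Example: two "frames" that differ by a single pixel
ref = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)
dec = ref.copy()
dec[0, 0] ^= 1                       # nudge one pixel
print(psnr(ref, ref))                # inf: identical frames
print(psnr(ref, dec))                # very high PSNR: almost identical
```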
Well, what do we compare with? Who gets to be the reference? Fortunately, not all codecs are equally flawed.
This is about computation error: codecs that involve non-integer arithmetic can produce results that differ from run to run
For example, H.264/AVC is an integer standard, which guarantees the same decoding result for a given sequence, no matter what. That is wonderful! We decode the sequence with a reference decoder (if there is none, a reference has to be appointed), convince ourselves that the result is correct (in this case by looking through the whole stream), and save it.
storing the entire decompressed sequence is expensive, it takes up too much space; instead you can compute its checksum, for example CRC32
On the next run we simply compare the current result with the reference. If they match, everything is fine; if not, hmm, something went wrong.
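A minimal sketch of such a check, assuming the decoded output is a raw YUV file on disk (the reference CRC value here is, of course, made up); reading in chunks keeps memory use small even for long sequences:

```python
import zlib

def crc32_of_file(path, chunk_size=1 << 20):
    """CRC32 of a (potentially huge) decoded YUV file, read in 1 MB chunks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

# reference_crc was stored when the output of the reference decoder
# was inspected and declared correct (hypothetical value here).
reference_crc = 0x1234ABCD
if crc32_of_file("decoded_output.yuv") == reference_crc:
    print("PASS: bit-exact match with the reference")
else:
    print("FAIL: something went wrong")
```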
If the standard is not integer, the procedure is a bit more involved: we pick a reference decoder and use the PSNR metric to estimate how good (or bad) things are against it. Once our decoder stabilizes, we repeat the procedure described above.
because if the output came out even a little different, it means something in the decoder has changed. And that could be a problem :)
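For a non-integer codec the check might look like this sketch: compare our output against the reference decoder's output frame by frame and fail when PSNR drops below a threshold (the 40 dB threshold and the I420 layout are just assumptions for the example):

```python
import numpy as np

def frames(path, width, height):
    """Yield Y planes from a raw I420 (YUV 4:2:0) file; chroma is skipped here."""
    y_size = width * height
    frame_size = y_size * 3 // 2          # I420: Y + U/4 + V/4
    with open(path, "rb") as f:
        while True:
            buf = f.read(frame_size)
            if len(buf) < frame_size:
                break
            yield np.frombuffer(buf[:y_size], dtype=np.uint8).reshape(height, width)

def psnr(a, b):
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

def compare(reference_yuv, decoded_yuv, width, height, threshold=40.0):
    """True if every frame stays above the PSNR threshold vs. the reference."""
    for i, (ref, dec) in enumerate(zip(frames(reference_yuv, width, height),
                                       frames(decoded_yuv, width, height))):
        value = psnr(ref, dec)
        if value < threshold:
            print(f"FAIL: frame {i} PSNR {value:.2f} dB is below {threshold} dB")
            return False
    return True
```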
And that is all about decoders for now.
