Promising video formats: a new direction



In early September, Intel announced that it had joined the Alliance for Open Media. With this step, we emphasize our support for open formats and direct our efforts toward creating a new generation of video encoding tools. The Alliance for Open Media, a consortium that includes Amazon, Cisco, Google, Intel, Microsoft, Mozilla and Netflix, was created to jointly develop a new generation of video formats that reduce the cost of delivering video to end users by optimizing for new-generation processors.

In this post we look at the current state of video formats and consider the prospects for their development. The article was written by Mark Buxton, Intel Media Product Development Director.

To better understand the recent history of the formats used for video broadcasting, let us recall what video broadcasting really is. Over the past 20 years, it has moved from fixed channels to channel multiplexes and then to packet video streams (often carried over multi-tier networks). All three models now exist side by side. The last two originally relied on scalable video, and in some markets real-time encoding is now applied close to the network edge (for example, to adapt to channel conditions or to the capabilities of the client device). Each of these models requires different quality levels and different algorithms for balancing data rate against video quality.

Beyond this "last mile" of broadcasting, encoding also takes place during video capture, transmission and editing. The drive for the highest possible video quality is not a passing concern: screen resolution, brightness and contrast keep increasing. Higher resolution and greater color depth generally increase the need for compression. Moving to new formats such as HEVC makes it possible to get around bottlenecks in networks and storage systems and to create and deliver high-quality broadcast video to viewers.

Now consider Moore's law and the microprocessor development cycle. The computational cost of our video formats has, almost miraculously, stayed at a stable level from one generation to the next. In fact, this happened largely by chance: HEVC is much more complex than AVC, but algorithmic optimization offsets part of that complexity.

In the time between the last two generations of video encoding formats (AVC to HEVC), the number of processor cores available at the same price has grown significantly. The latest Intel Xeon E5 processor family contains up to 18 cores per device (when AVC appeared in 2003, Intel Xeon processors were single-core). Video encoding "density" took a further leap with the arrival of the Intel Xeon E3 processor family and hardware video encoding blocks capable of delivering the quality needed for broadcasting. The hardware accelerators and software solutions used in client processors evolved into the Intel Quick Sync Video (QSV) hardware blocks, which are available through Intel Media Server Studio. With them, transcoding runs three times faster at higher quality when Intel Core i7-5850 processors using QSV are compared with the same processors running the x264 software encoder.

Both our corporation and our customers in the media and broadcasting industries use more and more formats, from the obsolete MPEG-2, which is still used for traditional set-top boxes, to the previous-generation AVC and the latest VP9 and HEVC for the newest TVs, tablets, phones and entertainment devices.


Intel Core i7-4770 processor: a comparison of performance and quality for two families of video codecs

Until now, encoding has consumed the largest share of resources in the ecosystem. Significant improvements in encoding can therefore help change business models. The most obvious benefit is that our customers will be able to encode more cheaply, improve efficiency and encode more material.

The development of video encoding formats


The most efficient video encoding format today is HEVC. There are several ways to measure video coding efficiency. One widely used method is BD-rate, which is plotted on the vertical axis of the graph above. It combines data rate and video quality into a single metric (the two depend on each other to a certain extent) by comparing the rate-quality curve of the encoder under test with that of a gold-standard reference (here the reference encoding is WG11 HM14).
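To make the BD-rate idea more concrete, here is a minimal Python sketch. It is a simplified variant: the usual Bjøntegaard calculation fits a cubic polynomial to each rate-quality curve, whereas this sketch interpolates the logarithm of the bitrate linearly between measured points and averages the difference over the overlapping quality range. The function names and the sample numbers are hypothetical and are not measurements from this article.

```python
import math


def _interp_log_rate(points, psnr):
    """Linearly interpolate log(bitrate) at a given PSNR on one curve.

    `points` is a list of (bitrate, psnr) pairs sorted by PSNR.
    """
    for (r0, q0), (r1, q1) in zip(points, points[1:]):
        if q0 <= psnr <= q1:
            t = (psnr - q0) / (q1 - q0)
            return math.log(r0) + t * (math.log(r1) - math.log(r0))
    raise ValueError("PSNR lies outside the measured curve")


def bd_rate(reference, test, steps=1000):
    """Average bitrate difference (%) of `test` vs. `reference` at equal PSNR.

    Simplified BD-rate: piecewise-linear interpolation in the log-rate
    domain instead of the cubic fit used by the standard method.
    """
    ref = sorted(reference, key=lambda p: p[1])
    tst = sorted(test, key=lambda p: p[1])
    # Only the PSNR range covered by both curves can be compared.
    lo = max(ref[0][1], tst[0][1])
    hi = min(ref[-1][1], tst[-1][1])
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")
    total = 0.0
    for i in range(steps):
        q = lo + (hi - lo) * (i + 0.5) / steps  # midpoint rule
        total += _interp_log_rate(tst, q) - _interp_log_rate(ref, q)
    avg_log_diff = total / steps
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Hypothetical sample curves: (bitrate in Mbps, Y-PSNR in dB).
reference_curve = [(2.0, 32.0), (4.0, 35.0), (8.0, 38.0), (16.0, 41.0)]
# Made-up test encoder that needs 20% fewer bits at every quality point.
test_curve = [(rate * 0.8, psnr) for rate, psnr in reference_curve]

print(f"BD-rate: {bd_rate(reference_curve, test_curve):.1f}%")
```

A negative result means the encoder under test needs fewer bits than the reference for the same Y-PSNR; with the made-up curves above the result is about -20%.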

Quality in this comparison is assessed with the Y-PSNR metric. Y-PSNR was long considered quite adequate for evaluating video, but with the arrival of the latest generation of video encoding formats it has become less useful. Nevertheless, HEVC is a very good format. You can achieve very high video quality, close to what the "objective" results suggest, if you solve the problems with large blocks. It was developed in an open process with participants from many countries, including educational institutions, government organizations and private companies: hundreds of excellent professionals, among them a few lawyers.
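For reference, the Y-PSNR metric used in this comparison can be sketched in a few lines of Python. This is an illustration only: it assumes 8-bit luma (Y) planes passed in as equally sized lists of rows, and the function name `y_psnr` is hypothetical rather than part of any encoder's API.

```python
import math


def y_psnr(reference, decoded, peak=255):
    """Return the PSNR in dB between two luma planes.

    Each plane is a list of rows, each row a list of integer samples.
    `peak` is the maximum sample value (255 for 8-bit video).
    """
    if len(reference) != len(decoded):
        raise ValueError("planes have different heights")
    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        if len(ref_row) != len(dec_row):
            raise ValueError("planes have different widths")
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical planes: PSNR is unbounded
    mse = squared_error / count
    return 10.0 * math.log10(peak * peak / mse)
```

Because the metric averages squared sample differences over the whole frame, it says nothing about where the errors fall, which is one reason a smeared texture can still score well.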

WebM follows an alternative model. The VP8 codec (the first of the WebM codecs) was originally developed as a proprietary technology. Google acquired it, opened it up and quickly adapted it for streaming video. Google offers the industry free licenses, free open source software and even free hardware resources. VP8 was not and is not a competitor to AVC and HEVC in encoding efficiency for broadcasting, but it has been deployed on a large number of clients with only minor licensing restrictions. The format is most often used for video conferencing, for which it is well suited.

The VP9 format was recently developed as a replacement for VP8 with a similar (free) licensing model. VP9, like HEVC, is a good, modern video codec. Compare the still frames in fig. 1, 2 and 3 below. I wanted to demonstrate the shortcomings of outdated quality metrics, so I used one of the hardest videos for HEVC: crowd_run. It is a complex sequence, since it combines many types of motion, a huge amount of information and textures that cannot be packed into large blocks. On average, over a large volume of material, HEVC produces higher quality than VP9, but not in this case. Here the benefits of VP9 are visible, as they say, to the naked eye.

Like HEVC, VP9 supports increased color depth, an extended color gamut, high resolutions and a wide variety of applications. VP9 is much closer in quality to HEVC than VP8 was to AVC, and I would expect (since the VP9 format is still relatively new) that this gap will shrink further in the future.


VP9 encoding at a data rate of 8.5 Mbps with the --good --cpu-used=0 parameters. Enlarged area of the crowd_run image. Pay attention to the details of the trees. Very good results for a very complex sequence of frames (unfortunately, at this quality level the codec runs two orders of magnitude slower than the others)


AVC encoding at a data rate of 12 Mbps with the veryslow preset. Enlarged area of the standard crowd_run test sequence. Pay attention to how smeared the trees are. Despite this, the PSNR metric for AVC is 2 dB higher (!)


HEVC encoding at a data rate of 7.6 Mbps with the TU4 parameter. Enlarged area of the crowd_run image. There are fewer obvious coding artifacts than with x264, at a much lower data rate, but the quality is lower than that of VP9. (By objective measures, the data rate in this case is 10% lower than with VP9 at the same Y-PSNR.) Interestingly, the software version runs twice as fast as AVC

However, Google was not alone in this field. Other companies that needed to encode video without paying license fees created new video formats of their own. The best known are Daala from Xiph / Mozilla, Thor from Cisco, and the AVS formats (v1 and v2) used in China.

Both models are capable of producing video coding formats of equivalent technical quality. Why, then, did we join the Alliance for Open Media?

We believe, like the other founders of this consortium, that the format that follows HEVC and VP9 should do more than simply push video coding efficiency further. We strive to create technologies that can meet the Internet's growing demand for high-quality video, audio, images and multimedia streaming to all types of devices for all users around the world. Within the Alliance, we are able to combine Thor, Daala and VP10 into a single unified next-generation video format, opening the way for a variety of multimedia solutions.

What, where, when?


If you are hoping to get a new video codec by the end of the year, alas, it will take longer than that. We work quickly, but the current generation of video formats is already far superior to the previous one, and we have invested a lot of effort and money in equipment, software and tools for creating and distributing HEVC (and VP9). It will take considerable time to develop a new video format that clearly surpasses HEVC in quality (so do not wait for the results of our work before upgrading from your AVC codec and getting the benefits of HEVC).

We have no doubt that working together in this direction will let us build an open source project that produces next-generation media formats, codecs and technologies of interest to the general public.

Join us!


Interested parties ask us how they can help. Here is what I would like to see from the wider community, even if you decide not to join the Alliance for Open Media directly.

Source: https://habr.com/ru/post/269825/

