A month ago, Google opened the specification of the VP8 format, which is meant to become the main video format on the web. VP8 is free of patents, unlike H.264, and according to its developers it should outperform its competitor in quality. A promising chart has been on display on the On2 website for a long time. When the codec became publicly available, I wondered whether they had kept that promise.
The comparisons that appeared online after the release were rather superficial. Jason Garrett-Glaser, the developer of x264, is preparing his own subjective comparison of a large number of codecs, VP8 included, but he has not published it yet.

So I decided to make my own small objective comparison.
I started working on it the same week VP8 was released, so all the encoders are about a month old.
Comparison participants
VP8
The reference encoder, version dated May 18, 2010, built with VS2008 in the Release configuration.
x264
The H.264 family is represented by the x264 codec, revision r1602. I think many have already heard of it: it is free, widely available and one of the most advanced encoders. Since VP8 is positioned as a web-optimized codec, it is reasonable to compare it with x264 in the Baseline profile. But I also measured x264 in the High profile to get the full picture.
The main differences of the Baseline profile are the absence of B-frames and of arithmetic coding. VP8 has no B-frames either, but it does have arithmetic coding. More details about H.264 profiles can be found in Wikipedia.
libtheora
It is interesting to see how much we have gained compared with another fully open codec, so a recent build of Ogg Theora takes part in the comparison. A new version of the encoder, codenamed Ptalarbvorm, is currently in development, and a demo build has appeared online. The build version is ptalarbvorm-svn17230; MediaInfo identifies it as libtheora 1.1+ 20100314 (Ptalarbvorm).
To decode the Theora and x264 results I used the AviSynth plugin FFMpegSource2, version 2-2.13. VP8 was decoded with the reference decoder.
Codec Presets
All codecs use a keyframe (I-frame) interval of 250. The settings were chosen to give rate control as much flexibility as possible. For x264 I set the number of reference frames to three, since the competitors do not support more. VP8 is configured according to the recommendations on the webmproject.org site.
x264 High Profile: --keyint 250 --bframes 4 --b-adapt 2 --b-pyramid normal --ref 3 --rc-lookahead 50 --no-psy --partitions all --8x8dct --direct auto --me umh --subme 8 --trellis 2 --no-fast-pskip
x264 Baseline Profile: --keyint 250 --bframes 0 --ref 3 --rc-lookahead 50 --no-psy --partitions all --direct auto --me umh --subme 8 --trellis 2 --no-fast-pskip --no-cabac --profile baseline
VP8: --good --end-usage=0 --undershoot-pct=100 -p 2 --kf-max-dist=250 --drop-frame=0 --resize-allowed=0 --static-thresh=0 --profile=0 --auto-alt-ref=1 --lag-in-frames=16
libtheora: --soft-target --two-pass -k 250 -z 0
Video sequences
For comparison, I took four video sequences, two of them in SD and two in HD.
Toys and calendar

640x352, 250 frames
Smooth motion, lots of detail and complex textures. I liked this video in the previous comparison, so I decided to use it again.
Big Buck Bunny

704x480, 926 frames
A fragment of the 3D-animated cartoon Big Buck Bunny. This is the video I have seen most often in demos of various open codecs, so I could not pass it by =) The source is anamorphic, that is, the proportions are slightly distorted.
Battle

1280x544, 586 frames
A fragment from the movie Terminator. Very dynamic video with a lot of motion and frequent scene changes.
Old town cross

1280x720, 500 frames
Video with smooth motion. There is a lot of fine detail, but also a uniform area (the sky). There is some fine grain as well.
Comparison method
Instead of measuring at a single bitrate, as I did last time, I decided to take several measurements at different bitrates. All the measurements can then be shown on a single graph as a line through the measured points. I chose the bitrate range for each sample based on the visual quality of the results.
Encoding was done in two passes at a target bitrate; the bitrate is, naturally, variable. VP8 has no quality-based mode, in case you were wondering.
To assess quality I used the SSIM metric. Practice shows that it agrees with visual comparison better than PSNR does. SSIM was measured with the MSU Video Quality Measurement Tool.
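For readers unfamiliar with the metric, here is a minimal sketch of the SSIM formula in Python. It is a simplification: it computes one global score over a flat list of luma samples, whereas real tools such as the one used here evaluate SSIM over local windows and average the result. The function name, the sample values and the constants K1 = 0.01 and K2 = 0.03 (the commonly used defaults) are assumptions for illustration, not taken from the measurement tool.

```python
def ssim_global(reference, distorted, max_value=255.0):
    """Single-window SSIM over two equal-length sequences of luma samples."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    mean_r = sum(reference) / n
    mean_d = sum(distorted) / n
    var_r = sum((r - mean_r) ** 2 for r in reference) / n
    var_d = sum((d - mean_d) ** 2 for d in distorted) / n
    cov = sum((r - mean_r) * (d - mean_d)
              for r, d in zip(reference, distorted)) / n
    # Stabilising constants; K1 = 0.01 and K2 = 0.03 are assumed defaults.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mean_r * mean_d + c1) * (2 * cov + c2)
    denominator = (mean_r ** 2 + mean_d ** 2 + c1) * (var_r + var_d + c2)
    return numerator / denominator


# Hypothetical samples: the second list has reduced contrast at the edges.
original = [10, 50, 90, 130, 170, 210]
blurred = [30, 50, 90, 130, 170, 190]
print(ssim_global(original, original))
print(ssim_global(original, blurred))
```

An identical signal scores 1, and any loss of structure or contrast pushes the score below 1, which is why the curves on the graphs approach 1 as the bitrate grows.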
By the way, x264 has a --tune ssim parameter that slightly improves the SSIM score (there is a similar one for PSNR). I did not enable it, since I initially intended to compute both metrics. In the end I did not, because the amount of work doubles while the benefit is small: the results are usually very similar.
This time the bitrate deviation is taken into account: the points on the graphs are plotted at the bitrate that was actually achieved. On some graphs you may notice that the points do not fall exactly on the grid marks.
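A minimal sketch of how the actual bitrate of an encoded stream can be derived from its size. This is an assumption about the calculation, not the author's documented procedure; the function names, the file size, the target bitrate and the frame rate of 25 fps are hypothetical (the article does not state the frame rates of the sequences).

```python
def actual_bitrate_kbps(size_bytes, frame_count, fps):
    """Average bitrate of a stream in kilobits per second."""
    duration_s = frame_count / fps
    return size_bytes * 8 / duration_s / 1000


def deviation_pct(actual_kbps, target_kbps):
    """How far the achieved bitrate is from the requested one, in percent."""
    return (actual_kbps - target_kbps) / target_kbps * 100


# Hypothetical example: a 250-frame clip at an assumed 25 fps.
achieved = actual_bitrate_kbps(size_bytes=1_312_500, frame_count=250, fps=25.0)
print(achieved, deviation_pct(achieved, target_kbps=1000))
```

Plotting each point at its achieved bitrate rather than at the requested one keeps the curves comparable even when an encoder overshoots or undershoots its target.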
Results
VP8 landed between the two x264 profiles, and Theora was far behind them. To keep the difference between x264 and VP8 visible on the graphs, for Theora I had to keep only the two or three measurements at the highest bitrates.
I provide the screenshots for the results as separate links, since a pile of images would only clutter the article. Besides, it is more convenient to compare them by switching between tabs.
Toys and calendar

Here the x264 and VP8 results differ only slightly visually. VP8 preserved textures better than x264 Baseline Profile, and x264 High Profile did slightly better still. Theora's picture is very blurred.
Screenshots:
Source Theora VP8 x264 BP x264 HP
Big Buck Bunny

Here VP8 was visually not much better than x264 BP: which of the two looks better varies from scene to scene. VP8's SSIM was still higher. x264 HP is noticeably better, and Theora is noticeably worse.
An example of a typical situation:
Screenshots:
Source Theora VP8 x264 BP x264 HP
An example of a scene where VP8 clearly differs from x264 BP.
Screenshots:
Source Theora VP8 x264 BP x264 HP
VP8 also has problems with distributing the bitrate across the video, despite two-pass encoding. For example, in this scene the background is static and only the rabbit moves, and the rabbit suffered badly for it. Here even Theora looks better. Such cases are the exception, but they do occur.
Screenshots:
Source Theora VP8 x264 BP x264 HP
Battle

Although all codecs score higher SSIM here, the results on this video look visually worse than on the others. For the most part, the VP8 picture is less blurred than that of x264 BP.
Screenshots:
Source Theora VP8 x264 BP x264 HP
And here is another example where VP8 badly botched a scene.
Screenshots:
Source Theora VP8 x264 BP x264 HP
Old town cross

In this video the visual differences are barely noticeable. The grain is washed away everywhere, and the roof tiles are smeared. The main differences are in how coarsely or accurately the fine details are reproduced. And Theora, well, blurred everything.
Screenshots:
Source Theora VP8 x264 BP x264 HP
Relative bitrate
Numbers and pictures are interesting and entertaining, but there is still a purely practical question: the size-to-quality ratio. There is an easy way to estimate it: draw a horizontal line across the result graphs at a fixed SSIM and read off the bitrate at the intersection points. Yes, this is only a rough estimate, but it is quite informative. It reflects the overall picture, although on other videos of the same type the results will differ somewhat.
As the SSIM level for each graph I took the highest x264 BP result. On the relative bitrate graph, the VP8 result is taken as 100%.
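The procedure described above can be sketched in Python as follows. This is an illustration under assumptions: the measured points are joined by straight segments, SSIM is assumed to grow with bitrate, and all function names and numbers in the example are hypothetical rather than the measurements from this comparison.

```python
def bitrate_at_ssim(points, target_ssim):
    """Bitrate at which the piecewise-linear curve crosses target_ssim.

    points: (bitrate, ssim) pairs; SSIM is assumed to grow with bitrate.
    """
    pts = sorted(points)
    for (b0, s0), (b1, s1) in zip(pts, pts[1:]):
        if s0 <= target_ssim <= s1:
            if s1 == s0:
                return b0
            t = (target_ssim - s0) / (s1 - s0)
            return b0 + t * (b1 - b0)
    raise ValueError("target SSIM lies outside the measured range")


def relative_bitrates(curves, target_ssim, reference="VP8"):
    """Bitrate of each codec at target_ssim, as a percentage of the reference."""
    base = bitrate_at_ssim(curves[reference], target_ssim)
    return {name: 100.0 * bitrate_at_ssim(pts, target_ssim) / base
            for name, pts in curves.items()}


# Hypothetical measurements: (bitrate in kbps, SSIM).
curves = {
    "VP8": [(400, 0.940), (600, 0.955), (800, 0.965)],
    "x264 BP": [(400, 0.930), (600, 0.948), (800, 0.960)],
}
# The SSIM level is the highest x264 BP result, as in the text.
target = max(ssim for _, ssim in curves["x264 BP"])
print(relative_bitrates(curves, target))
```

With these made-up points, x264 BP needs roughly 114% of the VP8 bitrate to reach the same SSIM. Linear interpolation between neighbouring measurements is the crudest reasonable choice, which matches the "rough estimate" spirit of the method.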

The graphs show that the bitrate (and therefore the file size) of x264 BP is 12 to 39 percent higher than that of VP8. On the "real" video with scene changes, x264 BP lost by a smaller margin. The advantage of x264 HP over VP8 ranged from 16 to 33 percent.
Note also that the differences here are measured in percent rather than in multiples, as was the case with Theora.
Encoding speed
Below is the total encoding time for all the sequences used in the test.
x264 High Profile: 20 minutes 12 seconds
x264 Baseline Profile: 10 minutes 21 seconds
VP8: 83 minutes 59 seconds
Theora: 21 minutes 59 seconds
Configuration: Intel Core2Duo T6670 2.2 GHz, 3 GB RAM
Here VP8 was given a huge handicap. I did not try to bring it in line with the others, since this is an alpha version. That said, the encoder has already been optimized, including with SIMD instructions.
The gap could be narrowed by raising the x264 parameters --ref, --bframes, --me-range and --subme, which would also slightly improve quality. But for practical reasons I kept the settings I normally use. This also gave me a rough reference point for VP8 encoding time.
x264 ran in two threads, Theora can only use one, and VP8 also ran in one thread, since the developers recommend setting the thread count to (number of CPUs - 1). But even with perfect multithreading, VP8 would have been four times slower than x264 BP. Are you ready to wait an hour instead of fifteen minutes, or four days instead of one?
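The "four times" figure follows from the timings listed above. Here is the arithmetic as a short Python sketch; the only assumption is that perfect multithreading on this dual-core CPU would exactly halve the VP8 encoding time.

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


vp8_s = to_seconds(83, 59)      # VP8, one thread
x264_bp_s = to_seconds(10, 21)  # x264 Baseline Profile, two threads

# Ratio as measured: VP8 in one thread against x264 BP in two.
measured_ratio = vp8_s / x264_bp_s

# Assumption: ideal scaling to two threads halves the VP8 time.
ideal_two_thread_ratio = (vp8_s / 2) / x264_bp_s

print(round(measured_ratio, 1), round(ideal_two_thread_ratio, 1))
```

As measured, VP8 is about eight times slower than x264 BP; giving it an idealized second thread still leaves it about four times slower.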
I do not know how to measure decoding speed properly. Ideally, several different decoders should be tested on several hardware configurations, including mobile devices. That is a separate, large piece of work that also requires technical resources. Perhaps such measurements will appear once VP8 becomes more widespread. For now there is only the x264 developer's data, according to which VP8 is noticeably inferior. And I trust his results more than the promises of On2 and Google marketers.
Thinking out loud
If we consider VP8 as a codec for the web, it is quite a success. The quality of the results is comparable to codecs of the H.264 standard, everyone is happy with the licensing and patent terms, and thanks to Google it will receive fairly wide support. YouTube is a great launch pad. Chrome, Opera and Firefox together hold a decent share of the browser market. In addition, Adobe promises to add WebM support to Flash, which has an installed base of about 90%, by the end of the year. On top of that there is Chrome Frame for IE users. On mobile devices support will also come first through Flash, and later in hardware. It is unclear, however, what to expect from Apple and its iPhone.
On the technical side, this comparison reveals VP8's problems with rate control. The spoiled scenes are a consequence of bits being distributed poorly across the video, even though it was encoded in two passes. The encoding time also needs to be reduced significantly. Clearly this is only an alpha, and the codec will be refined.
As for patent purity, the question remains open. In a year or two it will be clear whether any claims appear and, if they do, how they are resolved. I also do not understand the general panic about H.264 being royalty-free only until 2016. MPEG-LA should extend that term; why destroy a large user base? The companies behind the standard already make enough money from the devices they sell.
By the way, VP8 is unlikely to displace H.264 anywhere other than the web. In other areas freedom from patents is not a decisive advantage, so torrent users and the media industry are unlikely to change anything. (Although at torrents.ru there were cranks who suggested making DVD rips with Theora, since they believed it was superior in quality to H.264.) Manufacturers of video recording devices (camcorders, cameras, phones and so on) are also unlikely to bother with the new format and, accordingly, with hardware support for VP8 encoding. Digital TV is unlikely to switch to VP8 either: everyone would have to replace their TVs (or set-top boxes), and the broadcasters would also have to replace their encoding equipment (again a matter of hardware support).
And although the H.264 standard was adopted in 2003, it continues to evolve. In 2007 and 2009 two extensions of the format were adopted: Scalable Video Coding and Multiview Video Coding. I am not sure we will see wide use of the first one soon, but the second may well reach the masses in the coming years as 3D video becomes more and more common. Here, too, VP8 finds itself playing catch-up.
Conclusions
The VP8 format is quite suitable for the web. It is much better than Ogg Theora and comparable to H.264 in size-to-quality ratio. The questions of patent purity and decoding speed remain open. Whether the format becomes widespread is a matter of commercial and political success rather than technical success.
Materials
I am posting the source videos and the encoding results.
The VP8 results are raw streams without a container. If someone muxes them properly into Matroska and shares them, that would be great.
Encoding results
Source video