The distant year 1988 was full of remarkable events. It saw the release of Metallica's fourth album, “...And Justice for All”, and the USSR launched the reusable spacecraft Buran on its first and only flight. In the same year the history of video compression began: the very first video codec standard appeared.
The best-known video compression standards exist thanks to two bodies: VCEG and MPEG. They can hardly be called competitors: some standards were issued by each committee on its own, while others were the fruit of their “forbidden love”, that is, of joint work in combined groups. Ironically, it is these “joint” formats that are the most widely used.
1988 - H.261

So, the year is 1988.
H.261 was the first full-fledged video compression format to see wide use. It was a “classic” standard: it worked in the YCbCr color space and was based on a block discrete cosine transform and Huffman coding. Raise your hand if you have ever heard of it. Yet it was in this standard that such concepts as the macroblock, the integer-pixel motion vector and de-blocking (or post-processing) first appeared. And it was then, 23 years ago, that the concept of reference frames appeared. H.261 provided two types of frames: I(ntra), a completely independent frame, and P(redicted), a frame that depends on the previous one. The maximum resolution supported by H.261, CIF (352×288; an example is shown on the left), would not impress even fans of watching video on a phone. And yet, for its time, it was a very progressive, very “advanced” standard. All subsequent video compression standards build on ideas that originated in H.261 and are de facto the result of its evolutionary development.
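To make the “block discrete cosine transform” less abstract, here is a minimal sketch of a floating-point 8x8 DCT-II. It is the textbook orthonormal form, not code from any real H.261 codec; the names `dct_2d` and `flat_block` are made up for this illustration.

```python
import math

N = 8  # H.261 transforms 8x8 blocks of samples


def dct_2d(block):
    """Textbook orthonormal 2-D DCT-II of an N x N block (slow reference form)."""
    def c(k):
        # Normalization factor: the zero-frequency basis function is scaled down.
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2.0 / N) * c(u) * c(v) * s
    return out


# A perfectly flat gray block: all of its energy lands in one coefficient.
flat_block = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
```

For a flat block only the top-left (DC) coefficient is non-zero. Smooth image areas behave almost the same way, which is why the transformed block is so cheap to code with Huffman tables afterwards.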
1993 - MPEG1
In 1993, MPEG1 appeared. B(ipredicted) frames were its revolutionary innovation: a frame could now be predicted not only from the previous reference frame but also from the following one. Half-pixel motion vectors appeared, which increased prediction accuracy and thereby improved quality. The concept of a “slice” was introduced: a part of a frame (a group of macroblocks) that is encoded independently of other slices. It became possible to compress different parts of the frame with different parameters. Most importantly, MPEG1 gained support for very large resolutions, up to 4K by 4K.
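Half-pixel prediction is easy to show in a few lines. The sketch below samples a reference frame at a position given in half-pixel units by averaging the neighboring whole pixels with rounding. The function name `half_pel_sample` and the tiny `ref` frame are my own for this illustration, and bounds checking is left out.

```python
def half_pel_sample(ref, x2, y2):
    """Sample `ref` at (x2 / 2, y2 / 2); coordinates are in half-pixel units."""
    x, y = x2 >> 1, y2 >> 1      # whole-pixel part of the position
    fx, fy = x2 & 1, y2 & 1      # 1 if the position lies between two pixels
    a = ref[y][x]
    b = ref[y][x + fx]
    c = ref[y + fy][x]
    d = ref[y + fy][x + fx]
    # With fx = fy = 0 this returns the pixel itself; with one of them set it is
    # the rounded average of two pixels, with both set the average of four.
    return (a + b + c + d + 2) >> 2


ref = [[10, 20],
       [30, 40]]
samples = [half_pel_sample(ref, 0, 0),   # exactly on a pixel
           half_pel_sample(ref, 1, 0),   # halfway between 10 and 20
           half_pel_sample(ref, 0, 1),   # halfway between 10 and 30
           half_pel_sample(ref, 1, 1)]   # center of all four pixels
```

Here `samples` comes out as `[10, 15, 20, 25]`. An object that moved by half a pixel between frames can thus be predicted much more closely than with whole-pixel vectors.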
For some reason, the MPEG committee dropped the de-blocking stage from the standard. The committee was not swayed even by the significant quality improvement that de-blocking had achieved in H.261. Most likely, the decision was based on data about the typical microprocessor performance of the time. Unlike H.261, the MPEG1 standard consisted of several parts describing everything needed for full-fledged digital video: audio compression, video compression, storage and synchronization of audio and video data, conformance testing tools, and a reference decoder for debugging.
In the early nineties, Intel, and indeed the whole computer industry, hardly had a complete understanding of the impact that video coding would have on processor architecture in the future. Only much later did compression and decompression of digital video become something of an obsession for the company. In the meantime, in March 1993, one of Intel's most famous processors, the Pentium, began its long life. There was nothing special in it to speed up video processing, except perhaps the lonely bsr (bit scan reverse) instruction. This instruction had been around since the days of the 386 and could be used to speed up Huffman decoding. The Pentium's performance was enough to decode H.261 comfortably. But without sound :). I hope some readers still remember how Winamp used to hiccup whenever you moved the mouse.
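Why would bsr help with Huffman decoding? One common trick (an assumption on my part, not a description of any specific decoder) relies on code tables in which longer codes begin with more zero bits: the number of leading zeros in the next chunk of the bitstream then selects the right sub-table in one step. A sketch, with `bsr` and `leading_zeros` as illustrative helper names:

```python
def bsr(value):
    """Index of the highest set bit, as the x86 BSR instruction reports it."""
    if value == 0:
        # The hardware instruction leaves its result undefined for zero input.
        raise ValueError("BSR result is undefined for zero")
    return value.bit_length() - 1


def leading_zeros(window, width=32):
    """Number of zero bits before the first 1 in a `width`-bit bitstream window."""
    return width - 1 - bsr(window)


# A 32-bit window that starts with 15 zero bits followed by a 1.
window = 0x00010000
zeros = leading_zeros(window)
```

Without such an instruction the decoder would have to test the window bit by bit in a loop, which is exactly the kind of branchy code that older processors disliked.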
1996 - MPEG2
In 1996 the MPEG2 standard was published. Very soon DVDs would spread across the planet in millions of copies, making MPEG2 the first truly widespread format for many years to come. MPEG2 brought practically nothing new to the compression process itself, with the exception of interlaced video, support for several audio compression formats and additional chroma formats. MPEG2 was not optimized for low bitrates (below 1 Mbps). At high bitrates, however, MPEG2 confidently outperformed MPEG1, and the standard itself grew to 11 parts.

In early 1997, Intel began selling processors that could already decode video at acceptable speeds. No, nobody was dreaming of HDTV resolutions yet, but the processor could already play small QCIF video without stuttering. The “culprit” was MMX technology. The appearance of the MPEG2 standard and of MMX technology so close together in time was hardly pure coincidence. Most probably it was, as they say now, a product of synergy.
MMX technology consisted of a set of 57 additional instructions and 8 new 8-byte registers. A significant speedup (up to 3–4 times) was achieved because a single instruction processed several data elements at once. In this respect, digital video was an ideal field for the new technology. Great hopes were pinned on MMX, and its name was even placed on the official processor logo.
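The “one instruction, several data elements” idea can be shown with a small emulation. The sketch below mimics what the MMX instruction PADDUSB does: it adds eight unsigned bytes packed into one 64-bit value, saturating each byte at 255. The helpers `pack` and `unpack` are invented for this example, and the Python loop only imitates what the hardware does for all eight lanes in a single instruction.

```python
LANES = 8  # one 8-byte MMX register holds eight byte-sized pixels


def paddusb(a, b):
    """Emulate PADDUSB: per-byte unsigned add with saturation on 64-bit values."""
    result = 0
    for lane in range(LANES):
        shift = 8 * lane
        total = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= min(total, 0xFF) << shift   # clamp instead of wrapping around
    return result


def pack(values):
    """Pack eight byte values into one 64-bit integer (lane 0 is the lowest byte)."""
    return int.from_bytes(bytes(values), "little")


def unpack(register):
    """Split a 64-bit integer back into its eight byte lanes."""
    return list(register.to_bytes(LANES, "little"))


# Brighten eight pixels at once; the bright ones clamp at 255 instead of overflowing.
pixels = pack([10, 20, 30, 40, 250, 251, 252, 253])
brightened = unpack(paddusb(pixels, pack([16] * LANES)))
```

The result is `[26, 36, 46, 56, 255, 255, 255, 255]`. Saturation is what makes this useful for video: a scalar loop would need an explicit comparison per pixel to avoid white turning into black.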
A little later that year, the Pentium II processor came out. Thanks to its superscalar design, large cache, fast bus and a new type of memory, it made it possible to watch DVDs on a personal computer.
1998 - MPEG4
MPEG4, which appeared in 1998, quickly gained fame as a “pirate” format. The DivX codec, which used the MPEG4 format, caused a real sensation. With an acceptable loss of quality, DivX could compress an MPEG2 DVD into a file the size of a CD. I remember how many of my friends rushed to rip DVD movies (where did they get them from???) and build their own collections of DivX movies.
The success of the MPEG4 format had several ingredients: motion vectors became quarter-pixel, which increased prediction accuracy; a macroblock could now contain up to 4 motion vectors, which was useful at the edges of moving objects; and (fanfare!) de-blocking.
The developers of the standard added another interesting thing to MPEG4: intra-prediction. Macroblocks in I-frames could now be “predicted” from neighboring macroblocks, which significantly reduced the size of intra macroblocks in frames with a complex but repetitive structure.
Unfortunately, the compression standard itself, or rather its excessive capabilities, found no warm response among codec makers. Many progressive MPEG4 features, such as 3D video textures, several video planes in a frame, and so on, remained unclaimed.

On the other hand, the greatly increased decoding complexity threw users back into the realm of small resolutions. However, less than a year later the Pentium III, the processor that “accelerates the Internet”, appeared on the market. By the way, the Pentium III coped well with accelerating everything, not just the Internet. At that time it was popular to run Quake 3 Arena on the new processor, which after a system patch gave a significant increase in FPS. From a video coding point of view, the processor brought software prefetching of data into the cache and extended the MMX set with several extremely useful instructions. And although the speedup in video decoding was only 20-30% compared to the Pentium II, it was enough for comfortable viewing of MPEG4 movies.

Fans of Intel products awaited the year 2000 with particular impatience. That year saw the release of the new Intel Pentium 4 processor. It was a great intrigue and a great mystery: the company was preparing to change the processor architecture completely. The NetBurst architecture replaced the seemingly outdated P6 architecture. Although the overall performance of the processor slightly disappointed fans, in digital video processing it was at its best. New instructions and new 16-byte SSE2 registers, clever hardware prediction modes, large read/write buffers, a new cache organization and, a little later, HyperThreading technology: all this breathed new life into the process of optimizing video codecs. The performance gain ranged from 10 to 35%. The Pentium 4 was a playground for experiments. For example, swapping two instructions was equally likely to speed the codec up by 5% or slow it down by 5%. The processor was quite sufficient for decoding both video and audio, and there was still a little performance left over for special effects. The DivX effects tab grew and expanded, and the happy owners of top Pentium 4 models ticked every checkbox in the hope of getting a “just like in the cinema” picture. And speaking of cinema, enthusiasts began to look seriously in the direction of HD resolutions.
Recent history, year 2003 - H.264
The year 2003 can be called epochal in the history of video compression formats: the alpha and omega of today's digital video appeared, the H.264 standard. The new standard was entirely integer-based, i.e. all stages of video decoding were performed in integer arithmetic, so that decoders from different manufacturers produced bit-exact identical video.
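The integer-only idea is easiest to see in the 4x4 core transform that H.264 uses in place of a floating-point DCT. The sketch below applies the well-known forward core matrix to a block; the scaling that the standard folds into quantization is deliberately left out, and the helper names (`matmul`, `transpose`, `forward_core_transform`) are mine, not taken from any codec.

```python
# Forward 4x4 core transform matrix of H.264 (integer approximation of the DCT).
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]


def matmul(a, b):
    """Multiply two 4x4 matrices of Python ints."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a 4x4 matrix."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T using only integer additions and multiplications."""
    return matmul(matmul(CF, block), transpose(CF))


# A flat residual block: everything collapses into the top-left coefficient.
flat = [[10] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)
```

For the flat block, `coeffs[0][0]` is 160 and every other coefficient is exactly 0. No rounding is involved anywhere, so two implementations cannot drift apart the way floating-point DCTs could in older standards.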
H.264 differed from its ancestors in advanced intra-prediction of macroblocks, flexible partitioning of macroblocks for motion compensation (from 4x4 to 16x16), a 6-tap motion compensation filter, advanced arithmetic entropy coding, long-term reference frames, flexible management of reference frames, up to 16 motion vectors per macroblock, support for all available chroma formats, 8 bits or more per color component, and many other magical features. The standard not only left all competitors far behind, but also set new requirements for processor performance. Now a single processor (even with HyperThreading technology) was no longer enough to play HD video.
The period of 2003-2005 was hard for users, who lacked performance, but it was a golden time for software optimizers. Their services were in great demand! CPU performance was clearly in short supply, and something had to be done about it. In May 2005 the solution arrived: for the first time since the Pentium III, multiprocessing returned to user machines. The Pentium 4-based processor codenamed Smithfield (sold as the Pentium D) proudly carried its 2 cores to the masses. In fact, Intel was being a little sly: these were 2 “almost” ordinary Pentium 4 processors placed on the same substrate. The cores could communicate with each other only via the FSB, and they could not “peek” into their neighbor's cache. Nevertheless, Smithfield's performance was enough to bring a smile back to users' faces. Buy popcorn and take your seats in the auditorium. It was only thanks to multi-core that a turning point came in the long battle between processors and digital video formats: processors became able to decode digital video in any format and at any resolution at a speed comfortable for the viewer. But this was only a battle won, not the war.
As we know, digital video can (and should) be not only decoded but, first of all, encoded. And here things were not as rosy as one would like. For full-fledged, fast and high-quality video compression, the processors of the day were sufficient only for standard resolution in the MPEG2 / MPEG4 formats, and no more.

The town of Conroe in southeastern Texas was virtually unknown until the summer of 2006, when new Intel processors with a core of the same name, or rather cores, began to arrive on store shelves. A distant descendant of the Pentium III, the Intel Core 2 processor was designed to replace processors based on the NetBurst architecture and to consolidate the success in video (de)coding. The processor had full-fledged, high-performance cores that could efficiently reach into each other's data through a large shared cache, and new SSSE3 instructions (that is three letters S). Among the new instructions were several aimed at video coding. And although the new processors lost support for HyperThreading, their performance was still so impressive that compressing HD video in real time no longer looked like an impossible task.
However, as already noted, the arrival of multi-core brought victory in a battle, but not in the war. In the fall of 2007, the joint group of committees struck back with a new scalable compression profile for the H.264 standard, Scalable Video Coding (SVC). The complexity of encoding and decoding increased significantly. It was not a full-fledged standard, just an extension on top of the existing one, whose main idea was a significant improvement in the quality of video transmitted over lossy networks. The same movie could now be stored in a video stream at several resolutions, with the higher resolutions using the lower ones as references. This solution had an additional advantage: devices that did not need the film in HD quality could decode only the part of the stream with the resolution they needed.
But nothing could turn the tide back. At the beginning of 2008, Intel consolidated its success with the Penryn processor and its new SSE4.1 instructions. As people would say later, this was the largest SIMD extension since the Pentium III. There were completely new instructions tailored to encoding digital video, as well as new extensions of existing SIMD instructions. Encoding HD video in H.264 format now ran confidently in real time with acceptable quality.
Multiview Video Coding (MVC), a new H.264 profile for encoding video shot from several viewpoints, was released in November 2009, but it could not change anything. The new profile added nothing fundamentally new; it simply described the rules and ways of organizing the bitstream to compress video taken from several cameras. Although processor performance in 2009 was not enough to compress such video in real time, it was a matter of one, at most two, processor generations.

And that is what happened. A processor codenamed Nehalem entered the market, once again giving us the joy of using HyperThreading technology. Among other advantages, the processor carried an integrated memory controller and a ring bus, which was better suited to communication between a large number of fast cores than the outdated FSB. Compressing an HD movie in excellent quality, which used to take all night, now flew by in a matter of hours. The spirit of victory was in the air. However, Intel would not be Intel if it had not struck a beautiful final chord in this fight.

And the chord sounded: in January 2011, sales of processors based on the Sandy Bridge core were announced. Those who have unpacked the processor's signature blue box probably noticed the words Intel Quick Sync in the list of features. This is the name of the hardware video compression technology that is available to every user through the Intel Media SDK. Behind these three simple English words is the work of dozens of the company's engineers and programmers, including mine.
In 2002, I was talking with a colleague about optimization and the multimedia orientation of modern processors (as you recall, that was the time when SSE2 reigned). To some of my arguments, my colleague replied that processors would become multimedia-oriented only when an idct instruction appeared in them. Could we have imagined back in 2002 that only 9 years later reality would exceed far more ambitious plans and expectations?
Now compressing video for your favorite iPad is not a problem. An HD series of 24 episodes, 20 minutes each, is encoded in 20 minutes. A 1.5-hour film is encoded in 5 minutes. No more wasting your time or leaving the computer on overnight. Just go and pour yourself some tea. The processors have won.
PS: I would like to end my story here, but that would be disingenuous. The air smells of a storm again. The point is that the joint group of committees is developing the next standard for digital video compression, High Efficiency Video Coding (HEVC). HEVC will carry the best qualities of the H.264 format, but at the same time it will have a huge number of new features and capabilities that will place increased demands on the CPU. And the struggle between video compression standards and processors may reach a new level. And so on to infinity.
Upd. The penultimate paragraph was about re-encoding HD video for a particular gadget, the iPad. I apologize for not making this clear.
Upd2. Fellow Habr users, let's be a little friendlier and more rational. I am sure that software has evolved over time and that some of the negative effects described in the article are gone. For example, Winamp has stopped “hiccuping” on slow processors.