
FFmpeg-based video player

Hi, Habr!

This article covers the development of a very simple video player using libraries from the FFmpeg project.
I did not find any articles on this topic on Habr, so I decided to fill the gap.
Video decoding will be done with the FFmpeg libraries, and display with SDL.


Introduction


With the help of FFmpeg, you can perform a large number of video processing tasks: encoding and decoding, multiplexing and demultiplexing. This greatly facilitates the development of multimedia applications.
One of the main problems, as with most open source projects, is documentation. There is very little of it, and what exists is not always up to date, because this is a fast-moving project with an ever-changing API. Therefore, the main source of documentation is the source code of the libraries themselves. Of the older articles, I recommend reading [1] and [2]. They give a good general idea of working with the libraries.

FFmpeg is a set of utilities and libraries for working with various media formats. There is probably no point in describing the utilities - you have surely heard of them - but the libraries deserve a closer look: libavformat handles muxing and demuxing of container formats, libavcodec handles encoding and decoding, libswscale handles scaling and pixel format conversion, and libavutil provides common helpers.

We will use SDL for video output. This is a convenient and cross-platform framework with a fairly simple API.

An experienced reader may notice that such a player already exists right in the FFmpeg distribution: its code is in the file ffplay.c, and it also uses SDL! But its code is rather hard for novice FFmpeg developers to follow, and it contains a lot of extra functionality.
A similar player is also described in [1], but it uses functions that have since been removed from FFmpeg or declared deprecated.
I will try to give an example of a minimal and understandable player using the current API. For simplicity, we will display only video, without sound.
So, let's begin.

Code


First of all, we include the necessary header files:
    #include <stdio.h>
    #include <SDL.h>
    #include <libavcodec/avcodec.h>
    #include <libavformat/avformat.h>
    #include <libswscale/swscale.h>

In this small example, all the code will be in main.
First, we initialize the FFmpeg libraries using av_register_all(). During initialization, all file formats and codecs available in the libraries are registered. After that, they will be used automatically when opening files of the corresponding format or with the corresponding codecs.
    int main(int argc, char* argv[]) {
        if (argc < 2) {
            printf("Usage: %s filename\n", argv[0]);
            return 0;
        }

        // Register all available file formats and codecs
        av_register_all();

Now we initialize SDL. The SDL_Init function takes as an argument the set of subsystems to initialize (several subsystems are combined with a bitwise OR). In this example we need only the video subsystem.
        int err;
        // Init SDL with video support
        err = SDL_Init(SDL_INIT_VIDEO);
        if (err < 0) {
            fprintf(stderr, "Unable to init SDL: %s\n", SDL_GetError());
            return -1;
        }
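
For illustration only (not needed in this player), this is how several subsystems would be requested at once; SDL_INIT_AUDIO and SDL_INIT_TIMER are standard SDL 1.2 flags we do not otherwise use here:

        // Hypothetical variant: initialize video, audio and timer subsystems at once
        err = SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER);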

Now we will open the input file. The file name is passed as the first argument on the command line.
The avformat_open_input function reads the file header and stores information about the found formats in the AVFormatContext structure. The remaining arguments can be set to NULL, in which case libavformat uses automatic parameter detection.
        // Open video file
        const char* filename = argv[1];
        AVFormatContext* format_context = NULL;
        err = avformat_open_input(&format_context, filename, NULL, NULL);
        if (err < 0) {
            fprintf(stderr, "ffmpeg: Unable to open input file\n");
            return -1;
        }

Since avformat_open_input reads only the file header, the next step is to get information about the streams in the file. This is done by the avformat_find_stream_info function.
        // Retrieve stream information
        err = avformat_find_stream_info(format_context, NULL);
        if (err < 0) {
            fprintf(stderr, "ffmpeg: Unable to find stream info\n");
            return -1;
        }

After that, format_context->streams contains all the streams in the file. Their number is format_context->nb_streams.
You can display detailed information about the file and all streams using the av_dump_format function.
        // Dump information about file onto standard error
        av_dump_format(format_context, 0, argv[1], 0);

Now we find the index of the video stream in format_context->streams. Using this index we can get the codec context (AVCodecContext), which we will later use to identify the type of each packet read from the file.
        // Find the first video stream
        int video_stream;
        for (video_stream = 0; video_stream < format_context->nb_streams; ++video_stream) {
            if (format_context->streams[video_stream]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
                break;
            }
        }
        if (video_stream == format_context->nb_streams) {
            fprintf(stderr, "ffmpeg: Unable to find video stream\n");
            return -1;
        }
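
As a side note, libavformat also offers av_find_best_stream, which can replace the manual loop above and additionally return a suitable decoder. A sketch, not used further in this article:

        // Sketch: let libavformat pick the "best" video stream and its decoder.
        // Returns the stream index, or a negative error code if none is found.
        AVCodec* decoder = NULL;
        int best_stream = av_find_best_stream(format_context, AVMEDIA_TYPE_VIDEO,
                                              -1, -1, &decoder, 0);
        if (best_stream < 0) {
            fprintf(stderr, "ffmpeg: Unable to find video stream\n");
        }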

Information about the codec of a stream is called the "codec context" (AVCodecContext). Using it, we can find the required codec (AVCodec) and open it.
        AVCodecContext* codec_context = format_context->streams[video_stream]->codec;
        AVCodec* codec = avcodec_find_decoder(codec_context->codec_id);
        err = avcodec_open2(codec_context, codec, NULL);
        if (err < 0) {
            fprintf(stderr, "ffmpeg: Unable to open codec\n");
            return -1;
        }

It's time to prepare a window for video output using SDL (we know the size of the video). In principle, we could create a window of any size and then scale the video with libswscale, but for simplicity let's make the window the size of the video.
In addition to the window itself, we also need an overlay in which our video will be displayed. SDL supports a number of methods for drawing images on the screen, and one is designed specifically for displaying video: the YUV overlay. YUV is a color space, like RGB. Y is the luminance component (luma), while U and V are the chrominance components (chroma). This format is more compact than RGB: some of the color information is discarded, so there may be only one U and one V sample for every 2x2 block of Y samples. The YUV overlay takes an array of YUV data and displays it. It supports 4 different formats, of which the fastest is YV12. There is another YUV format, YUV420P, which is the same as YV12 except that the U and V arrays are swapped. FFmpeg can convert images to YUV420P, and many video streams are already stored in this format or are easily converted to it.
Thus, we will use the YV12 overlay from SDL, convert the video to YUV420P with FFmpeg, and swap the order of the U and V arrays when displaying.
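
To make the layout concrete, here is an illustration (not part of the player) of how the three planes of a YUV420P image are sized, assuming a width w and height h that are both even:

        // Illustration only: plane sizes for a YUV420P image of size w x h.
        // Y holds one sample per pixel; U and V are subsampled by 2
        // both horizontally and vertically.
        int y_size = w * h;              // luma plane
        int u_size = (w / 2) * (h / 2);  // chroma plane U
        int v_size = (w / 2) * (h / 2);  // chroma plane V; total = w * h * 3 / 2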
        SDL_Surface* screen = SDL_SetVideoMode(codec_context->width, codec_context->height, 0, 0);
        if (screen == NULL) {
            fprintf(stderr, "Couldn't set video mode\n");
            return -1;
        }
        SDL_Overlay* bmp = SDL_CreateYUVOverlay(codec_context->width, codec_context->height,
                                                SDL_YV12_OVERLAY, screen);

Pixel format conversion, like scaling, is performed in FFmpeg using libswscale.
Conversion is done in two stages. First, a conversion context (struct SwsContext) is created. Previously, the aptly named sws_getContext function was used for this, but it is now deprecated, and it is recommended to create the context with sws_getCachedContext. We will use it.
        struct SwsContext* img_convert_context;
        img_convert_context = sws_getCachedContext(NULL,
                                                   codec_context->width, codec_context->height,
                                                   codec_context->pix_fmt,
                                                   codec_context->width, codec_context->height,
                                                   PIX_FMT_YUV420P, SWS_BICUBIC,
                                                   NULL, NULL, NULL);
        if (img_convert_context == NULL) {
            fprintf(stderr, "Cannot initialize the conversion context\n");
            return -1;
        }

Well, here we come to the most interesting part: displaying the video.
Data is read from the file in packets (AVPacket), and a frame (AVFrame) is used for display.
We are interested only in packets belonging to the video stream (remember, we saved the video stream index in the variable video_stream).
The avcodec_decode_video2 function decodes a packet into a frame using the codec we obtained earlier (codec_context). The function sets frame_finished to a positive value when a complete frame has been decoded (one frame may span several packets, and frame_finished will be set only when the last of them is decoded).
        AVFrame* frame = avcodec_alloc_frame();
        AVPacket packet;
        while (av_read_frame(format_context, &packet) >= 0) {
            if (packet.stream_index == video_stream) {
                // Video stream packet
                int frame_finished;
                avcodec_decode_video2(codec_context, frame, &frame_finished, &packet);

                if (frame_finished) {

Now we need to prepare the picture for display in the window. First we lock our overlay, since we are going to write data into it. The video in the file can be in any format, while we configured the display for YV12; libswscale comes to the rescue. Earlier we set up the conversion context img_convert_context, and now it is time to apply it. The main function of libswscale is, of course, sws_scale, which performs the actual conversion. Note the mismatched indices when assigning the arrays: this is not a typo. As mentioned earlier, YUV420P differs from YV12 only in the order of the color planes. We configured libswscale to convert to YUV420P, while SDL expects YV12, so here we swap U and V to make everything match.
                    SDL_LockYUVOverlay(bmp);

                    AVPicture pict;
                    pict.data[0] = bmp->pixels[0];
                    pict.data[1] = bmp->pixels[2];  // U and V swapped, because of YV12
                    pict.data[2] = bmp->pixels[1];

                    pict.linesize[0] = bmp->pitches[0];
                    pict.linesize[1] = bmp->pitches[2];
                    pict.linesize[2] = bmp->pitches[1];

                    sws_scale(img_convert_context,
                              frame->data, frame->linesize,
                              0, codec_context->height,
                              pict.data, pict.linesize);

                    SDL_UnlockYUVOverlay(bmp);

Display the image from the overlay in the window.
                    SDL_Rect rect;
                    rect.x = 0;
                    rect.y = 0;
                    rect.w = codec_context->width;
                    rect.h = codec_context->height;
                    SDL_DisplayYUVOverlay(bmp, &rect);

After processing a packet, we must free the memory it occupies. This is done by the av_free_packet function.
                }
            }

            // Free the packet that was allocated by av_read_frame
            av_free_packet(&packet);

So that the OS does not consider our application hung, and so that the application closes when its window is closed, we process one SDL event at the end of each loop iteration.
            // Handling SDL events here
            SDL_Event event;
            if (SDL_PollEvent(&event)) {
                if (event.type == SDL_QUIT) {
                    break;
                }
            }
        }
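
If desired, the handler can be extended with other events; for example, a hypothetical addition inside the same polling block that also quits when Escape is pressed:

            // Hypothetical extension (inside the event-polling block above):
            // also break out of the main loop on the Escape key.
            if (event.type == SDL_KEYDOWN && event.key.keysym.sym == SDLK_ESCAPE) {
                break;
            }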

Well, now the standard cleanup of all the resources we used.
        sws_freeContext(img_convert_context);

        // Free the YUV frame
        av_free(frame);

        // Close the codec
        avcodec_close(codec_context);

        // Close the video file
        avformat_close_input(&format_context);

        // Quit SDL
        SDL_Quit();
        return 0;
    }


Now let's build it. The simplest option with gcc looks like this:
 gcc player.c -o player -lavutil -lavformat -lavcodec -lswscale -lz -lbz2 `sdl-config --cflags --libs` 
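
If the FFmpeg libraries on your system ship pkg-config metadata (an assumption about your setup; adjust package names as needed), an equivalent build line would be:

    gcc player.c -o player `pkg-config --cflags --libs libavformat libavcodec libswscale libavutil` `sdl-config --cflags --libs`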

We run it. And what do we see? The video plays at great speed! To be precise, playback happens at the speed at which frames are read from the file and decoded. Indeed, we did not write a single line of code to control the frame rate - but that is a topic for another article. A lot of things in this code can be improved: for example, adding audio playback, or moving file reading and display into separate threads. If the Habr community is interested, I will cover this in future articles.
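
As a taste of what such frame-rate control could look like, here is a very crude sketch that simply sleeps for one frame duration after each displayed frame. It assumes a constant frame rate taken from the stream's r_frame_rate field; a real player would schedule frames by their packet timestamps instead:

        // Crude pacing sketch (after SDL_DisplayYUVOverlay): sleep for one
        // frame duration. Assumes constant frame rate; real players use PTS.
        AVRational frame_rate = format_context->streams[video_stream]->r_frame_rate;
        if (frame_rate.num > 0 && frame_rate.den > 0) {
            SDL_Delay((Uint32)(1000 * frame_rate.den / frame_rate.num));
        }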

The entire source code.

Thank you all for your attention!

Continued: Finalization of the ffmpeg video player

Links


  1. An ffmpeg and SDL Tutorial (en)
  2. Using libavformat and libavcodec (en)

Source: https://habr.com/ru/post/137793/

