Good afternoon, dear readers. Some time ago I had to work on a simple Windows application that needed to record audio and video from several devices. Specifically, audio had to be captured from six channels of an M-Audio card, and HD video from two AverMedia capture cards that received their signal from video cameras over component input. The application also had to take screenshots from a document camera connected via USB. We decided to write the application in C# and to handle video with the DirectShow.NET library.
Solving this task gave me the idea to write an article and share my experience with video capture. Perhaps someone will find this information useful. If you are interested, read on.
In place of a preface
Although Media Foundation is increasingly used for such tasks, in my opinion the platform is still not widespread, even though Microsoft intends to gradually phase out DirectShow in new versions of Windows, starting with Windows 8. There are also computer vision libraries that support video recording, such as OpenCV or AForge, but simple video capture does not need their powerful processing functionality, and internally such libraries often use DirectShow anyway.
There are plenty of articles and other materials on the Internet about what DirectShow is and how it works, and this article on the topic has already appeared on Habr. So I will try not to show off with terms that I may not fully understand myself, and will look at everything from the practical side: how someone who has never worked with DirectShow can write a video recording application in C#, where to start and where to go next. I will also describe the problems I ran into.
As an example for this article (see the code on GitHub) I will use a simple USB capture card, the EasyCap:

1. Where to start. Requirements, tools and information
The tools you need are:
1) K-Lite Codec Pack Mega and the GraphStudio tool, for quick prototyping of the video capture graph.
2) GraphEditPlus, a commercial equivalent of GraphStudio that can generate C++ and C# code for the designed graph. A 30-day trial version is available; its limitation is that the generated code cannot be copied to the clipboard.
3) A C# development environment; in my case, Visual Studio 12.
4) The DirectShow.net library.
5) The WindowsMediaLib library.
Unfortunately, I could not find a solid, structured guide on the Internet on how to write a video recording application, but several pages were truly invaluable, first of all:
1) A small page whose information became the catalyst for the whole process. It also has clear descriptions of the DirectShow.net classes and interfaces. A very useful page, and many thanks to its author.
2) Open source code like this, which helped me deal with crossbars and other issues.
3) MSDN, which has a whole section dedicated to DirectShow programming.
2. Filters, creating a video graph and visual editor
DirectShow graphs are built from filters that are connected to each other through input and output pins. More on this here.
For the simplest case, such as a built-in webcam, a graph can be constructed in GraphStudio as follows:

Our task, however, requires several filters, and the graph (including recording to WMV) should look like this:

This graph contains the following filters:
SMI Grabber Device (from the WDM Streaming Capture Devices group) is the filter that represents the capture device; it is the source of the video (and audio) streams. In this example, however, the recorded audio comes not from the capture device but from the microphone (the "Microphone ..." filter from the Audio Capture Sources group).
SM BDA Crossbar Filter is the crossbar filter of the capture device; its settings determine which input signal is switched through, the S-Video input or the composite input.
Smart Tee is a splitter with two outputs: the stream from the Capture output is written to the file, and the stream from the Preview output goes to the preview window through the AVI Decompressor filter. Note that the AVI Decompressor -> Video Renderer chain is created automatically when you select Preview -> Render Pin. (There are several renderer filters, and Enhanced Video Renderer is one of the most advanced, but this example uses the default one.)
WM ASF Writer is the filter that provides the easiest way to record video of the required quality in WMV format. The recording quality can be changed, including through a user-defined profile.
Running this graph lets you verify that the video source is recorded correctly.
3. DirectShow.net library and graph transfer to code
3.1. Code Generation in GraphEditPlus
The next step is to convert the resulting graph into code. Here the GraphEditPlus tool is invaluable. In general this graph editor is more convenient than GraphStudio from the K-Lite suite, but its most important feature is the ability to generate code from the constructed graph:

Unfortunately, this tool cannot generate the configuration code for some filters, such as Crossbar or WM ASF Writer, but it is invaluable as a first step.
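For orientation, here is a minimal hand-written sketch of the recording part of such a graph, the kind of skeleton that generated code can be compared against. It is not the code of the article's application: the class and method names are hypothetical, and it assumes the DirectShowLib (DirectShow.NET) assembly is referenced. It leaves out the crossbar, the audio branch, and the WM ASF Writer profile, and it relies on the capture graph builder to insert a Smart Tee when the device has no separate preview pin.

```csharp
using System;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
public static class SimpleAsfRecorder
{
    // Builds: capture device -> WM ASF Writer (file), plus a preview branch.
    // The caller casts the result to IMediaControl and calls Run() / Stop().
    public static IFilterGraph2 Build(DsDevice device, string outputFile)
    {
        var graph = (IFilterGraph2)new FilterGraph();
        var builder = (ICaptureGraphBuilder2)new CaptureGraphBuilder2();
        int hr = builder.SetFiltergraph(graph);
        DsError.ThrowExceptionForHR(hr);

        // Create the capture filter from the device moniker and add it to the graph.
        IBaseFilter source;
        hr = graph.AddSourceFilterForMoniker(device.Mon, null, device.Name, out source);
        DsError.ThrowExceptionForHR(hr);

        // MediaSubType.Asf makes the builder create a WM ASF Writer as the file sink.
        IBaseFilter asfWriter;
        IFileSinkFilter sink;
        hr = builder.SetOutputFileName(MediaSubType.Asf, outputFile, out asfWriter, out sink);
        DsError.ThrowExceptionForHR(hr);

        // Capture pin -> ASF Writer.
        // Assumption to verify: the writer's default profile also expects an audio
        // stream, so connect an audio source or apply a suitable profile before Run().
        hr = builder.RenderStream(PinCategory.Capture, MediaType.Video, source, null, asfWriter);
        DsError.ThrowExceptionForHR(hr);

        // Preview pin -> default video renderer.
        hr = builder.RenderStream(PinCategory.Preview, MediaType.Video, source, null, null);
        DsError.ThrowExceptionForHR(hr);

        return graph;
    }
}
```

Error handling is reduced to throwing on a failed HRESULT; a real application would also release the COM objects it created if building the graph fails halfway.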
3.2. Video application
To repeat, the code of a simple application written specifically for this article can be viewed and downloaded here. I apologize in advance for its shortcomings and its violations of SOLID; it is only a test example.
In this application, the main graph operations (creation, destruction, pin search, crossbar routing, start, stop, and others) are defined in the abstract class VideoCapturebase. Derived classes such as VideoCapturePreview, VideoCaptureAsfRecord, and VideoCaptureScreenshots implement the abstract BuildGraph() method, which constructs the graph by adding the required filters to the chain. The ControlVideoCommon class contains the code that creates the window and binds a graph to it, the operations for stopping and destroying a graph, and several other utility operations.
3.3. Not always obvious moments
3.3.1. Adding devices
If there are several devices of the same type (several identical capture cards, for example), they will have the same GUID but different DisplayName values. In this case, all of these devices need to be found and kept in a list:
private readonly List<DsDevice> _captures = new List<DsDevice>();
Later, when the graphs are created, the devicePath1 and devicePath2 values obtained this way are used.
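The search itself might look roughly like the sketch below. It assumes the DirectShowLib assembly is referenced; the class name, the method name, and the idea of filtering by a fragment of the friendly name are illustrative assumptions, not the article's actual code.

```csharp
using System;
using System.Collections.Generic;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
public static class CaptureDeviceFinder
{
    // Returns every video input device whose friendly name contains nameFragment.
    // Identical cards share a name, so DevicePath is what tells them apart.
    public static List<DsDevice> FindByName(string nameFragment)
    {
        var found = new List<DsDevice>();
        foreach (DsDevice device in DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice))
        {
            string name = device.Name;
            if (name != null &&
                name.IndexOf(nameFragment, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                found.Add(device);
            }
            else
            {
                device.Dispose(); // release the moniker of devices we do not keep
            }
        }
        return found;
    }
}
```

With two identical cards installed, the first two entries of the returned list would supply the two device paths, for example found[0].DevicePath and found[1].DevicePath.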
3.3.2. Crossbar routing
A video capture device may or may not have a crossbar for switching between different types of video input (for example, the AverMedia cards and the EasyCap from this example have one, while a built-in webcam or a BlackMagic capture card does not). The crossbar therefore has to be detected and bound automatically.
To do this, the base class defines the method FixCrossbarRouting(ref IBaseFilter captureFilter, PhysicalConnectorType? physicalConnectorType), which looks for a crossbar and, if one is available, connects it and switches it to the required input type.
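A simplified sketch of such routing is shown below. It is not the FixCrossbarRouting method from the repository: the class and method names are hypothetical, it assumes the DirectShowLib assembly is referenced, and it only handles a single crossbar located directly upstream of the capture filter.

```csharp
using System;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
public static class CrossbarRouter
{
    // Routes the first input pin of the wanted physical type to the video decoder output.
    // Returns false when the device has no crossbar or no matching, routable input.
    public static bool TryRoute(ICaptureGraphBuilder2 builder, IBaseFilter captureFilter,
                                PhysicalConnectorType wantedInput)
    {
        object o;
        int hr = builder.FindInterface(FindDirection.UpstreamOnly, null, captureFilter,
                                       typeof(IAMCrossbar).GUID, out o);
        if (hr < 0 || o == null) return false; // e.g. a webcam: nothing to route

        var crossbar = (IAMCrossbar)o;
        int outputs, inputs;
        hr = crossbar.get_PinCounts(out outputs, out inputs);
        DsError.ThrowExceptionForHR(hr);

        for (int outIdx = 0; outIdx < outputs; outIdx++)
        {
            int related;
            PhysicalConnectorType outType;
            if (crossbar.get_CrossbarPinInfo(false, outIdx, out related, out outType) != 0) continue;
            if (outType != PhysicalConnectorType.Video_VideoDecoder) continue;

            for (int inIdx = 0; inIdx < inputs; inIdx++)
            {
                PhysicalConnectorType inType;
                if (crossbar.get_CrossbarPinInfo(true, inIdx, out related, out inType) != 0) continue;

                // CanRoute returns S_OK (0) only when the connection is possible.
                if (inType == wantedInput && crossbar.CanRoute(outIdx, inIdx) == 0)
                {
                    return crossbar.Route(outIdx, inIdx) == 0;
                }
            }
        }
        return false;
    }
}
```

It would be called before the graph is run, for example as CrossbarRouter.TryRoute(builder, captureFilter, PhysicalConnectorType.Video_Composite). A false result can simply be ignored for devices that have no crossbar.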
3.3.3. Resource Release
If the resources of a graph are not released when it is destroyed, creating another graph instance that uses the same filters will fail. You therefore need to call the DisposeFilters() method, which removes the filters from the graph being destroyed. After some experiments, the following code worked fine:
if (Graph == null) return;
IEnumFilters ef;
var f = new IBaseFilter[1];
int hr = Graph.EnumFilters(out ef);
if (hr == 0)
{
    // Removing a filter invalidates the enumerator, so reset it after each removal.
    while (0 == ef.Next(1, f, IntPtr.Zero))
    {
        Graph.RemoveFilter(f[0]);
        ef.Reset();
    }
}
Graph = null;
3.3.4. Flow configuration (frame rate, resolution, etc.)
Video capture devices can offer several video stream configurations to switch between. For example, an HD camera may deliver either a 640x480 image at 60 frames per second or an HD image at 30 frames per second. Frame rates can even be fractional, such as 29.97 frames per second. To configure these parameters, obtain a streamConfigObject object using the FindInterface method of the ICaptureGraphBuilder2 interface, cast it to the IAMStreamConfig interface, call the GetFormat method to get an object of the AMMediaType type, and read the header:
var infoHeader = (VideoInfoHeader)Marshal.PtrToStructure(mediaType.formatPtr, typeof(VideoInfoHeader));
Then work with its AvgTimePerFrame, BmiHeader.Width, BmiHeader.Height, and other fields.
In the code, this can be seen in the ConfigureResolution and ConfigureFramerate methods of the VideoCaptureAsfRecord class.
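Put together, the steps above could look like the following sketch. It is an illustration under assumptions rather than the repository code: the class and method names are hypothetical, the DirectShowLib assembly is assumed to be referenced, and only the VideoInfo format type is handled. AvgTimePerFrame is expressed in 100-nanosecond units, which is why the frame rate has to be converted.

```csharp
using System;
using System.Runtime.InteropServices;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
public static class StreamFormat
{
    // Converts frames per second to a frame duration in 100-nanosecond units.
    public static long FrameDuration(double framesPerSecond)
    {
        return (long)Math.Round(10000000.0 / framesPerSecond);
    }

    public static void Apply(ICaptureGraphBuilder2 builder, IBaseFilter captureFilter,
                             int width, int height, double framesPerSecond)
    {
        object streamConfigObject;
        int hr = builder.FindInterface(PinCategory.Capture, MediaType.Video, captureFilter,
                                       typeof(IAMStreamConfig).GUID, out streamConfigObject);
        DsError.ThrowExceptionForHR(hr);
        var streamConfig = (IAMStreamConfig)streamConfigObject;

        AMMediaType mediaType;
        hr = streamConfig.GetFormat(out mediaType);
        DsError.ThrowExceptionForHR(hr);
        try
        {
            if (mediaType.formatType != FormatType.VideoInfo)
                throw new NotSupportedException("Only VideoInfo formats are handled in this sketch.");

            var infoHeader = (VideoInfoHeader)Marshal.PtrToStructure(
                mediaType.formatPtr, typeof(VideoInfoHeader));
            infoHeader.AvgTimePerFrame = FrameDuration(framesPerSecond);
            infoHeader.BmiHeader.Width = width;
            infoHeader.BmiHeader.Height = height;

            // Write the modified header back and ask the device to accept it.
            Marshal.StructureToPtr(infoHeader, mediaType.formatPtr, false);
            hr = streamConfig.SetFormat(mediaType);
            DsError.ThrowExceptionForHR(hr);
        }
        finally
        {
            DsUtils.FreeAMMediaType(mediaType);
        }
    }
}
```

A device accepts only the combinations it actually supports, so SetFormat may fail for arbitrary values; a more robust version would first enumerate the supported formats through IAMStreamConfig.GetStreamCaps.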
3.3.5. Screenshots
To take screenshots from a video stream, the class in which the graph is built (VideoCaptureScreenshots) must implement the ISampleGrabberCB interface and its two methods, BufferCB and SampleCB. SampleCB may do nothing, while BufferCB copies the received buffer:
if ((pBuffer != IntPtr.Zero) && (bufferLen > 1000) && (bufferLen <= _savedArray.Length)) { Marshal.Copy(pBuffer, _savedArray, 0, bufferLen); }
and then invokes the handler:
_invokerControl.BeginInvoke(new CaptureDone(OnCaptureDone))
The handler, in turn, should call the SampleGrabber's SetCallback method:
_iSampleGrabber.SetCallback(null, 0);
When the SampleGrabber filter is added to the chain in the BuildGraph method, it has to be configured, and additional settings have to be applied after the remaining filters have been added (it looks like magic, but it does not work otherwise). In the test example, the ConfigureSampleGrabberInitial() and ConfigureSampleGrabberFinal() methods are responsible for this. The initial configuration defines the AMMediaType, while the final configuration sets the VideoInfoHeader and calls two ISampleGrabber methods: SetBufferSamples(false) and SetOneShot(false).
The first disables buffering of the samples that pass through the filter, and the second allows the capture callback to be invoked repeatedly rather than only once.
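The two-stage configuration can be sketched as follows. This is a guess at the shape of such methods, not the repository's ConfigureSampleGrabberInitial() and ConfigureSampleGrabberFinal(): the names are hypothetical, RGB24 is an assumed target format, and the DirectShowLib assembly is assumed to be referenced. One plausible explanation for the "magic" is that the negotiated format can only be read from the grabber once its pins are connected, which happens after the rest of the chain has been built.

```csharp
using System;
using System.Runtime.InteropServices;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
public static class SampleGrabberSetup
{
    // Call right after adding the grabber to the graph, before it is connected:
    // restricts the connection to uncompressed 24-bit RGB video.
    public static void ConfigureInitial(ISampleGrabber grabber)
    {
        var mediaType = new AMMediaType
        {
            majorType = MediaType.Video,
            subType = MediaSubType.RGB24,
            formatType = FormatType.VideoInfo
        };
        int hr = grabber.SetMediaType(mediaType);
        DsUtils.FreeAMMediaType(mediaType);
        DsError.ThrowExceptionForHR(hr);
    }

    // Call once the whole chain is connected: only then is the negotiated format known.
    public static void ConfigureFinal(ISampleGrabber grabber, out int width, out int height)
    {
        var mediaType = new AMMediaType();
        int hr = grabber.GetConnectedMediaType(mediaType);
        DsError.ThrowExceptionForHR(hr);
        try
        {
            var header = (VideoInfoHeader)Marshal.PtrToStructure(
                mediaType.formatPtr, typeof(VideoInfoHeader));
            width = header.BmiHeader.Width;   // needed later to size the frame buffer
            height = header.BmiHeader.Height;
        }
        finally
        {
            DsUtils.FreeAMMediaType(mediaType);
        }

        hr = grabber.SetBufferSamples(false); // frames are delivered through the callback instead
        DsError.ThrowExceptionForHR(hr);
        hr = grabber.SetOneShot(false);       // keep the graph running after the first sample
        DsError.ThrowExceptionForHR(hr);
    }
}
```

With RGB24 frames, a buffer of width * height * 3 bytes is large enough for the array that BufferCB copies into.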
3.3.6. Wmv format, .prx and WindowsMediaLib files
To get acceptable recording quality, the default WMV recording settings have to be overridden.
The easiest way to do this is to create a custom profile file with the .prx extension and redefine in it the parameters responsible for stream quality. An example of such a file in the code is good.prx.
To read profile files and create profiles from them, the ConfigProfileFromFile(IBaseFilter asfWriter, string filename) method uses the WMLib class from the Team MediaPortal project, distributed under the GPL license. Once created, the profile is applied to the ASF Writer using the ConfigureFilterUsingProfile(wmProfile) method of the IConfigAsfWriter interface.
In place of an epilogue, or the Big Problem I had to face
Mpeg4, Layer3, codecs, AVI MUX, and audio/video synchronization
At the very beginning of development, the idea was to record video in Mpeg4 format and sound in Layer3 format, combining them into a single file with AVI MUX, as in the following graph:

where the XVid Codec filter could be replaced by any MPEG-4 video compressor filter. I tried xvid, ffdshow, and some other filters, but after several attempts to make the graph record video it became clear that things were not as simple as they seemed at first glance. The recording would break off some time after it started. The reason, apparently, is that video and audio tracks are not automatically synchronized when they are mixed into the AVI MUX container. Even with the frame rate set correctly, the graph could stop at a random moment and interrupt the recording, and on playback the audio and video were noticeably out of sync.
Unfortunately, I cannot describe a solution to this problem, because I dealt with it in a radical way: by switching to recording in WMV format with the ASF Writer.
If this article is read by someone who has encountered this problem and knows how to handle it, I would be glad to get advice.
Thank you very much for your attention and interest. I hope this article was not deadly boring and that the material will be of practical use to someone.