Microphones, screen sharing and local video: how the Voximplant Web SDK manages media in a browser
Today I will talk about the Hardware module in the Voximplant Web SDK. This module replaces the old audio and video device management system. But first, a few words about device management in the WebRTC stack and why any of this is needed.
It is rare, but end users do sometimes have several microphones. Or several audio outputs: for example, ordinary speakers and Bluetooth headphones, or the earpiece and the speakerphone on a smartphone.
Two cameras, on the other hand, are very common these days. On tablets and phones: rear and front. On laptops: a built-in one and a better external one. And so on and so forth. Users can get very upset if the wrong set of devices is selected by default and they have no way to change it. This is the first use of media management in WebRTC.
The second use is fine-tuning audio and video: echo cancellation, noise suppression, video resolution, frame rate, white balance and whatever else your particular browser supports.
Previously, device management went through a constraints interface: the developer had to spell out every limit and parameter of the media request by hand, for audio and video alike.
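As a rough illustration, a hand-written request in the standard getUserMedia constraint style might look like the sketch below. The types are declared locally so the snippet stands alone, and the helper name buildFullConstraints, the device ids and all numeric values are assumptions made up for this example, not what the old SDK interface required.

```typescript
// Simplified local stand-ins for the browser's constraint types, declared
// here so the sketch compiles without DOM typings.
type NumericRange = { min?: number; ideal?: number; max?: number };

interface SketchAudioConstraints {
  deviceId?: { exact: string };
  echoCancellation?: boolean;
  noiseSuppression?: boolean;
  autoGainControl?: boolean;
}

interface SketchVideoConstraints {
  deviceId?: { exact: string };
  width?: NumericRange;
  height?: NumericRange;
  frameRate?: NumericRange;
}

interface SketchConstraints {
  audio: SketchAudioConstraints;
  video: SketchVideoConstraints;
}

// Hypothetical helper: builds the full request for one microphone and one camera.
function buildFullConstraints(micId: string, cameraId: string): SketchConstraints {
  return {
    audio: {
      deviceId: { exact: micId },
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
    },
    video: {
      deviceId: { exact: cameraId },
      width: { min: 320, ideal: 1280, max: 1920 },
      height: { min: 240, ideal: 720, max: 1080 },
      frameRate: { min: 10, ideal: 25, max: 30 },
    },
  };
}

const fullRequest = buildFullConstraints("mic-id", "camera-id");
// In a browser, an object of this shape would be passed to
// navigator.mediaDevices.getUserMedia(...).
```

Even this short version has a dozen values to get right, and each browser treats some of them a little differently.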
This gives incredible flexibility in setting up audio and video, but there is always a "but". As the statistics from our support requests showed, customers mostly did not use this option, since we are a friendly platform and teach our users to keep things simple. Those who did venture to use it often shot themselves in the leg, just above the knee, because browsers are slightly incompatible with each other in exactly this spot.
Now it is clear how we ended up here. Time to get to the point: a brief look at the module. For detailed reading, I will leave a link to the documentation on our website.
The entry point to this module is StreamManager. For historical reasons [picture of an elephant.jpg], many parts of our Web SDK are singletons, and this class is no exception: you get the instance via the get() function. There is a sea of reasons for this, but that is a story for another time, perhaps.
With StreamManager you can enable or disable local video and get its media streams, and that is basically all. Judging by the external interface, the entry point is not very big.
An interesting fact about local video: the local video itself is not what gets transmitted to the other party. For example, the user can be shown an HD picture while 320×240 is sent, or vice versa if the local video is tucked away in a corner. How to pull off this trick is covered below.
There are 3 events in StreamManager:
DevicesUpdated - fired when the user connects or disconnects a microphone or camera;
MediaRendererAdded - a new local video or screen sharing with a preview has been added;
The getLocalMediaRenderers() function is handy for getting references to the DOM elements of the local video if you did not keep the object yourself.
What settings can we set at all? Let's look at the CameraParams interface:
strict - true tells the browser literally: "I know what I'm doing, don't try to fix anything if something goes wrong!" If false, the browser will humbly correct crooked settings;
frameRate - sets the frame rate directly. This option helps either to save traffic or to improve quality. I would not recommend setting it above 30 or below 10 without first testing the value for each user and providing a fallback: not all cameras work outside this range;
facingMode - true flips the image horizontally. The option is quite niche, because we already flip the picture when displaying local video.
3 more options are responsible for the video size:
you can use either videoQuality (we have also prepared the VideoQuality enum with a set of popular resolutions) or an explicit frame width and height.
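The frame rate advice above can be folded into a small guard. This is only a sketch under assumptions: CameraParamsSketch mirrors just the option names mentioned in the text and is not the SDK's own declaration, and withSafeFrameRate is a hypothetical helper.

```typescript
// Mirrors only the options discussed in the text; not the SDK's own type.
interface CameraParamsSketch {
  strict?: boolean;
  frameRate?: number;
  facingMode?: boolean;
}

const MIN_SAFE_FPS = 10;
const MAX_SAFE_FPS = 30;

// Hypothetical helper: keeps frameRate within 10..30 unless the caller has
// already tested the exact value on this user's camera.
function withSafeFrameRate(
  params: CameraParamsSketch,
  testedOnThisCamera: boolean = false,
): CameraParamsSketch {
  if (params.frameRate === undefined || testedOnThisCamera) {
    return { ...params };
  }
  const clamped = Math.min(MAX_SAFE_FPS, Math.max(MIN_SAFE_FPS, params.frameRate));
  return { ...params, frameRate: clamped };
}

// A request for 60 fps is pulled back to the upper bound of the safe range.
const safeParams = withSafeFrameRate({ strict: false, frameRate: 60 });
```

Clamping is the gentlest possible fallback; a real application might instead run a test with the requested value and only fall back when the camera refuses it.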
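Everything with "Default" in the name deals, as you might guess, with the default settings. They are used when requesting local video and for new incoming and outgoing calls. This is also what makes the promised trick possible: request the local preview under one set of defaults, then switch the defaults before a call starts.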
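The trick of showing one resolution locally and sending another can be modelled in a few lines. Everything in this sketch is a stand-in: ModelCameraManager, setDefaults, getDefaults and requestStream are hypothetical names, and the behaviour they imitate (defaults are read at the moment a stream is requested) is an assumption about the SDK rather than a documented guarantee.

```typescript
interface VideoSize {
  width: number;
  height: number;
}

// Stand-in for an object that holds default video settings (not the SDK class).
class ModelCameraManager {
  private defaults: VideoSize = { width: 640, height: 480 };

  setDefaults(size: VideoSize): void {
    this.defaults = { ...size };
  }

  getDefaults(): VideoSize {
    return { ...this.defaults };
  }
}

// Stand-in for "request a stream now": it snapshots whatever defaults are current.
function requestStream(source: ModelCameraManager, purpose: string) {
  return { purpose, size: source.getDefaults() };
}

const cameraModel = new ModelCameraManager();

// 1. HD defaults while the local preview is requested.
cameraModel.setDefaults({ width: 1280, height: 720 });
const previewStream = requestStream(cameraModel, "local preview");

// 2. Smaller defaults before the call starts.
cameraModel.setDefaults({ width: 320, height: 240 });
const callStream = requestStream(cameraModel, "outgoing call");
```

In this model the preview keeps its 1280×720 snapshot while the call is requested at 320×240; swapping the two setDefaults calls gives the opposite arrangement for a small corner preview.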
The pair with Call in the name is a bit trickier. These functions get and set the settings during a call that is already in progress. When you change the settings, the CallEvents.Updated event fires.
The remaining two CameraManager functions are very powerful, but complex. To begin with, there are three interesting values in the VideoQuality enum:
VIDEO_QUALITY_HIGH;
VIDEO_QUALITY_MEDIUM;
VIDEO_QUALITY_LOW.
A logical question: "Igor! If the Web SDK knows the best, worst and average quality, why is there no method that returns the list of resolutions the webcam supports?!" The catch is that it does not know.
To find out, you literally have to try every possible resolution. Whatever you manage to get is supported. It sounds simple enough, but in practice it takes up to several minutes, so we cannot do it during Web SDK initialization, for example. testResolutions exists so that you can run the check at the most comfortable moment for the user, if you want to run it at all. It is a good idea to save the result of the function to localStorage and load it later via loadResolutionTestResult when the page reloads.
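Caching the test result could be sketched as follows. The assumptions are worth stating loudly: the result is treated as an opaque JSON-serializable value, KeyValueStore is a local interface that window.localStorage happens to satisfy, and the storage key and both helper names are made up for this example.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key for this example.
const RESOLUTION_CACHE_KEY = "camera-resolution-test";

// Saves an already obtained test result, treated as opaque JSON data.
function saveResolutionResult(store: KeyValueStore, result: unknown): void {
  store.setItem(RESOLUTION_CACHE_KEY, JSON.stringify(result));
}

// Returns the cached result, or null if nothing usable is stored.
function loadResolutionResult(store: KeyValueStore): unknown {
  const raw = store.getItem(RESOLUTION_CACHE_KEY);
  if (raw === null) {
    return null;
  }
  try {
    return JSON.parse(raw);
  } catch (err) {
    return null; // corrupted entry: behave as if the test was never run
  }
}

// In-memory store so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}
```

In a page, the flow would be roughly: on load, call loadResolutionResult with localStorage; if it returns a value, hand that value to loadResolutionTestResult; otherwise run testResolutions at a quiet moment and pass what it returns to saveResolutionResult.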
And last but not least, the most popular one: AudioDeviceManager. The class is very similar to CameraManager, only for sound. Your settings are described by the AudioParams interface:
strict - true is responsible for strict adherence to your will;
outputId - selects the speakers. This option works only in Chrome; other browsers ignore it. You can get the list of devices via getOutputDevices;
noiseSuppression - noise suppression. It is enabled by default and works fine. However, part of the voice may be lost if the user works in a room where a lot of people are talking, for example in a call center: the filter can get it slightly wrong and cut out too much. If you run into problems of this kind, set this option to false;
echoCancellation - removes the echo from the microphone. It should be disabled along with the previous option;
autoGainControl - enables the automatic microphone gain control built into the operating system. The option has no side effects, but how well it works depends heavily on the operating system and the hardware. Sometimes it may simply not work and the gain will stay constant, which is not bad either.
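For the call-center case described above, the two filters can be switched off together. A sketch, assuming a locally declared AudioParamsSketch type that mirrors only the option names from the list (with outputId assumed to be a string); forNoisyRoom is a hypothetical helper.

```typescript
// Mirrors only the options listed in the text; not the SDK's own type.
interface AudioParamsSketch {
  strict?: boolean;
  outputId?: string;
  noiseSuppression?: boolean;
  echoCancellation?: boolean;
  autoGainControl?: boolean;
}

// Hypothetical helper: disables noise suppression and echo cancellation as a
// pair, leaving every other option as the caller set it.
function forNoisyRoom(base: AudioParamsSketch = {}): AudioParamsSketch {
  return { ...base, noiseSuppression: false, echoCancellation: false };
}

const callCenterAudio = forNoisyRoom({ autoGainControl: true });
```

Keeping the two switches in one helper makes it harder to disable one filter and forget the other.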