
VoxImplant cloud telephony can receive and place calls across different endpoints: cell phones, SIP, mobile applications, and web pages. You can even call from a cell phone to a web page, which looks fascinating. The cellular side is straightforward, but talking to a browser requires something beyond HTML and JavaScript. That "something" used to be Flash, and we still know how to use it as a fallback. Over the past few years, though, popular browsers have stopped placing calls via Flash and use an HTML5 technology called WebRTC instead. Until recently it was available only in Chrome and Firefox. But everything flows, everything changes, and WebRTC support has appeared in the beta version of Microsoft Edge. Almost. Microsoft has traditionally gone its own way and built an "alternative" implementation called ORTC. How the two differ and what our developers had to go through is described below.
What kind of beast is WebRTC?
WebRTC is a technology exposed through a JavaScript API that lets you do four things:
- Capture video stream from camera and audio stream from microphone.
- Play video and audio (via HTML5 video and HTML5 audio).
- Establish a UDP connection (or TCP, if all else fails) between two browsers, either through an intermediate server or directly, including NAT traversal.
- Stream video, audio and user data over the established connection.
In effect, it replaces Flash for video and audio work and lets you build Hangouts, Skype for Web, and other peer-to-peer video and voice conferencing. No Flash is involved, and the browser shows its own built-in prompt: "give access to your camera and microphone."
The devil is in the details
The biggest challenge when using WebRTC is setting up a connection. The API is tailored to the NAT traversal scenario, where both users have IP addresses like “192.168 ...” and have to juggle UDP packets to trick the intermediate NAT devices and start sending data. There is no “connect” method, even if we want to connect to a server that is guaranteed to have a public IP address. Everything has to be done manually.
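To make the NAT scenario a little more concrete, here is a minimal sketch of what the exchanged addresses look like. During connection setup each side gathers ICE candidates, which are text lines describing an address it might be reachable at. The helper names `parseIceCandidate` and `isPrivateIPv4` are hypothetical and not part of any browser API; the field order is assumed to follow the standard ICE candidate attribute syntax.

```javascript
// Parse one ICE candidate line (the text carried in an "a=candidate:" SDP attribute).
// Assumed field order: foundation, component, transport, priority, address, port,
// the literal "typ", the candidate type, then optional key/value extensions.
function parseIceCandidate(line) {
  const parts = line
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .trim()
    .split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error("Not a valid ICE candidate line: " + line);
  }
  const candidate = {
    foundation: parts[0],
    component: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7], // host, srflx, prflx or relay
  };
  // Remaining tokens come in key/value pairs, e.g. "raddr 192.168.1.10 rport 50000".
  for (let i = 8; i + 1 < parts.length; i += 2) {
    candidate[parts[i]] = parts[i + 1];
  }
  return candidate;
}

// True for the RFC 1918 private IPv4 ranges, such as 192.168.x.x.
function isPrivateIPv4(address) {
  const m = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(address);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}
```

A candidate of type `srflx` carries the public address seen by a STUN server, and its `raddr` field typically holds the private address behind the NAT. Passing that `raddr` value to `isPrivateIPv4` shows whether the peer is in the “192.168 ...” situation described above.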
The second difficulty is codecs. Capturing and compressing video, sending it over the network, and playing it back are interrelated processes with many nuances. In a call between two browsers, especially different ones, the sides must agree on a codec, estimate the network bandwidth, and adjust the bitrate and video resolution. Video and audio can also be switched on and off mid-call, and you can intervene in the process and force a bitrate.
WebRTC is also tightly coupled to SDP, an ancient text protocol used in VoIP telephony and compatible with SIP. If you need to intervene in the negotiation, for example to set a fixed bitrate, you have to parse and modify this text.
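As an illustration of what "parse and change this text" can look like, here is a minimal sketch that caps the bitrate of one media section by inserting an SDP bandwidth line. The function name `setMediaBitrate` is hypothetical. The sketch assumes that every media section carries its own `c=` line, and that the receiving browser honours the `b=AS:` modifier (value in kilobits per second); both are assumptions to verify against the browsers you target.

```javascript
// Insert or replace a "b=AS:<kbps>" line in the media section of the given kind
// ("audio" or "video"). All other lines are passed through unchanged.
function setMediaBitrate(sdp, kind, kbps) {
  const lines = sdp.split(/\r\n|\n/);
  const out = [];
  let inTarget = false; // are we inside the "m=<kind>" section?
  let inserted = false; // has the new b= line been added to this section?
  for (const line of lines) {
    if (line.startsWith("m=")) {
      inTarget = line.startsWith("m=" + kind + " ");
      inserted = false;
    }
    // Drop an existing bandwidth line so the section never ends up with two.
    if (inTarget && line.startsWith("b=AS:")) continue;
    out.push(line);
    // In SDP ordering, "b=" follows the "c=" connection line of the section.
    if (inTarget && !inserted && line.startsWith("c=")) {
      out.push("b=AS:" + kbps);
      inserted = true;
    }
  }
  return out.join("\r\n");
}
```

In a browser, a function like this would typically be applied to the SDP string of an offer or answer before it is handed back to the peer connection or sent to the other side.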
There is no WebRTC in Edge!
Microsoft found the WebRTC API too complicated for JavaScript developers and implemented an alternative, Object Real-Time Communications (ORTC). At the protocol level, ORTC works in much the same way as WebRTC, but the JavaScript API exposed by the browser was written from scratch in an object-oriented style. SDP no longer sticks out and no text needs to be parsed: everything is controlled through objects and their fields.
ORTC was added to the WebRTC standard, and other browsers have started to implement it; Firefox already has a partial implementation. It all sounds interesting and promising, until we find out that ...
ORTC is not implemented anywhere
Edge contains an incomplete implementation of a year-old version of ORTC, and at the moment no "full" implementation of ORTC exists anywhere. WebRTC, by contrast, has been available in Chrome and Firefox for many years.
By the way, there are no working polyfills either (emulators of the WebRTC API on top of the browser's ORTC API). That is, they exist, but they are not ready for production use and do not work beyond the demo. And a polyfill is exactly what we developed, because writing one is much easier than rewriting a working, debugged SDK to support two fundamentally different APIs.
And in Edge, ORTC is not fully implemented
This was the most painful part. The ORTC implementation currently available in the beta seems to have been created for Skype for Web. Good documentation lets you quickly put together a voice or video call from Edge to Edge. But once you call Firefox or your own server, nuances begin to emerge.
The ORTC standard supports “Trickle ICE”, which speeds up connection setup. Edge even has the corresponding methods, but nowhere is it documented that they cannot be used in this scenario. Many things are implemented in ways that are incompatible with Chrome and Firefox, for example ICE authentication, or codecs that share a name but use a different payload type.
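The payload type mismatch can be handled by matching codecs on their name and clock rate rather than on the payload type number. The sketch below is one possible way to do it, not the approach our SDK necessarily takes. The codec objects use a simplified, hypothetical shape (`name`, `clockRate`, `channels`, `payloadType`), and the function names are made up for illustration.

```javascript
// Key that identifies a codec independently of its payload type number.
function codecKey(codec) {
  return codec.name.toLowerCase() + "/" + codec.clockRate + "/" + (codec.channels || 1);
}

// Returns a Map from local payload type to remote payload type for every codec
// that both sides support. Codecs known to only one side are left out.
function buildPayloadTypeMap(localCodecs, remoteCodecs) {
  const remoteByKey = new Map();
  for (const codec of remoteCodecs) {
    const key = codecKey(codec);
    // Keep the first remote entry if the same codec is listed more than once.
    if (!remoteByKey.has(key)) remoteByKey.set(key, codec.payloadType);
  }
  const map = new Map();
  for (const codec of localCodecs) {
    const remotePayloadType = remoteByKey.get(codecKey(codec));
    if (remotePayloadType !== undefined) {
      map.set(codec.payloadType, remotePayloadType);
    }
  }
  return map;
}
```

With such a map, the local side can rewrite its codec parameters to use the numbers the remote side expects before starting to send media.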
There are no fallbacks. If you take a step to the right or left, for example by creating a receiver without data and passing it to connect, all you get is an error code and nothing else. Until recently these codes did not even have descriptions, and the only way to find out what they meant was to ask Microsoft. A brief description of the return codes has since been published and life has become a little easier, but the API still assumes a single “only correct” use case and severely punishes any attempt to deviate from it.
And then there are codecs!
Codecs for video and audio are a separate pain. Traditionally, WebRTC uses H.264 and VP8 for video and Opus and G.711 for audio. Edge offers only the minimum set of codecs needed for Skype: H.264UC for video (inherited from Microsoft Lync), G.711 and, until recently, its own implementation of Opus for audio. The good news is that they recently added the “regular” Opus and promise to add support for VP9. The bad news is that VP9 has not been added yet. So audio can already be exchanged between different browsers, but video will have to wait a bit.
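The situation described above can be checked mechanically by intersecting the codec lists of the two sides. The sketch below uses the codec names from this section as plain labels; the `commonCodecs` helper and the object layout are assumptions made for illustration, and real capability objects carry more fields than a name.

```javascript
// For each media kind, keep the codecs from `mine` that `theirs` also lists.
// Names are compared case-insensitively.
function commonCodecs(mine, theirs) {
  const result = {};
  for (const kind of Object.keys(mine)) {
    const theirNames = new Set((theirs[kind] || []).map((n) => n.toLowerCase()));
    result[kind] = mine[kind].filter((n) => theirNames.has(n.toLowerCase()));
  }
  return result;
}

// Codec sets as listed in the text above.
const edge = { audio: ["Opus", "G.711"], video: ["H.264UC"] };
const others = { audio: ["Opus", "G.711"], video: ["H.264", "VP8"] };

const shared = commonCodecs(edge, others);
console.log(shared.audio); // both audio codecs are shared
console.log(shared.video); // empty: no common video codec yet
```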
Light at the end of the tunnel
In fact, things are not so bad: our developers quickly built an Edge SDK, which we plan to offer you once the corresponding version of the browser is released. The good news is that WebRTC (or is it ORTC already?) is being developed and supported by almost all browsers, with the exception of Safari. Rumor has it that Apple is hiring developers to work on WebRTC, and a first implementation has appeared in WebKit nightly. It is time to abandon Flash, not only for playing video but also for calls.