
RTCKit: Voice and video chat API in the browser


SIP is currently the most popular IP telephony protocol. It interoperates with most software and hardware phones and is supported by many services. There are several decent implementations of the protocol stack in C (PJSIP, Linphone) and Python (B2BUA, p2p-sip), which make it easy to embed voice and video communication into a desktop or server application.

Problem


The situation with web applications is quite different: today's browsers do not yet support telephony without additional plug-ins. Work in this direction is certainly underway. There is some hope for the WebRTC project, backed by Google and the W3C, but unfortunately even that is not a panacea. First, the prospects for its support in Internet Explorer are very unclear, and second, it does not support the SIP protocol. Besides, this technology is still a matter of the future. So what do you do if you want to embed SIP telephony in a web application right now?
First, let us define the requirements. From a real-time communication technology inside the browser, we want the following:


Current state of affairs


A careful look at the current state of web browsers shows that only one technology currently meets these requirements: Adobe Flash. The technology is proprietary and not without quirks, but over the years Adobe has brought it to a more or less decent state. A huge amount of content on the web requires Flash Player, so most users have it installed.

Flash is a browser plug-in implemented as an ActiveX control for Internet Explorer and as an NPAPI plug-in for all other browsers. The plug-in loads SWF files and executes the bytecode they contain. Most importantly, Flash Player can access the user's sound card and webcam, capture sound and images from them, and encode them with modern audio and video codecs. For example, the Speex/16000 codec suits us very well in its balance of sound quality and compression ratio.

Unfortunately, Flash does not allow direct use of the standard TCP and UDP protocols, on which a SIP client could be built. Instead, it offers its own protocols, RTMP and RTMFP, for carrying voice and video data. The first is ruled out immediately, since it is built on TCP, but RTMFP is just what we need: it runs on top of UDP, which means it achieves minimal delays and tolerates network disruptions.

With the technology chosen, implementation questions arise. The client side looks uncomplicated: you need to write a Flash application that is embedded in a web page and implements two-way communication over the RTMFP protocol. This requires some development in ActionScript (essentially JavaScript with support for classes and modules).

On the server side, however, RTMFP has to be converted to SIP. At the time of writing, none of the open-source projects (red5, rtmplite and others) supported this feature. Even the commercial Adobe Flash Media Server, combined with Flash Media Gateway, supports only RTMP <-> SIP conversion, not to mention that Adobe's server products are not very affordable.

Solution


All of this takes a lot of effort, and all for the sake of one seemingly simple feature: voice and video communication in your application. So we had the idea of building a cloud service that takes all these difficulties upon itself. Drawing on our many years of experience working on talkpad.ru, we created an API for communication inside web applications: RTCKit.com


The service lets you embed an invisible Flash component, WebPhone, into your application and control it through an intuitive JavaScript API. WebPhone sends voice and video data to our cloud, we convert it to SIP and back, and as a result you get the full range of modern IP telephony services.

How can this be used? For example, if you are a SIP telephony provider, you can let your subscribers place calls directly from your website without installing anything. Or you can use RTCKit to connect to your PBX and hold a corporate conference call right in the browser.

Usage example


Consider the following use case. Suppose you want to build your own click2call service like Zingaya and offer it to online stores that want customers to be able to call them directly from their website, at the store's expense, without installing additional software. No problem, here is how it is done:

1. To terminate voice calls to landline and mobile phones, register an account with talkpad.ru, sipnet.ru or any other SIP provider.

2. On the web server, place a page that loads the RTCKit JavaScript library and contains the following code. Replace <username> and <password> with the account credentials from the previous step.

<head>
  <script src="http://rtckit.com/api/swfobject.js"></script>
  <script src="http://rtckit.com/api/rtckit.js"></script>
  <script>
    window.addEventListener('load', function() {
      // Once connected to the RTCKit cloud, register with the SIP provider.
      RTCKit.webPhone.onConnectStateChanged = function(connectInfo) {
        if (connectInfo.connected) {
          RTCKit.webPhone.register({
            registrar: 'talkpad.ru',
            username: '<username>',
            password: '<password>'
          });
        }
      };
      // Once registered, place the call.
      RTCKit.webPhone.onRegStateChanged = function(regInfo) {
        if (regInfo.registered)
          RTCKit.webPhone.call({'uri': 'sip:<number>@talkpad.ru'});
      };
      // Embed the Flash component into the page and start connecting.
      RTCKit.webPhone.embed({container: 'container'});
      RTCKit.webPhone.connect();
    }, false);
  </script>
</head>
<body>
  <div id='container' style='width: 217px; height: 140px'></div>
</body>


3. Make the page open in a pop-up window when the "call" button on the online store's page is clicked, and replace <number> with the sales department's telephone number in international format (for example, 74951234567). The click2call service is ready!
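The three steps above can be tied together with a few small helpers. What follows is only a sketch under stated assumptions, not part of the RTCKit library: the function names (normalizeNumber, toSipUri, numberFromQuery, startClickToCall) and the "number" query parameter are hypothetical, and the only RTCKit calls used are the ones already shown in step 2 (embed, connect, register, call and the two state callbacks). The idea is that the store page opens the pop-up with the number in the query string, and the pop-up page reads it and starts the connect, register, call sequence.

```javascript
// Hypothetical helpers for the click2call page; not part of RTCKit itself.

// Keep digits only, so "+7 (495) 123-45-67" becomes "74951234567".
function normalizeNumber(raw) {
  const digits = String(raw).replace(/\D/g, '');
  if (digits.length === 0) {
    throw new Error('phone number contains no digits');
  }
  return digits;
}

// Build a SIP URI in the form used in step 2: sip:<number>@<registrar>.
function toSipUri(rawNumber, registrar) {
  return 'sip:' + normalizeNumber(rawNumber) + '@' + registrar;
}

// Read the number from a query string such as "?number=74951234567".
// The "number" parameter name is an assumption of this sketch.
function numberFromQuery(search) {
  const value = new URLSearchParams(search).get('number');
  if (value === null) {
    throw new Error('missing "number" query parameter');
  }
  return value;
}

// Wire the connect -> register -> call sequence onto a webPhone object
// that exposes the same methods and callbacks as in step 2.
function startClickToCall(webPhone, options) {
  webPhone.onConnectStateChanged = function (connectInfo) {
    if (connectInfo.connected) {
      webPhone.register({
        registrar: options.registrar,
        username: options.username,
        password: options.password
      });
    }
  };
  webPhone.onRegStateChanged = function (regInfo) {
    if (regInfo.registered) {
      webPhone.call({ uri: toSipUri(options.number, options.registrar) });
    }
  };
  webPhone.embed({ container: options.container });
  webPhone.connect();
}
```

In the pop-up page this would presumably be called from the load handler as startClickToCall(RTCKit.webPhone, { registrar: 'talkpad.ru', username: '<username>', password: '<password>', container: 'container', number: numberFromQuery(window.location.search) }). Passing the webPhone object in as an argument, rather than referring to the global directly, also makes it possible to exercise the sequence with a stand-in object outside the browser.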

You can try registering with a SIP provider and placing a call directly in your browser on the test page, without setting up an RTCKit account. The page already contains test credentials, or you can use your own.

In conclusion, WebPhone is only the first step toward a modern cloud-based API that covers a variety of telephony needs. We are developing several interesting new features for RTCKit and will write about them as soon as they are ready.

UPDATE: a problem that prevented the microphone from working on Linux has been found and fixed. It should now work everywhere.

UPDATE: we now also support video calls in the browser. You can read about it here .

Source: https://habr.com/ru/post/131575/

