WebRTC or how I taught our CRM to call phones

The company where I worked sells services online. Every morning the shift on duty goes through the backlog of accumulated orders and starts calling customers to clarify the details. During the day, operators also take incoming calls. Before I started this project, they used a desktop SIP client for calls.

This dialer was installed on each employee's computer, where it received calls and placed them as needed. To change any setting, someone had to go around to every machine and do it by hand. And if an employee worked remotely, they had to be talked through the change over the phone, which was often quite difficult.
But the main problem was the lack of integration with our web system and database. Seemingly simple tasks, such as opening a client's card on an incoming call, saving call statistics for each employee, and monitoring operator activity from the administrative web interface, are very hard to do with a desktop softphone, even one that can integrate with the browser, for example through plugins.

So the idea arose to combine all internal work and calls in one system and one database. I spent a long time extending our CRM with a built-in dialer that records conversations.
To implement the calls I considered a number of technologies and concluded that there are not many to choose from. There were a couple of open-source and commercial implementations, plus several SaaS services that did not fit because our internal security policy requires calls to be handled through our own server.

At first I tried sipml5.

The documentation had to be pieced together from around the web. In the end I got a more or less working phone with a SIP stack on the browser side.

Installation, testing and configuration took about two weeks. Along the way I found a number of small but annoying bugs that could not be worked around; one of them, for example, involved WebSocket settings over SSL. And after Chrome 35 was released, the web phone stopped working altogether.

In addition, I did not want to expose SIP accounts to operators, while a browser-side SIP stack implies using the credentials in the open and sending them over WebSockets. Even if the WebSocket connection runs over SSL, a potential attacker can debug the JS code and extract the SIP password. There was an option to delegate SIP Digest authentication to our web server, but I never got around to implementing it.

This is what browser-side SIP requests look like in the debug console:

Full access to the SIP stack from JavaScript is not a bad thing in general. It has its advantages: for example, you can try to fix an integration bug in the SIP signaling right in JS. But there is a catch. Slightly more than 90% of SIP vendors do not currently support RFC 7118 (SIP over WebSocket, dated January 2014), which means that the webrtc2sip module has to work as a stateful SIP proxy and effectively duplicate SIP stack support on the server side. Such an arrangement looked very hard to develop further and to maintain, so I decided to give up the browser-side SIP stack and look for a simpler and clearer API for these tasks, with a server part that we could host ourselves.

As a result, I started testing Web Call Server. It is not SaaS and lets you handle calls through your own server, which is exactly what was required in this case.

Functionally it is about the same as sipml5: the same WebRTC calls to SIP and back. It also supports Flash, but that was not needed, since the operators mostly use Chrome and Firefox, and those who used IE had to switch to more “correct” browsers.

It comes bundled with an open-source JS softphone that can be restyled and adapted to a web page.

The main difference from sipml5 is that the browser talks to the server through an API rather than through SIP over WebSockets. That is, there is no SIP stack on the browser side; it lives only on the server. This made the front-end developer's job a bit easier: the browser-side SIP stack had confused him, whereas working with a JavaScript API and CSS let him focus on the interface.

So, here is how I implemented all this.

1. Rented a server on Amazon EC2.
Not much memory or disk space is needed, except perhaps for logs. CPU power, however, can matter for this kind of task, so I did not pick the weakest instance.

2. Set up Apache for the web interface, then installed and started the WCS server.

3. A standard web phone, whose code is on GitHub, appeared on the page in Chrome.
I did not really like the phone's interface and decided right away to redesign it, but the debugging console on the right turned out to be quite useful. It is a pity that it later had to be removed so as not to scare ordinary users.

4. Tested the web phone's ability to make calls, using our existing SIP accounts. Everything works as it should: calls to mobile phones and to SIP phones, call hold and transfer, and blackjack and ...

Calls from a mobile phone work the same way.

5. Adapted the web-phone code to my web CRM and redesigned its interface.

The adaptation deserves a closer look, since it went beyond redrawing the design.
The first serious task was automatic registration of the web phone on the SIP server. Otherwise the operator would have to enter a SIP login and password right after entering the login and password for the CRM. The question was how to integrate the two.

It turned out that the API has a special function loginByToken for this:

function loginByToken(token) {
    trace("Phone - loginByToken " + token);
    connectingViewBeClosed = false;
    // Hand the server URL, the encrypted token and the page URL to the API
    var result = flashphoner.loginByToken(flashphonerLoader.urlServer, token, document.URL);
    closeLoginView();
    openConnectingView("Connecting...", 0);
}

It took some effort to understand how this function works.
With the help of the documentation and examples, I worked out that it goes like this:

1) The token is created on the CRM side by AES-encrypting a string that includes the SIP login and password along with other required information.

The encryption key is known only to our server, where the CRM is deployed, and to the WCS server. In addition, the token's lifetime is set by a special attribute, expires, so that the token cannot be reused.
The token is encrypted with AES in CTR mode. Below is an openssl example that generates an encrypted token carrying a SIP password:

  echo -ne '<root status="ok" description="test" registerRequired="true" login="user5" authenticationName="user5" password="password" outboundProxy="proxy.my" domain="proxy.my" port="5060" visibleName="AAA" api_key="App1" expires="1394839040761100000"/>' | openssl enc -aes-128-ctr -nosalt -K 8263D535FFFFFFFF7B0F60 -iv 00000000000000000000000000000000 | xxd -p

The result looks something like this:

  CRM:cf4693eedaafda1390b261dcf29d45bd3556d64b1f69cd84db8c3ac8721e7e139b80be75e39da18154e897596e9317084faee0d24d6a6197b62a93a2647b263059167b2664179a5866738260c77372e04fe22104ebe1c7530e9215f50d111fd24384755d28d06673e866159c0b6b83289c045619e8481f9c2a6b56b182f393a7dea06b38b7856436895402a5b40f0525a17822ae0f3204b606e4f0169d1ca9176e8e1b696683d12c7db8208946c204e94f3c8ff285f2bcef4ca9b12187cf541ce37d508d3663ef65f944b01db9aea5c0f10002a376d051cbf1b19bc34f76b6d2a4e1ad1450ae412b51b3af1d3860167f5416b3d2c9eeff94d60b82279e8685beb543893e8a09dee640d7366e478d0d1ee7368e0b63b511

The part before the colon, “CRM”, is the name of our application; the part after it is the token generated above.
I put this token into the web phone's flashphoner.xml config in the following form:

 <token>CRM:cf4693eed...</token>

With this in place, automatic registration by token starts immediately after the page reloads.
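For a CRM backend that has to issue such tokens itself, the openssl pipeline above could be reproduced in code. The following is a minimal Node.js sketch, not the author's implementation: the helper names (escapeAttr, buildTokenXml, createToken) and the key are hypothetical. It assumes a full 128-bit key of 32 hex digits, because Node's createCipheriv rejects shorter keys for aes-128-ctr, and it reuses the all-zero IV from the openssl command.

```javascript
// Sketch only: key, helper names and attribute values are hypothetical.
const crypto = require("node:crypto");

// Hypothetical 128-bit key (32 hex digits) shared by the CRM and the WCS server.
const KEY_HEX = "000102030405060708090a0b0c0d0e0f";
// All-zero IV, matching the -iv value in the openssl command.
const IV = Buffer.alloc(16, 0);

// Escape characters that may not appear raw inside an XML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Serialize the attributes as a single <root .../> element.
function buildTokenXml(attrs) {
  const body = Object.entries(attrs)
    .map(([name, value]) => `${name}="${escapeAttr(value)}"`)
    .join(" ");
  return `<root ${body}/>`;
}

// Encrypt with AES-128-CTR, hex-encode, and prepend the application name.
function createToken(appName, keyHex, attrs) {
  const cipher = crypto.createCipheriv("aes-128-ctr", Buffer.from(keyHex, "hex"), IV);
  const encrypted = Buffer.concat([
    cipher.update(buildTokenXml(attrs), "utf8"),
    cipher.final(),
  ]);
  return `${appName}:${encrypted.toString("hex")}`;
}

const token = createToken("CRM", KEY_HEX, {
  status: "ok",
  registerRequired: "true",
  login: "user5",
  authenticationName: "user5",
  password: "password",
  domain: "proxy.my",
  port: "5060",
});
console.log(token);
```

One design point worth keeping in mind: CTR mode with a fixed key and a fixed IV reuses the same keystream for every token, so a setup where the IV can vary per token would be preferable if the server side allows it.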
2 and 3) loginByToken and decryption.

On the server side, the encryption keys for AES are specified in the config:

  CRM=8263D535FFFFFFFF7B0F60

Thus, when a token with the “CRM:” prefix comes in, the corresponding key is used to decrypt it.

As a result of the decryption, the WCS server receives the previously encrypted string:

 <root status="ok" description="test" registerRequired="true" login="user5" authenticationName="user5" password="password" outboundProxy="proxy.my" domain="proxy.my" port="5060" visibleName="AAA" api_key="App1" expires="1394839040761100000"/>

and takes from this XML string all the data needed for SIP registration.
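What the server does with an incoming token can be pictured roughly as follows. This is a hedged Node.js sketch of the idea, not WCS code: the function names (decryptToken, parseAttributes), the key table and the key value are hypothetical, and the attribute parsing is a simple regular expression that does not unescape XML entities.

```javascript
// Sketch only: illustrates prefix-based key lookup and decryption, not real WCS internals.
const crypto = require("node:crypto");

// Hypothetical key table, mirroring a "CRM=<key>" line in the server config.
const KEYS = new Map([["CRM", "000102030405060708090a0b0c0d0e0f"]]);
// All-zero IV, as in the openssl example.
const IV = Buffer.alloc(16, 0);

// Split "APP:hex", pick the key for APP, and decrypt the hex payload.
function decryptToken(token, keys) {
  const sep = token.indexOf(":");
  if (sep < 0) throw new Error("token has no application prefix");
  const appName = token.slice(0, sep);
  const keyHex = keys.get(appName);
  if (!keyHex) throw new Error(`no key configured for "${appName}"`);
  const decipher = crypto.createDecipheriv("aes-128-ctr", Buffer.from(keyHex, "hex"), IV);
  return Buffer.concat([
    decipher.update(Buffer.from(token.slice(sep + 1), "hex")),
    decipher.final(),
  ]).toString("utf8");
}

// Collect name="value" pairs from the single <root .../> element.
function parseAttributes(xml) {
  const attrs = {};
  for (const [, name, value] of xml.matchAll(/([\w-]+)="([^"]*)"/g)) {
    attrs[name] = value;
  }
  return attrs;
}

// Demo: encrypt a sample string the way the CRM would, then decode it again.
const sampleXml = '<root status="ok" login="user5" password="password" domain="proxy.my" port="5060"/>';
const cipher = crypto.createCipheriv("aes-128-ctr", Buffer.from(KEYS.get("CRM"), "hex"), IV);
const sampleToken =
  "CRM:" + Buffer.concat([cipher.update(sampleXml, "utf8"), cipher.final()]).toString("hex");

const attrs = parseAttributes(decryptToken(sampleToken, KEYS));
console.log(attrs.login, attrs.domain, attrs.port); // prints "user5 proxy.my 5060"
```

A token with an unknown prefix is rejected before any decryption is attempted, which is the behavior the prefix-to-key mapping implies.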

4) As soon as the server has decrypted the data, it sends a SIP REGISTER request to the SIP server and answers the 401 challenge with ordinary Digest authentication, using the login and password decrypted in the previous step.

 REGISTER sip:sipnet.ru;lr SIP/2.0
 Call-ID: 345ec5157b1a66de3a3a275bdba36197@192.168.1.90
 CSeq: 2 REGISTER
 From: <sip:crm1@sipnet.ru>;tag=73a499a8
 To: <sip:crm1@sipnet.ru>
 Via: SIP/2.0/UDP 192.168.1.90:30000;branch=z9hG4bK2622ce723c34760d6a3f43dd631329e1
 Max-Forwards: 70
 User-Agent: WebRTC
 Allow: UPDATE,MESSAGE,BYE,ACK,REFER,INVITE,NOTIFY,INFO,OPTIONS,CANCEL
 Contact: <sip:crm1@192.168.1.90:30000>;expires=3600
 Expires: 3600
 Authorization: Digest username="crm1",realm="etc.tario.ru",nonce="4A0674BEDF81E0B3F65D",uri="sip:sipnet.ru;lr",response="0762b862c544007f4fb7c43277312a3d",algorithm=MD5,opaque="opaq",qop=auth,cnonce="1234567890",nc=00000001
 Content-Length: 0
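The response value in the Authorization header is the reason the server needs the real SIP password. With algorithm=MD5 and qop=auth, the digest scheme defined in RFC 2617 (which SIP reuses) computes it from the password and the other header fields. A minimal Node.js sketch of that calculation follows; the function name digestResponse and the password are hypothetical, so the printed value is not expected to match the response in the capture above.

```javascript
// Sketch of the RFC 2617 digest calculation for algorithm=MD5, qop=auth.
const crypto = require("node:crypto");

const md5 = (text) => crypto.createHash("md5").update(text, "utf8").digest("hex");

function digestResponse({ username, realm, password, method, uri, nonce, nc, cnonce, qop }) {
  const ha1 = md5(`${username}:${realm}:${password}`); // secret part, needs the password
  const ha2 = md5(`${method}:${uri}`);                 // request part
  return md5(`${ha1}:${nonce}:${nc}:${cnonce}:${qop}:${ha2}`);
}

// Header values taken from the REGISTER request; the password is a placeholder.
console.log(
  digestResponse({
    username: "crm1",
    realm: "etc.tario.ru",
    password: "hypothetical-password",
    method: "REGISTER",
    uri: "sip:sipnet.ru;lr",
    nonce: "4A0674BEDF81E0B3F65D",
    nc: "00000001",
    cnonce: "1234567890",
    qop: "auth",
  })
);
```

Since only the hash travels over the wire and it is computed on the server, the browser never needs to hold the password at all.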

This way, only the CRM and Web Call Server itself know the SIP login and password. They never reach the browser in plain form.
Thus, I managed to embed the phone into the operator's page without forcing the operator to keep track of two different accounts, one for the CRM and one for SIP, which would be very inconvenient. Now loginByToken is called right after the page loads, and the phone goes into the ready state.

Some results of introducing browser calls:

1. Calls are now made and received on the site, where every action is recorded in the system.

2. It became possible to listen to recorded conversations, which helps to resolve conflicts with clients and disagreements between employees. This is important for our distributed office.

3. The number of answered calls grew by about 20%. It turned out that operators had not always been picking up when a client called.
At the moment, we can say that everything works as intended. Problems were resolved without diving deep into SIP internals.

Among the shortcomings, the server cannot be installed on Windows. The installation on Linux and the integration were also tricky, and it seems that only an advanced user or developer will manage them.

WebRTC audio calls work stably and without any additional browser plug-ins such as Flash Player. So I can say that I achieved the planned integration and that the two weeks of work were not spent in vain.

Source: https://habr.com/ru/post/224897/
