Two weeks ago we held the second INTERCOM conference on voice and video communications: WebRTC, calls from the browser, machine learning, big data, and all the other popular topics. One of the invited speakers was Tsahi Levent-Levi, better known as the author of bloggeek.me, the go-to source of information about WebRTC in modern browsers. In his talk (a video is available, by the way) Tsahi covered the state of the industry and what can be done with voice and video in browsers today. Back in Israel, he wrote an interesting article about serverless technology in communication platforms. Below is a translation, adapted for Habr.
When I was doing research for my first report on WebRTC API platforms, one of the companies I studied was Voximplant. They highlighted something called “VoxEngine”. As their site puts it, this is “the system that executes your JavaScript code in the Voximplant cloud”. This is serverless.
I liked the idea, but did not give it much thought at the time. It was just a new, interesting thing.
What is Serverless Computing?
If you have not been following how APIs evolve, you may have missed the arrival of “serverless”. It is an approach in which the code you write is executed in the cloud. Directly. No need to spin up an OS, a virtual machine, or a container. You write the code and it runs. Magic.
If you look up the various “something-as-a-Service” models, you will find a picture like this:
- If you use your own servers, you are responsible for everything.
- With IaaS, everything up to the operating system is provided by “someone else”: Amazon, Google, Microsoft, you get the idea.
- Then comes PaaS. Everything up to the runtime is done for you. You only care about the data and the application, and you connect to the runtime through an API (at least in most cases).
- SaaS is when we are offered a ready-made application or service. As programmers, we have nothing to worry about.
How does Serverless fit into this picture?
With serverless you still develop the application, but you no longer run or maintain it and its data yourself. What do you gain from this?
- Scalability - you no longer need to take care of it. Someone else does it for you. You describe in code what needs to be done, and it is the platform’s job to run that code on as many servers as necessary.
- Maintenance - there is less code to write and therefore less code to maintain. In essence, you throw away everything that separates “staging” code from “production” code. Your prototype can already run in production.
- Security - if we assume that the PaaS vendor knows how to secure applications, you have one less headache.
- Time to market - less code to write also means you can put your solution in front of users sooner.
- Latency - since your code runs in the same cloud that serves the API, the delay between your commands and the platform’s response is minimal. This matters to some and not to others, but it is one more fact to consider.
What do we get as a result? Economies of scale. The vendor offering the PaaS solution already handles scaling, maintenance and security for you (and for many other clients). In theory, the vendor can do it even better than you can. This frees up your resources to build the best possible UX for your users, improve the application and bring it to market faster. An extra bonus: the code runs as close as possible to the platform API it uses (serverless is usually offered as an add-on to a platform that provides some service with an API, for example a voice, video and messaging API. - translator’s note).
Serverless = Functions
Although “serverless” is the popular name, you will also come across another one: FaaS, “Functions as a Service”. It is reflected in product names such as Google Cloud Functions, PubNub Functions and Twilio Functions, and there are certainly others.
The best-known example is probably AWS Lambda, and there is also an open source solution, Apache OpenWhisk.
Many service providers with an API have begun offering serverless capabilities. You no longer need your own server that communicates with their service; it is enough to execute your code in their “XXX Functions” product.
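To make the “Functions” idea more concrete, here is a minimal TypeScript sketch of what the developer writes in such a model: a single function that receives an event and returns a result, with no server code around it. The `FunctionEvent` and `FunctionResult` shapes and the `handler` name are assumptions made for this example; every real platform defines its own event format and entry point.

```typescript
// Hypothetical event and result shapes; real platforms define their own.
interface FunctionEvent {
  caller: string;
  text: string;
}

interface FunctionResult {
  status: number;
  body: string;
}

// The only code the developer writes: one function per event.
// Starting it, scaling it and routing events to it is the platform's job.
function handler(event: FunctionEvent): FunctionResult {
  const text = event.text.trim();
  if (text.length === 0) {
    return { status: 400, body: "empty message" };
  }
  return { status: 200, body: "Hello, " + event.caller + "! You said: " + text };
}
```

Note what is absent: no HTTP listener, no process management, no deployment scripts. That is the part the vendor takes over.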
In some cases “Functions” services are available for free, but most often vendors charge for them on a “pay per usage” basis.
Serverless CPaaS
Let's go back to CPaaS (communications platform as a service. - translator’s note) and see what these platforms offer in terms of serverless.
As far as I know, only two CPaaS vendors on the market offer serverless solutions:
- Voximplant with VoxEngine
- Twilio with Twilio Functions
Jeff Lawson mentioned at the latest Signal conference in London that Functions is Twilio’s fastest-growing product to date. The market clearly wants this functionality.
CPaaS offerings are now quite complex, which makes it all the more important to understand how serverless is used in them. Let's break CPaaS down into several API levels and a range of products:
API levels:
- Scripting languages such as TwiML and NCCO
- REST API
- Client-side SDK (for making and receiving calls from browsers, mobile applications, refrigerators, etc. - translator’s note)
Products:
- SMS and voice (using phone numbers)
- IP messaging, chat and omnichannel messaging
- VoIP (voice and video using WebRTC)
To some extent, the proprietary scripting-language level can be seen as a very crude form of serverless. You describe the desired behavior as a script that responds to events, and you hand it to the platform through webhooks.
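As an illustration of the scripting level, here is a minimal TypeScript sketch of a function that answers an incoming-call event with a TwiML script. `<Response>`, `<Say>`, `<Dial>` and `<Reject>` are TwiML verbs; the `IncomingCall` shape, the `blocked` list and the `onIncomingCall` name are assumptions made for the example and not part of any vendor's API.

```typescript
// Escape characters that are special in XML text content.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical payload of an "incoming call" webhook.
interface IncomingCall {
  from: string;
  to: string;
}

// Returns the script the platform should execute for this call:
// reject callers on the blocked list, otherwise greet and connect.
function onIncomingCall(call: IncomingCall, blocked: string[]): string {
  if (blocked.indexOf(call.from) !== -1) {
    return ["<Response>", "  <Reject/>", "</Response>"].join("\n");
  }
  return [
    "<Response>",
    "  <Say>Connecting your call.</Say>",
    "  <Dial>" + escapeXml(call.to) + "</Dial>",
    "</Response>",
  ].join("\n");
}
```

In the classic webhook setup this function would live on your own server. In the serverless variant the same logic runs inside the vendor's cloud, next to the component that executes the script.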
REST APIs are a good fit for serverless: instead of calling them from your own server and dealing with authorization, security and scaling of those server-to-server requests, you run your code on the same platform that serves the API.
Then there are client SDKs. They run on end devices, so it is hard to see how the serverless concept applies to them. SDKs exist to talk to the CPaaS backend, so we will not consider them here.
Since CPaaS products can be grouped by the API levels they expose, we can draw the following conclusion:
A few comments:
For IP messaging, serverless makes sense when traffic volumes are large and latency requirements are tight.
Latency usually matters less for SMS and voice (this holds only in the simplest case, “subscriber A calls subscriber B”. In anything more complex, latency makes the difference between “what a useful thing” and “why does it lag all the time”. - translator’s note).
VoIP has its own set of solutions that partly do the same job as serverless. Usually these are ready-made widgets and iframes to embed in web pages (but that is a topic for a separate article).
From the vendors’ point of view, serverless is becoming an increasingly important technology. Why?
Because it is one of Twilio’s offerings, and a fast-growing one. If I were a competitor, I would not want to fall behind.
Can I use a FaaS service from an IaaS vendor?
I really wanted to put these two acronyms in one sentence :)
All major IaaS vendors (Azure, AWS, Google Cloud) now offer serverless in one form or another. Why serverless in CPaaS? Can't we just connect IaaS serverless to CPaaS?
We can. But then you are dealing with two different vendors. Using something like AWS Lambda makes sense if you already use other AWS services.
If you are solving a communications problem, it is wiser to use the serverless offering of your CPaaS vendor. You get lower latency and better security than with an external serverless solution.
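The latency argument can be put into rough numbers. The TypeScript sketch below adds up network round trips for one interaction that needs several API calls, once with the logic on an external server and once with the logic running inside the platform. All figures are illustrative assumptions, not measurements of any vendor, and the `Hop` type and `totalDelayMs` function are names invented for this example.

```typescript
// One network round trip on the path of a single API call.
interface Hop {
  name: string;
  roundTripMs: number;
}

// Total delay for an interaction that needs several sequential API calls.
function totalDelayMs(hops: Hop[], apiCallsPerInteraction: number): number {
  let perCallMs = 0;
  for (const hop of hops) {
    perCallMs += hop.roundTripMs;
  }
  return perCallMs * apiCallsPerInteraction;
}

// Assumed figures: logic on an external server, reached over the public internet.
const externalServer: Hop[] = [
  { name: "platform -> your server (webhook)", roundTripMs: 80 },
  { name: "your server -> platform (REST call)", roundTripMs: 80 },
];

// Assumed figure: logic running inside the platform itself.
const insidePlatform: Hop[] = [
  { name: "in-platform function call", roundTripMs: 2 },
];

const externalTotalMs = totalDelayMs(externalServer, 3);
const insideTotalMs = totalDelayMs(insidePlatform, 3);
```

With these assumed figures, an interaction that needs three API calls spends 480 ms on the network through an external server and 6 ms inside the platform. The absolute numbers will differ in practice; the point is that the delay grows with every call that has to leave the platform.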
CPaaS becomes serverless
If you are a CPaaS vendor wondering what comes next, add serverless to the list of things you urgently need to build and offer to customers.
If you are a developer using CPaaS, look at how serverless solutions can help you build applications faster.
From the translator
About once a week someone asks me why we at Voximplant put so much effort into having the JavaScript that controls communications run in our cloud. “I don’t need my own server” is nice, of course. But, hand on heart, spinning up Node, Python or even PHP in a public cloud takes an experienced full-stack developer half an hour. Was it worth it?
Latency. Tsahi mentions it a lot, but does not treat it as the main advantage of the serverless approach. In my experience, it is precisely the absence of delay between API calls and platform responses, for example in speech synthesis and recognition, that plays the key role. At the conference, many companies talked about and demonstrated automated systems that talk to customers and help solve their problems without call center agents.
The JS code runs on the same server that handles the voice and video streams, which removes pauses and delays during a call and makes talking to an automated system feel natural.
Real-time recognition is especially sensitive to delay. Not long ago we wrote on Habr about how to set up streaming recognition in a few lines of JS code. Try it and see how fast it works!