
This article covers the ways to implement voice communication in web projects.
It is an overview aimed at a broad audience, but it also includes the links needed to dig deeper into each topic.
The following tasks are considered:
- One-on-one voice communication between site users.
- Voice conferences, that is, conversations among more than two participants.
- Calls to landlines and mobile phones from the browser.
To be clear, all of these tasks are solved within the web environment. The main condition is that users need no additional software beyond the browser and Flash Player.
Introduction
First of all, it is worth understanding which technologies are at our disposal. Realistically, the only option is Flash. Other technologies exist but, unfortunately, all of them are far less widespread, whereas Flash is installed on almost every user's machine.
The strength of Flash is its convenient handling of streaming audio and video. There are two main ways to work with audio streams in Flash applications:
- Using a media server (media streaming server). In this case all voice traffic passes through the server, which can be Flash Media Server or Red5 (open source).
Advantages: traffic gets through reliably (firewalls and NAT are not an obstacle).
Disadvantages: load on the server, higher latency, TCP only.
- The new P2P protocol RTMFP, implemented in Flash Player 10.
Advantages: built on UDP, good call quality, no load on the server.
Disadvantages: poor firewall and NAT traversal (about 60% of users), requires Flash Player 10.
The optimal solution today is to detect at runtime whether the P2P protocol can be used, use it when it can, and otherwise fall back to the media server.
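The fallback logic can be sketched roughly as follows. This is a language-neutral illustration in Python, not Flash code; `Transport`, `choose_transport` and the `p2p_probe` callable are hypothetical names introduced here, and the probe is assumed to return True when a direct connection could be established.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transport:
    name: str      # "rtmfp" (P2P) or "rtmp" (relay through the media server)
    relayed: bool  # True when voice traffic passes through the server


def choose_transport(p2p_probe: Callable[[], bool]) -> Transport:
    """Prefer P2P; fall back to the media server if the probe fails.

    p2p_probe is a hypothetical callable that attempts a direct
    connection and returns True on success.
    """
    try:
        if p2p_probe():
            return Transport(name="rtmfp", relayed=False)
    except OSError:
        # A probe that raises (for example, because UDP is blocked) is
        # treated the same way as a probe that returns False.
        pass
    return Transport(name="rtmp", relayed=True)
```

The point of the design is that the rest of the application only sees a `Transport` and does not need to know which path was chosen.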
There is also hope that in the near future Flash servers will be able to use UDP to communicate with client applications. In that case many of the shortcomings of the first solution will disappear. Recall that TCP guarantees data delivery, while UDP does not. Real-time voice traffic does not need perfectly accurate data; it needs bounded delivery time and tolerance of intermittent failures of the transmission channel. That is why UDP is preferred in this case.
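A toy model makes the difference visible. The function below is a deliberately simplified sketch, not a model of real TCP or UDP: the frame interval, one-way delay and retransmission delay are all assumed values, and each lost frame is assumed to be lost exactly once. With reliable in-order delivery, every frame behind a lost one waits for the retransmission; with unreliable delivery, the lost frame is simply skipped.

```python
def arrival_times(n_frames, lost, *, reliable,
                  interval_ms=20, delay_ms=50, retransmit_ms=200):
    """Return the time (ms) each frame reaches the player, or None if dropped.

    All timing parameters are illustrative assumptions.
    """
    delivered = []
    ready_after = 0  # earliest time the next in-order frame may be released
    for i in range(n_frames):
        sent = i * interval_ms
        if i in lost:
            if not reliable:
                delivered.append(None)  # unreliable transport: frame is gone
                continue
            arrive = sent + retransmit_ms + delay_ms  # wait for the resend
        else:
            arrive = sent + delay_ms
        if reliable:
            # In-order delivery: a frame cannot overtake an earlier one.
            arrive = max(arrive, ready_after)
            ready_after = arrive
        delivered.append(arrive)
    return delivered
```

Under these assumptions a single lost frame delays everything queued behind it in the reliable case, while the unreliable case loses one 20 ms frame and keeps its timing.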
Let us move on to specifics.
One-on-one voice communication
From the developer's point of view, the two ways of transmitting audio (with and without relaying through a server) differ little. Both require an external server. With P2P, however, the server plays only a supporting role in establishing the connection, and all voice traffic goes directly from client to client. The server that sets up the P2P connection is called Stratus. Its functionality will soon be built into Flash Media Server (and, apparently, Red5). For now the only option is to use the public beta service from Adobe.
An excellent article on using the new P2P protocol is here.
An example implementation is here.
With a relay server, the task is a standard one for the Flash environment. In both this case and the P2P case, the main idea is that each participant publishes an outgoing audio stream and subscribes to the incoming one. Data is transmitted over RTMP (or RTMFP in the case of P2P).
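The publish/subscribe idea can be illustrated with a small in-memory model. This is only a sketch in Python: `StreamHub` and `connect_call` are hypothetical names, and the hub stands in for whatever actually carries the streams (the relay server or the direct P2P link).

```python
from collections import defaultdict


class StreamHub:
    """Toy stand-in for the channel that carries named audio streams."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, stream_name, callback):
        # callback is invoked with every frame published on stream_name.
        self._subscribers[stream_name].append(callback)

    def publish(self, stream_name, frame):
        for callback in self._subscribers[stream_name]:
            callback(frame)


def connect_call(hub, me, peer, on_audio):
    """Subscribe to the peer's stream; return a function that publishes ours."""
    hub.subscribe(peer, on_audio)
    return lambda frame: hub.publish(me, frame)
```

Each side calls `connect_call` once with its own name and the name of the other party, which gives the symmetric arrangement described above: one outgoing stream and one incoming stream per participant.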
One of the key problems in implementing one-on-one voice communication is notifying users about incoming calls. The user who initiates the call knows exactly when to start sending and receiving voice traffic. The user being called, however, needs some way to be notified. How to solve this depends on the specific application.
- Option 1. Use asynchronous requests performed periodically, for example once per second. The response should indicate whether there is an incoming call, so that the user can decide whether to answer it. After that, the incoming and outgoing audio streams are set up.
- Option 2. A Comet architecture, in which the client keeps a persistent connection to the server and receives a response only when a certain event occurs; in this case, an incoming call.
Both options rely on ordinary server-side web development tools (for example, PHP), although in principle a media server can also be adapted for this task. On the client side, either JavaScript or Flash can be used.
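The difference between the two options comes down to whether the server answers immediately or holds the request. The sketch below models this in Python with an in-memory queue instead of real HTTP; `CallInbox` and its methods are hypothetical names, and a real implementation would sit behind a web server.

```python
import queue


class CallInbox:
    """In-memory stand-in for the server-side list of incoming calls of one user."""

    def __init__(self):
        self._calls = queue.Queue()

    def ring(self, caller):
        """Called when someone starts a call to this user."""
        self._calls.put(caller)

    def poll(self):
        """Option 1: answer at once; None means there is no incoming call."""
        try:
            return self._calls.get_nowait()
        except queue.Empty:
            return None

    def wait(self, timeout):
        """Option 2 (Comet-style): hold the request until a call or a timeout."""
        try:
            return self._calls.get(timeout=timeout)
        except queue.Empty:
            return None
```

With `poll` the client repeats the request on a timer and most responses are empty; with `wait` the client reissues the request only after a call arrives or the timeout expires, at the cost of one open connection per waiting user.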
For one-on-one voice communication, it is straightforward to implement the scheme described above as optimal: use P2P when possible, and a media server otherwise.
Conference organization
Almost nothing changes for conferences, except that every participant now subscribes to the audio streams of all the other participants at once.
Again, this can be implemented either through the server or with P2P. Here, however, P2P is more likely to fail for the simple reason that the exchange involves more than two participants: the more there are, the greater the chance that it will not work for at least one of them.
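A rough calculation shows how quickly this grows. The sketch below assumes a full mesh in which every pair of participants needs its own direct link, and it assumes that links succeed independently with the same probability; both are simplifications, and `pair_success` is a hypothetical parameter, not a figure from this article.

```python
def mesh_stats(participants: int, pair_success: float):
    """Return (incoming streams per client, direct links, P(all links work)).

    Assumes a full mesh and independent, equally likely link success.
    """
    if participants < 2:
        raise ValueError("a conversation needs at least two participants")
    incoming_per_client = participants - 1
    links = participants * (participants - 1) // 2
    all_links_ok = pair_success ** links
    return incoming_per_client, links, all_links_ok
```

Under these assumptions two participants need one link, while four participants already need six, so the chance that every link works drops from `pair_success` to `pair_success ** 6`.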
Calls to landlines and mobile phones
This is perhaps the most interesting topic. The task is solved with the SIP gateway of any IP telephony operator. The scheme works as follows:
- Two-way audio transfer is set up between the client Flash application and the media server over RTMP.
- The media server transcodes the voice traffic, that is, converts the audio from one codec to another. Flash supports two voice codecs: Nellymoser and Speex (since version 10). The media server must also be able to work with the SIP protocol stack.
- Thus a bridge is built on the media server side: Flash Player <-> SIP.
Red5phone is an open source project that implements the described scheme. The project is still rather raw, but it is a good starting point.
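The data flow of such a bridge can be sketched as follows. This is not Red5phone code and not a real codec or SIP implementation: `MediaBridge` is a hypothetical class, and the transcoders and senders are passed in as plain callables so that the sketch stays self-contained.

```python
from typing import Callable

Transcoder = Callable[[bytes], bytes]
Sender = Callable[[bytes], None]


class MediaBridge:
    """Relays audio frames between a Flash leg and a SIP leg, transcoding each way."""

    def __init__(self, flash_to_sip: Transcoder, sip_to_flash: Transcoder,
                 send_to_sip: Sender, send_to_flash: Sender):
        self._flash_to_sip = flash_to_sip    # e.g. Speex -> gateway codec
        self._sip_to_flash = sip_to_flash    # e.g. gateway codec -> Speex
        self._send_to_sip = send_to_sip      # hands a frame to the SIP leg
        self._send_to_flash = send_to_flash  # hands a frame to the RTMP leg

    def on_flash_frame(self, frame: bytes) -> None:
        """A frame arrived from the browser: transcode and forward to SIP."""
        self._send_to_sip(self._flash_to_sip(frame))

    def on_sip_frame(self, frame: bytes) -> None:
        """A frame arrived from the gateway: transcode and forward to Flash."""
        self._send_to_flash(self._sip_to_flash(frame))
```

Keeping the transcoders as injected callables means the relay logic can be exercised without real codecs, which would be plugged in separately.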
Working examples
One-on-one voice communication over the P2P protocol is implemented in Online, an application for the VKontakte social network that I wrote.
A technical platform for making calls from the application to landline and mobile phones through a SIP gateway was also implemented. However, this feature has not been launched publicly.
An example implementation of the Flash <-> SIP bridge: the flaphone project.
Applications
The technologies described have a variety of uses, for example as a built-in feature of a CMS engine or for running a telephone support service on a company website or online store.