
SIP client interaction. Part 2



In the previous article, we looked at the simple interaction of SIP clients without using a proxy server. Such interaction in practice is extremely rare, but it is excellent for understanding the basics of SIP.

Now that we have covered the basics, I propose moving on to how the protocol works in practice.

In this article I plan to cover three topics:
  1. Choosing a transport protocol and finding the proxy;
  2. Working through a proxy;
  3. Registering with the proxy server.

Choosing a transport protocol and finding the proxy


Since SIP supports several transport protocols (UDP, TCP, SCTP, TLS), the client has to determine which one to use. There are several ways to do this.
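The first method is to specify the transport explicitly in the SIP URI through the transport parameter, for example ;transport=tcp (this does not apply to TLS, which is selected by the sips scheme). It looks like this: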



If the transport is not explicitly specified, then the following algorithm works:
  1. If the SIP URI contains an IP address, UDP is used for a SIP URI and TCP for a SIPS (Secure SIP) URI.
  2. If the URI contains a host name rather than an IP address, but a port is specified, UDP is likewise used for SIP and TCP for SIPS.
  3. If the URI contains neither an IP address nor a port, and DNS has a corresponding NAPTR record, the transport is taken from that record: “SIP+D2U” means UDP, “SIP+D2T” means TCP, and “SIP+D2S” means SCTP. The NAPTR record points to the SRV record that is used to locate the Proxy server. If there is no NAPTR record, the client queries for the SRV record directly.
  4. The result of the SRV query is the name and port of the proxy server.
  5. If there is no SRV record, an A or AAAA query is made. In this case, UDP is used for SIP and TCP for SIPS.
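
The selection rules above can be sketched in Python. This is a simplified illustration, not a full resolver: it performs no DNS queries, it does not handle IPv6 literals, and when there is no NAPTR record it simply falls back to the default for the scheme, as in step 5. The names choose_transport and naptr_services (the service strings that a NAPTR lookup would have returned) are assumptions made for this example.

```python
import ipaddress

# NAPTR service strings from the list above and the transports they stand for.
NAPTR_SERVICES = {"SIP+D2U": "UDP", "SIP+D2T": "TCP", "SIP+D2S": "SCTP"}


def is_ip_address(host):
    """Return True if host is an IP address literal rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def parse_sip_uri(uri):
    """Split a SIP/SIPS URI into (scheme, host, port, transport parameter)."""
    scheme, rest = uri.split(":", 1)
    address, *raw_params = rest.split(";")
    hostport = address.rsplit("@", 1)[-1]        # drop the user part
    host, _, port = hostport.partition(":")      # assumption: no IPv6 literal
    params = dict(p.partition("=")[::2] for p in raw_params)
    return scheme.lower(), host, port, params.get("transport")


def choose_transport(uri, naptr_services=None):
    """Pick a transport for uri following the rules described in the text."""
    scheme, host, port, transport = parse_sip_uri(uri)
    default = "TCP" if scheme == "sips" else "UDP"
    if transport:                      # explicit ;transport= parameter wins
        return transport.upper()
    if is_ip_address(host) or port:    # rules 1 and 2
        return default
    for service in naptr_services or []:   # rule 3
        if service in NAPTR_SERVICES:
            return NAPTR_SERVICES[service]
    return default                     # rule 5 fallback


print(choose_transport("sip:ivan@domain.ru", naptr_services=["SIP+D2T"]))
```

In a real client the NAPTR and SRV answers would come from a DNS library; they are passed in as plain strings here to keep the sketch self-contained.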

To make this clearer, consider an example in which we want to contact the client sip:ivan@domain.ru:



So, we have found the parameters of Ivan's Proxy server. Now let us look at how the Proxy is used within a SIP dialog.

A remark for those who do not know what NAPTR is: I only learned that this type of DNS record exists while writing this article, so do not despair. A little more about NAPTR here.

Proxy Interaction


Why do we need a SIP Proxy? As I said, in the example from the first part of the article the clients knew each other's IP addresses and could communicate directly. In real life, clients often receive addresses dynamically, so there is no point in “remembering” a particular IP. The first thing that comes to mind is to use DNS A records to find the currently valid address. However, there is a problem here: an IP address identifies a particular device, not the user on it. A peculiarity of SIP is that messages are exchanged not device to device but user to user. Moreover, one user can use several SIP clients at once: on a mobile phone, on a work computer, on a home computer, and on a SIP phone. What can be done?

SIP proposes the following solution: a SIP Proxy is set up and each user registers their devices with it (more precisely, users register with a registration server, and the Proxy has access to the registration database, but for simplicity we assume that both roles run on the same server). I will show how this is done below. For now, just remember that the Proxy knows how to find a particular user's clients.

When Peter calls Ivan, the following sequence of actions is performed:
  1. Peter's SIP client determines the address and transport protocol of Ivan's SIP Proxy (see above for how this is done).
  2. The client sends an INVITE request to the Proxy.
  3. The Proxy looks up which devices Ivan has registered and sends the request to all of them.
  4. Ivan answers the call on one of the devices, which sends 200 OK to the Proxy.
  5. The Proxy forwards the 200 OK to Peter.
  6. Peter's client takes Ivan's SIP address on that specific device from the Contact field of the received 200 OK and sends the ACK directly, bypassing the Proxy.
  7. All subsequent communication also goes directly.

On the diagram, it looks like this:



For those who have studied the first part of the article, everything looks quite familiar; only an intermediate proxy server has been added. Accordingly, the message exchange has changed slightly.

Before we go into detail, a small remark. SIP distinguishes two types of URIs. The first is the user URI, also known as the address of record (AOR). A request sent to this address involves a lookup in the proxy database and is forwarded to one or more devices. The second is the device URI (or rather, the URI of the user on a device). The device URI is usually called a contact and is carried in the Contact field of a SIP message. The AOR is carried in the From and To fields.



Start of conversation

So, Peter sends INVITE for Ivan to the proxy server:


The proxy server forwards the request to all of Ivan's SIP clients. For simplicity, suppose Ivan uses only one device. So that the SIP client can tell that the request passed through a proxy, the server adds its own Via header field:
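
What the proxy does at this step can be sketched in Python. This is a minimal illustration that treats a message as a list of (name, value) header pairs; the helper names add_via and new_branch, Peter's host name, and the branch values are made up for the example. The only protocol detail relied on is that a Via value has the form "SIP/2.0/transport host;branch=..." and that RFC 3261 branch identifiers start with "z9hG4bK".

```python
import uuid


def new_branch():
    """Generate a branch ID; RFC 3261 branches start with "z9hG4bK"."""
    return "z9hG4bK" + uuid.uuid4().hex


def add_via(headers, sent_by, transport="UDP", branch=None):
    """Return a copy of headers with the proxy's Via placed above the others."""
    via = ("Via", f"SIP/2.0/{transport} {sent_by};branch={branch or new_branch()}")
    # Insert before the first existing Via (or at the top if there is none),
    # so the most recent hop is always listed first.
    first_via = next(
        (i for i, (name, _) in enumerate(headers) if name.lower() == "via"), 0
    )
    return headers[:first_via] + [via] + headers[first_via:]


# Hypothetical INVITE headers as sent by Peter's client.
invite = [
    ("Via", "SIP/2.0/UDP peter-pc.example;branch=z9hG4bKpeter1"),
    ("To", "<sip:ivan@domain.ru>"),
]
for name, value in add_via(invite, "proxy.domain.ru"):
    print(f"{name}: {value}")
```

Because each hop puts its Via on top, responses can later retrace the same path in reverse order.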



Ivan's SIP client replies with 180 Ringing (Ivan hears the phone ring). It adds a tag to the To field and puts its contact in the Contact field. In addition, the received parameter has been added to the first Via field. It shows the address from which Ivan's client received the request (that is, the address of the proxy server as Ivan sees it). This is useful to know when troubleshooting:



The proxy forwards the response to Peter’s client, removing its own Via in the process:
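
As a sketch of this step, the function below removes the topmost Via from a response before it is passed on. The (name, value) list representation, the function name strip_top_via, and the sample header values are assumptions made for the example; a real proxy would also use the next Via to decide where to send the response.

```python
def strip_top_via(headers, own_sent_by):
    """Return a copy of headers without the proxy's own (topmost) Via."""
    for i, (name, value) in enumerate(headers):
        if name.lower() == "via":
            # A Via value looks like "SIP/2.0/UDP host[:port];branch=...".
            sent_by = value.split(None, 1)[1].split(";", 1)[0].strip()
            if sent_by != own_sent_by:
                raise ValueError("top Via does not belong to this proxy")
            return headers[:i] + headers[i + 1:]
    raise ValueError("response has no Via header")


# Hypothetical 180 Ringing headers as received from Ivan's client.
ringing = [
    ("Via", "SIP/2.0/UDP proxy.domain.ru;branch=z9hG4bKproxy1;received=192.0.2.1"),
    ("Via", "SIP/2.0/UDP peter-pc.example;branch=z9hG4bKpeter1"),
    ("To", "<sip:ivan@domain.ru>;tag=ivan-tag"),
]
for name, value in strip_top_via(ringing, "proxy.domain.ru"):
    print(f"{name}: {value}")
```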



After 180 Ringing has been sent, as soon as Ivan answers the call, Ivan’s client sends a 200 OK response to the Proxy:



The Proxy forwards this response to Peter, again removing its own Via:



Now the fun part. Peter's client sends an ACK message directly to Ivan's client, bypassing the proxy. Moreover, even if Ivan were using several SIP clients at once, the ACK would arrive at exactly the right one. What makes this possible?

The 200 OK was sent from the client on which Ivan picked up the call. Its Contact field contains a URI that corresponds to the user Ivan on that specific device. Peter's client therefore sends the ACK to this device, after which the Proxy is no longer needed:
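
A small sketch of how a client could pick the target of the ACK out of the 200 OK. The header list, the function names, and the address 192.0.2.20 are made up for the illustration; only the idea that the ACK is addressed to the URI from the Contact field comes from the text.

```python
import re


def contact_uri(headers):
    """Return the URI from the first Contact header, with or without <...>."""
    for name, value in headers:
        if name.lower() == "contact":
            match = re.search(r"<([^>]+)>", value)
            if match:
                return match.group(1)
            # Without angle brackets, anything after ";" is a header parameter.
            return value.split(";", 1)[0].strip()
    raise ValueError("response has no Contact header")


def ack_request_line(ok_headers):
    """Build the first line of the ACK that goes straight to the device."""
    return f"ACK {contact_uri(ok_headers)} SIP/2.0"


# Hypothetical headers of the 200 OK that Peter's client received.
ok_response = [
    ("To", "<sip:ivan@domain.ru>;tag=ivan-tag"),      # AOR: the user
    ("Contact", "<sip:ivan@192.0.2.20:5060>"),        # contact: the device
]
print(ack_request_line(ok_response))
```

The To field still carries the AOR, while the request line of the ACK carries the device contact, which is why the message no longer needs the proxy to find Ivan.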



All other messages, including media traffic, bypass the proxy.

End of conversation

At the end of the conversation, Ivan’s client sends BYE directly to Peter’s client:



Peter responds by confirming:


Here everything is the same as in the first part of the article.

So, we have looked at how SIP clients interact through a Proxy server. Only one question remains: how did the Proxy learn the addresses of Ivan’s clients? Through the registration procedure, which I describe below.

SIP registration


Registration works as follows:



Let's take a closer look at each of the messages. Ivan sends a REGISTER request to the server (for simplicity, we assume that the registration server role runs on proxy.domain.ru). The most important thing in this request is the Contact field, which holds Ivan’s address on a specific device:



In response, the server sends 401 Unauthorized, that is, a request to authenticate. The most important field in the response is WWW-Authenticate. It is not hard to guess that realm is the domain and algorithm indicates which hash algorithm will be used. The interesting part is the nonce field:



Nonce is short for “number used once”. It is a one-time random sequence that Ivan’s client combines with the password string; the client then computes an MD5 hash of the resulting string and puts the result in the Authorization field of a new request (in fact, it is somewhat more complicated, but for simplicity we assume that this is how it works). The response parameter is used for this.

Why is the nonce needed? If the client computed MD5 from the password alone, without a nonce, the hash would be the same every time. An attacker could intercept such a hash and use it for authorization. That would be as insecure as sending the password in clear text.

With a nonce, MD5 is computed from a new string each time and comes out different. Therefore, even after intercepting the hash, an attacker is unlikely to be able to use it for authorization.
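
The text notes that the real computation is somewhat more complicated. As a sketch, this is the basic digest calculation from RFC 2617 without the qop extension: the user name, realm, and password are hashed first, the method and URI are hashed separately, and the nonce is then mixed in between the two. The function names and all sample values (user name, password, nonces) are made up for the illustration.

```python
import hashlib
import hmac


def md5_hex(text):
    """MD5 of a string as lowercase hex, as used in digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the value of the response parameter (RFC 2617, no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def server_accepts(stored_password, username, realm, method, uri, nonce, received):
    """Server side: recompute the hash and compare it with the client's."""
    expected = digest_response(username, realm, stored_password, method, uri, nonce)
    return hmac.compare_digest(expected, received)


# The same password gives a different response for every nonce.
first = digest_response("ivan", "domain.ru", "secret", "REGISTER", "sip:domain.ru", "nonce-1")
second = digest_response("ivan", "domain.ru", "secret", "REGISTER", "sip:domain.ru", "nonce-2")
print(first != second)

# A response captured for an old nonce fails against a fresh one.
print(server_accepts("secret", "ivan", "domain.ru", "REGISTER", "sip:domain.ru", "nonce-2", first))
```

The second check is the replay scenario from the text: the intercepted hash was computed for an earlier nonce, so it does not match what the server expects for the current one.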

By the way, note that the CSeq of the new registration request is one higher:



The server also combines the nonce with Ivan's password and computes an MD5 hash. It then compares its hash with the one received from Ivan. If they match, the server sends 200 OK. Notice that the expires parameter has been added to the Contact field. In this case, the registration will be kept in the server database for 3600 seconds, or one hour:



If Ivan wants to renew the registration, he must send another REGISTER within this hour.

What if Ivan uses several SIP-enabled devices at once? It is very simple: a registration request has to be sent from each of them.

After the corresponding entry appears in the database of the registration server, the proxy server will be able to redirect requests to Ivan's SIP clients.
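
The registration database can be pictured as a mapping from an AOR to the contacts registered for it, each with its own expiry time. The sketch below is an in-memory illustration only; the class name Registrar, its methods, and the device addresses are assumptions made for the example, and time is passed in explicitly so that the behavior is easy to follow.

```python
import time


class Registrar:
    """Toy registration database: AOR -> {contact URI: expiry time}."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, contact, expires=3600, now=None):
        """Add or refresh a binding; a repeated REGISTER extends the expiry."""
        now = time.time() if now is None else now
        self._bindings.setdefault(aor, {})[contact] = now + expires

    def lookup(self, aor, now=None):
        """Return the contacts that have not expired yet, in sorted order."""
        now = time.time() if now is None else now
        contacts = self._bindings.get(aor, {})
        return sorted(c for c, deadline in contacts.items() if deadline > now)


registrar = Registrar()
aor = "sip:ivan@domain.ru"
# Ivan registers two devices (hypothetical addresses) at time 0.
registrar.register(aor, "sip:ivan@192.0.2.20", expires=3600, now=0)
registrar.register(aor, "sip:ivan@192.0.2.21", expires=3600, now=0)
# Only the first device renews its registration half an hour later.
registrar.register(aor, "sip:ivan@192.0.2.20", expires=3600, now=1800)

print(registrar.lookup(aor, now=1000))   # both devices are still registered
print(registrar.lookup(aor, now=4000))   # only the renewed one remains
```

When an INVITE for the AOR arrives, the proxy would send it to every contact that lookup returns.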

Bonus for those who are interested


You may have noticed that, in response to a registration request, the server sends a response containing To-tag:



It is clear that when a dialog is being established, this tag helps to avoid processing the same message twice. There is a rule for this: if a message does not contain a To-tag and the UAS has already received a message with the same CSeq, From-tag, and Call-ID, the message is discarded. But why is a To-tag needed if we are not establishing a dialog with the registration server? The best answer I could find is that RFC 3261 says a 200 OK response to a request without a To-tag must contain one. That is, it is not needed for anything, but that is the convention.

I hope that after reading this article the operation of the SIP protocol has become clearer to you. I will be glad to hear your comments.
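
The rule as stated above can be sketched as a small filter. This mirrors only the simplified description in the text, not the full procedure of RFC 3261; the class name DuplicateFilter and the sample identifiers are made up for the illustration.

```python
class DuplicateFilter:
    """Remember (Call-ID, From-tag, CSeq) of requests that carry no To-tag."""

    def __init__(self):
        self._seen = set()

    def should_discard(self, call_id, from_tag, cseq, to_tag=None):
        """Return True if a request without a To-tag was already seen."""
        if to_tag:
            # The request belongs to an existing dialog; the rule does not apply.
            return False
        key = (call_id, from_tag, cseq)
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


seen = DuplicateFilter()
print(seen.should_discard("call-1", "peter-tag", 1))   # first arrival
print(seen.should_discard("call-1", "peter-tag", 1))   # same request again
```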

Source: https://habr.com/ru/post/189332/

