
A month ago I began getting acquainted with IP telephony, namely with Lync and Asterisk, and I noticed the following: there are many interesting articles online about the practical side of the topic (what to do and how), but very little attention is paid to theory (references are given at the end of the article). If you want to understand SIP, you can read either RFC 3261 or one of "those thick books." That is certainly useful, but many people would rather start with a condensed summary and only then dive in headfirst. This article is for exactly such people.
In order not to overload the reader, I decided to split the article into two parts. The first part covers how the SIP protocol works when two clients interact directly.
Simple client interaction
Clients in SIP most often interact in the form of a dialog.
A dialog is a peer-to-peer interaction between two User Agents (UA) in the form of a sequence of SIP messages exchanged between them. There are also requests that do not create dialogs. But first things first.
The following is an example of a simple interaction between two SIP-enabled devices:

Peter wants to start exchanging messages with Ivan, so he sends an INVITE message describing the type of session (simple, multimedia, etc.). SIP messages have the following format: a start line, one or more header fields, an empty line marking the end of the header fields, and an optional message body.
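The message layout described above can be sketched in Python. This is only an illustration: the INVITE below is made up for the sketch (host names, tag, branch, and Call-ID are not taken from a real capture), and split_sip_message is a hypothetical helper that ignores details such as folded header lines.

```python
# Illustrative INVITE; every address and identifier here is made up.
RAW_INVITE = (
    "INVITE sip:ivan@example.org SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc.example.com:5060;branch=z9hG4bK74bf9\r\n"
    "Max-Forwards: 70\r\n"
    "From: Peter <sip:peter@example.com>;tag=9fxced76sl\r\n"
    "To: Ivan <sip:ivan@example.org>\r\n"
    "Call-ID: 3848276298220188511@pc.example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Contact: <sip:peter@pc.example.com>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def split_sip_message(raw):
    """Split a SIP message into start line, header list, and body."""
    # The first empty line separates the header fields from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Split on the first colon only; values may contain colons too.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


start_line, headers, body = split_sip_message(RAW_INVITE)
print(start_line)
for name, value in headers:
    print(f"{name} -> {value}")
```

Keeping the headers as a list of pairs rather than a dictionary preserves their order and allows repeated fields such as Via.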

The start line contains the method, the Request-URI, and the SIP version (currently 2.0). The Request-URI is the SIP address of the resource to which the request is sent.
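As a small sketch, the three parts of a request start line can be pulled apart like this. The function name parse_request_line and the example address are assumptions made for illustration.

```python
def parse_request_line(line):
    """Split a SIP request start line into method, Request-URI, and version."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed start line: {line!r}")
    method, request_uri, version = parts
    if version != "SIP/2.0":
        raise ValueError(f"unsupported SIP version: {version}")
    return method, request_uri, version


method, request_uri, version = parse_request_line(
    "INVITE sip:ivan@example.org SIP/2.0"
)
print(method, request_uri, version)
```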

Header fields have the following format:
<Header>: <Value> <Line feed>
The first header field is Via. Each SIP device that creates or forwards a message adds its own address in a Via field (I plan to show how this happens in the next part of the article). Typically, the address is a host name that can be resolved with a DNS query. The Via field contains the SIP version (SIP/2.0), a “/” sign, the transport protocol (UDP, TCP, TLS, SCTP), a space, the host address, a colon, the port number, and the branch parameter, which is the transaction identifier. Responses to this request will carry the same branch value.

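To make the structure of a Via value concrete, here is a minimal parsing sketch. The host name and branch value are invented, parse_via is a hypothetical helper, and IPv6 addresses in square brackets are deliberately not handled.

```python
def parse_via(value):
    """Parse a Via header value into its main components."""
    # "SIP/2.0/UDP" is separated from the address by a space.
    sent_protocol, _, rest = value.partition(" ")
    name, version, transport = sent_protocol.split("/")
    # Parameters such as branch follow the address, separated by ";".
    sent_by, *param_parts = rest.strip().split(";")
    host, _, port = sent_by.partition(":")
    params = {}
    for part in param_parts:
        key, _, val = part.partition("=")
        params[key.strip()] = val.strip()
    return {
        "version": f"{name}/{version}",
        "transport": transport,
        "host": host,
        "port": int(port) if port else None,
        "params": params,
    }


via = parse_via("SIP/2.0/UDP pc.example.com:5060;branch=z9hG4bK74bf9")
print(via["transport"], via["host"], via["port"], via["params"]["branch"])
```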
In most cases the branch value begins with “z9hG4bK”. This prefix means that the request was generated by a client that supports RFC 3261 and that the parameter is unique for each transaction of this client.
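A client could generate such a branch value roughly as follows. This is a sketch, not a prescribed algorithm: the random suffix length and the helper names new_branch and is_rfc3261_branch are assumptions.

```python
import secrets

# Prefix that marks a branch generated according to RFC 3261.
MAGIC_COOKIE = "z9hG4bK"


def new_branch():
    """Return a branch value: the fixed prefix plus a random suffix."""
    return MAGIC_COOKIE + secrets.token_hex(8)


def is_rfc3261_branch(branch):
    """Check whether a branch value carries the RFC 3261 prefix."""
    return branch.startswith(MAGIC_COOKIE)


branch = new_branch()
print(branch, is_rfc3261_branch(branch))
```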
The next field, Max-Forwards, contains an integer (RFC 3261 recommends an initial value of 70). Each SIP server that forwards the message decrements this number by one. This field provides a simple loop detection mechanism.
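The counter logic can be sketched as below. The function name is hypothetical, and the starting value of 70 is the one recommended by RFC 3261. A request whose counter has reached zero is not forwarded any further (a real server would answer 483 Too Many Hops).

```python
def forward_max_forwards(value):
    """Return the decremented Max-Forwards value, or None when the
    request must not be forwarded any further."""
    if value <= 0:
        return None  # a real server would reply 483 Too Many Hops
    return value - 1


# Simulate a request caught in a forwarding loop.
value = 70
hops = 0
while True:
    next_value = forward_max_forwards(value)
    if next_value is None:
        break
    value = next_value
    hops += 1
print(f"request dropped after {hops} hops")
```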
Next come the From and To fields, which describe the sender and the recipient of the request. Note that SIP requests are routed based on the Request-URI in the start line (see above), not on the To field: the Request-URI can be rewritten as the request is forwarded, while the From and To fields generally stay the same. If a display name is used (for example, Ivan Ivanov), the SIP URI is enclosed in angle brackets. The tag parameter in the From field is generated by the sending side. The receiving side, in turn, adds its own tag to the To field.
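A simplified sketch of reading a From or To value: it extracts the display name, the URI, and the tag. The addresses and tag are invented, parse_name_addr is a hypothetical helper, and quoting corner cases of the real grammar are ignored.

```python
def parse_name_addr(value):
    """Extract display name, URI, and tag from a From/To header value."""
    if "<" in value:
        # Form with angle brackets: Display Name <uri>;params
        display, _, rest = value.partition("<")
        uri, _, param_text = rest.partition(">")
    else:
        # Bare form: uri;params
        display = ""
        uri, _, param_text = value.partition(";")
    params = {}
    for part in param_text.split(";"):
        if part.strip():
            key, _, val = part.partition("=")
            params[key.strip()] = val.strip()
    return {
        "display_name": display.strip().strip('"') or None,
        "uri": uri.strip(),
        "tag": params.get("tag"),
    }


sender = parse_name_addr("Peter <sip:peter@example.com>;tag=9fxced76sl")
recipient = parse_name_addr("Ivan Ivanov <sip:ivan@example.org>")
print(sender)
print(recipient)
```

In the initial INVITE the To field has no tag yet, which is why the second result reports None for it.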
The Call-ID field is the call identifier. Together, the tags from the From and To fields and the Call-ID uniquely identify the dialog. This is necessary because several dialogs can exist between the same clients at once.
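One way to picture this is a lookup table keyed by the three values. The sketch below is an assumption about how a client might store its dialogs, not a description of any particular implementation; all identifiers are made up.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    """The three values that together identify a dialog."""
    call_id: str
    from_tag: str
    to_tag: str


# Two simultaneous dialogs between the same pair of clients.
first = DialogId("a84b4c76e66710@pc.example.com", "9fxced76sl", "8321234356")
second = DialogId("f81d4fae7dec11@pc.example.com", "1928301774", "a6c85cf")

active_dialogs = {first: "audio call", second: "video call"}

# An incoming message is matched to its dialog by the same three values.
incoming = DialogId("f81d4fae7dec11@pc.example.com", "1928301774", "a6c85cf")
print(active_dialogs[incoming])
```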
The next field, CSeq, contains the sequence number of the request and the method name, in this case INVITE. The number increases with each new request.
The Via, Max-Forwards, To, From, Call-ID, and CSeq fields make up the minimum required set of header fields for a SIP request.
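A quick check for this minimum set might look like the sketch below. The helper name is hypothetical. Header names are compared case-insensitively, and the compact one-letter forms of header names are not considered.

```python
REQUIRED_HEADERS = ("Via", "Max-Forwards", "To", "From", "Call-ID", "CSeq")


def missing_required_headers(header_names):
    """Return the required header fields absent from a request."""
    present = {name.lower() for name in header_names}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]


print(missing_required_headers(["Via", "From", "To", "Call-ID", "CSeq"]))
```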

An INVITE message also requires a Contact header field containing a SIP URI associated with the sender’s device. This field ensures that, of all the devices Peter may be using at the same time, replies are addressed to this particular one.
Pay attention to the values of the From and Contact fields. The first time, I did not notice the difference:

The message contains an optional Subject field, that is, the subject of the message. Some SIP clients may display its value on the screen. The field is not used for routing or for identifying the dialog, and its value can be arbitrary.
The Content-Type and Content-Length fields describe the message body. In this case, the body uses the Session Description Protocol (SDP). The body length is counted in bytes, including line-break characters:
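The length calculation can be sketched as follows. The two-line body is a deliberately tiny, made-up fragment, and content_length is a hypothetical helper. Each line break is CRLF and therefore adds two bytes.

```python
def content_length(body):
    """Return the body size in bytes, line-break characters included."""
    return len(body.encode("utf-8"))


body = "v=0\r\n" "s=-\r\n"
# Each line is 3 visible characters plus CR and LF.
print(content_length(body))
```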

A detailed description of SDP deserves a separate article, so only a brief breakdown is given here:
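As a rough illustration, an SDP body is a series of lines of the form type=value, where the type is a single letter. The body below is invented for this sketch (addresses, port, and session name are assumptions), and parse_sdp is a hypothetical helper that only splits the lines without interpreting them.

```python
# Meaning of the most common SDP field letters.
SDP_FIELD_NAMES = {
    "v": "protocol version",
    "o": "origin",
    "s": "session name",
    "c": "connection information",
    "t": "timing",
    "m": "media description",
    "a": "attribute",
}

# Illustrative SDP body; all values are made up.
SDP_BODY = (
    "v=0\r\n"
    "o=peter 2890844526 2890844526 IN IP4 pc.example.com\r\n"
    "s=Phone Call\r\n"
    "c=IN IP4 192.0.2.101\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)


def parse_sdp(body):
    """Split an SDP body into (type, value) pairs, one per line."""
    fields = []
    for line in body.splitlines():
        if not line:
            continue
        kind, _, value = line.partition("=")
        fields.append((kind, value))
    return fields


fields = parse_sdp(SDP_BODY)
for kind, value in fields:
    print(f"{SDP_FIELD_NAMES.get(kind, 'other')}: {value}")
```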

In response to the INVITE, Ivan’s SIP client sends two messages: 180 Ringing and 200 OK. The first indicates that Ivan’s SIP client is ringing; the second confirms that the dialog has been established. Let's look at each of them.
This is how 180 Ringing will look:

Text that has not changed since the INVITE message is shown in a faint color.
Notice the To and From header fields. Even though this message comes from Ivan, the field values remain the same as in the original request (from Peter to Ivan). This is because these fields indicate the direction of the request, not of the individual message.
The Via line was also copied from the original request, with a received parameter appended at the end. This parameter contains the IP address from which the request arrived. This is usually the address obtained by resolving the host name in the Via field.
As mentioned above, a tag identifying the dialog was added to the To field. All subsequent messages in the dialog will carry the same tag values.
Finally, the Contact field contains the current address of Ivan.
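The way a response reuses the request's header fields can be sketched like this. It is a simplified illustration under several assumptions: the request headers are passed as a dictionary, all addresses and tags are invented, build_response is a hypothetical helper, and the response carries no body.

```python
def build_response(request, status, reason, source_ip, contact_uri, to_tag):
    """Build a response that mirrors the header fields of a request."""
    lines = [
        f"SIP/2.0 {status} {reason}",
        # Via is copied, with the address the request actually came from.
        f"Via: {request['Via']};received={source_ip}",
        # From and To keep the direction of the original request.
        f"From: {request['From']}",
        f"To: {request['To']};tag={to_tag}",
        f"Call-ID: {request['Call-ID']}",
        f"CSeq: {request['CSeq']}",
        f"Contact: <{contact_uri}>",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


invite = {
    "Via": "SIP/2.0/UDP pc.example.com:5060;branch=z9hG4bK74bf9",
    "From": "Peter <sip:peter@example.com>;tag=9fxced76sl",
    "To": "Ivan <sip:ivan@example.org>",
    "Call-ID": "3848276298220188511@pc.example.com",
    "CSeq": "1 INVITE",
}
ringing = build_response(
    invite, 180, "Ringing", "192.0.2.101", "sip:ivan@host.example.org", "8321234356"
)
print(ringing)
```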
This is how the 200 OK message sent by Ivan’s SIP client looks:

I think the meaning of all the fields related to the SIP protocol is now clear.
In response to the 200 OK, Peter’s client sends an acknowledgment (ACK):

This message confirms that Peter’s client has successfully received the response from Ivan’s client. Both clients have agreed on the parameters of the media session, which will be carried over the RTP protocol.
Notice that the CSeq sequence number is still 1, but the method is now ACK. The branch parameter in the Via field contains a new transaction identifier, since an ACK sent in response to a 200 OK is considered a new transaction.
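The CSeq handling can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes the caller numbers its requests consecutively: the ACK reuses the number of the INVITE it acknowledges, while any other new request takes the next number.

```python
def ack_cseq(invite_cseq):
    """Return the CSeq value for the ACK that follows an INVITE."""
    number, method = invite_cseq.split()
    if method != "INVITE":
        raise ValueError("ACK is sent only for INVITE transactions")
    # Same sequence number, different method name.
    return f"{number} ACK"


def next_cseq(previous_cseq, method):
    """Return the CSeq value for the next new request in the dialog."""
    number, _ = previous_cseq.split()
    return f"{int(number) + 1} {method}"


print(ack_cseq("1 INVITE"))
print(next_cseq("1 INVITE", "BYE"))
```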
Now let's look at how the media session ends. Peter's client sends a BYE request to end the session:

Having received a request to end the session, Ivan’s client sends a confirmation:

The session is over.
We have looked at a simple case of SIP operation. Note that at different points in time, the clients of Ivan and Peter acted either as a server or as a client, so every SIP client must implement both a server part (User Agent Server, UAS) and a client part (User Agent Client, UAC).
In the next article I plan to cover the interaction of SIP clients through a proxy server and client registration on a proxy server.
What to read on the topic
1. RFC 3261.
tools.ietf.org/html/rfc3261
2. Everything you wanted to know about the SIP protocol (three parts). Andrew Pogrebennik.
samag.ru/archive/article/1831
3. SIP: Understanding the Session Initiation Protocol. Alan B. Johnston.
www.amazon.com/SIP-Understanding-Initiation-Protocol-Telecommunications/dp/1607839954/ref=sr_1_1?ie=UTF8&qid=1375104428&sr=8-1&keywords=sip#
4. SIP protocol. Goldstein B.S., Zarubin A.A., Samorezov V.V.
www.vef-kvant.ru/sip.htm