Principles of operation of the PIM protocol

PIM (Protocol Independent Multicast) is a family of protocols for forwarding multicast traffic between routers in a network. Neighbor relationships are established much as in dynamic unicast routing protocols: every 30 seconds, PIMv2 sends Hello messages to the reserved multicast address 224.0.0.13 (All-PIM-Routers). Each Hello carries a Hold Time, usually 3.5 × the Hello interval, that is, 105 seconds by default.

PIM has two main modes of operation: Dense mode and Sparse mode. Let's start with Dense mode.
Source-Based Distribution Trees.
Dense mode makes sense when the network has many receivers spread across different multicast groups. When a router receives multicast traffic, it first applies the RPF (Reverse Path Forwarding) check, which validates the source of the multicast packet against the unicast routing table: the packet must arrive on the interface that the unicast routing table points to as the way back to the source host. This mechanism prevents loops during multicast forwarding.

R3 learns the multicast source (Source IP) from the packet and checks the two streams, from R1 and from R2, against its unicast table. The stream arriving on the interface the table points to (R1 to R3) is forwarded on, while the stream from R2 is dropped, because reaching the multicast source requires sending packets out of S0/1.
What happens if there are two equal-cost routes with the same metric? In that case the router chooses by the next hop of those routes: the one with the higher IP address wins. If you need to change this behavior, you can use ECMP. Read more here.
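The RPF check and the tie-break described above can be sketched in a few lines of Python. This is only an illustration, not router code: the Route class, the function names, and the next-hop addresses are assumptions made for the example, and the unicast table is reduced to a plain list.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One unicast routing entry (hypothetical, simplified)."""
    prefix: str      # destination network, e.g. "10.1.1.0/24"
    interface: str   # interface the route points out of
    next_hop: str    # next-hop IP address
    metric: int      # route metric, lower is better


def rpf_interface(routes, source_ip):
    """Return the interface the unicast table uses to reach source_ip."""
    source = ipaddress.ip_address(source_ip)
    candidates = [r for r in routes if source in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    # Longest prefix first, then lowest metric, then highest next-hop address.
    best = max(
        candidates,
        key=lambda r: (
            ipaddress.ip_network(r.prefix).prefixlen,
            -r.metric,
            int(ipaddress.ip_address(r.next_hop)),
        ),
    )
    return best.interface


def rpf_check(routes, source_ip, arrival_interface):
    """A packet passes only if it arrived on the interface leading to its source."""
    return rpf_interface(routes, source_ip) == arrival_interface


# Assumed view from R3: the source network is reached through S0/1 (toward R1).
r3_routes = [
    Route("10.1.1.0/24", "S0/1", "10.1.13.1", 10),
    Route("10.1.1.0/24", "S0/0", "10.1.23.2", 20),
]
print(rpf_check(r3_routes, "10.1.1.10", "S0/1"))  # stream from R1
print(rpf_check(r3_routes, "10.1.1.10", "S0/0"))  # stream from R2
```

With equal metrics on both routes, the sketch falls through to the last key and picks the route whose next hop has the higher address.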
After the RPF check passes, the router sends the multicast packet to all of its PIM neighbors except the one it received the packet from. The other PIM routers repeat the process. The path the multicast packet takes from the source to the final receivers forms a tree, which is called the source-based distribution tree, the shortest-path tree (SPT), or the source tree. Three different names; pick any.
But what about routers that have no use for a multicast stream and nobody to forward it to, while the upstream router keeps sending it to them? The Prune mechanism was invented for this.
Prune Message.
For example, R2 will keep sending multicast to R3 even though R3 drops it because of the RPF check. Why load the link? R3 sends a PIM Prune message, and R2, on receiving it, removes interface S0/1 from the outgoing interface list for this stream, that is, the list of interfaces out of which this traffic should be sent.

The following is a more formal definition of a PIM Prune message:
This is a case of the SPT.

After receiving the Prune message, R2 sets the Prune timer to 3 minutes. After three minutes it starts sending traffic again, until it receives another Prune message. This is how PIMv1 behaves.
PIMv2 adds a State Refresh timer (60 seconds by default). As soon as R3 sends a Prune message, this timer starts on R3. When it expires, R3 sends a State Refresh message, which resets the 3-minute Prune timer on R2 for this group.
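The interplay of the two timers can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions: time advances in whole seconds, a State Refresh arrives the moment it is sent, and the function name is made up for the example.

```python
PRUNE_TIMER = 180            # seconds (the 3-minute Prune timer from the text)
STATE_REFRESH_INTERVAL = 60  # seconds (the State Refresh default from the text)


def first_prune_expiry(duration, state_refresh):
    """Return the second at which the pruned state on R2 first expires.

    A Prune is assumed to arrive at t = 0. Returns None if the interface
    stays pruned for the whole simulated duration.
    """
    expires_at = PRUNE_TIMER
    for t in range(1, duration + 1):
        if t >= expires_at:
            return t
        if state_refresh and t % STATE_REFRESH_INTERVAL == 0:
            # A State Refresh restarts the Prune timer on the upstream router.
            expires_at = t + PRUNE_TIMER
    return None


print(first_prune_expiry(600, state_refresh=False))
print(first_prune_expiry(600, state_refresh=True))
```

Without State Refresh the pruned state lapses after three minutes and flooding resumes; with it, the timer is restarted every minute and never reaches zero.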
Reasons for sending a Prune message:

When the multicast packet fails the RPF check.
When there are no directly connected clients that have requested the multicast group (IGMP Join) and no PIM neighbors to which the multicast traffic could be forwarded (non-pruned interfaces).

Graft Message.
Imagine that R3 did not want traffic from R2, sent a Prune, and was receiving the multicast from R1. Then the link between R1 and R3 suddenly fails and R3 is left without multicast. It could wait 3 minutes for the Prune timer on R2 to expire, but 3 minutes is a long wait. To avoid it, R3 sends a message that immediately takes R2's interface S0/1 out of the pruned state. This is the Graft message. After receiving the Graft message, R2 replies with a Graft-Ack.
Prune Override.

Let's look at this scheme. R1 forwards multicast onto a segment with two routers. R3 receives the traffic and forwards it on; R2 receives it but has nobody to forward it to, so it sends a Prune message to R1 on this segment. R1 should remove Fa0/0 from the list and stop forwarding onto this segment, but what happens to R3? R3 is on the same segment, so it also receives this Prune message and realizes how serious the situation is. Before R1 stops forwarding, it starts a 3-second timer and stops only when the timer expires. Those 3 seconds are exactly how long R3 has to avoid losing its multicast. So R3 sends a PIM Join message for this group as quickly as possible, and R1 no longer intends to stop forwarding. Join messages are covered below.
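The 3-second window can be expressed as a one-line rule. The sketch below is an assumption-laden illustration: the function name and the idea of passing message arrival times as plain numbers are invented for the example.

```python
PRUNE_OVERRIDE_DELAY = 3  # seconds R1 waits after a Prune before it stops forwarding


def still_forwarding(prune_at, join_times, delay=PRUNE_OVERRIDE_DELAY):
    """Return True if some Join reached R1 inside the override window.

    prune_at   -- time (seconds) at which R1 received the Prune from R2
    join_times -- times at which R1 received Joins for the same group
    """
    return any(prune_at <= t < prune_at + delay for t in join_times)


print(still_forwarding(10.0, [11.5]))  # R3 reacted in time
print(still_forwarding(10.0, []))      # nobody objected to the Prune
```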
Assert message.

Imagine this situation: two routers forward onto the same network at once. They receive the same stream from the source, and both forward it onto the same network behind interface e0. They therefore need to decide which of them will be the single forwarder for this network. Assert messages are used for this. When R2 and R3 detect duplicate multicast traffic, that is, each receives on this interface the same multicast that it is itself forwarding there, the routers understand that something is wrong. They then send Assert messages, which carry the Administrative Distance (AD) and the metric of the route by which the multicast source, 10.1.1.10, is reached. The winner is determined as follows:

The router with the lower AD wins.
If the ADs are equal, the router with the lower metric wins.
If those are equal too, the router with the higher IP address on the network onto which they forward this multicast wins.
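The three comparison steps map naturally onto a sort key. The following is a minimal sketch; the AssertInfo class, the function name, and the sample AD, metric, and address values are assumptions for illustration only.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AssertInfo:
    """What one router advertises in its Assert message (simplified)."""
    router: str
    admin_distance: int  # AD of the route to the multicast source
    metric: int          # metric of that route
    ip: str              # router's address on the shared network


def assert_winner(candidates):
    """Lower AD wins, then lower metric, then higher IP address."""
    best = min(
        candidates,
        key=lambda c: (c.admin_distance, c.metric, -int(ipaddress.ip_address(c.ip))),
    )
    return best.router


# Hypothetical values: both routers learned the source via the same protocol.
r2 = AssertInfo("R2", admin_distance=110, metric=20, ip="10.2.2.2")
r3 = AssertInfo("R3", admin_distance=110, metric=20, ip="10.2.2.3")
print(assert_winner([r2, r3]))
```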

The router that wins this election becomes the forwarder for the segment. The Designated Router (DR) is chosen separately, using PIM Hello messages. The PIM Hello message shown at the beginning of the article contains a DR field; the winner is the router with the highest IP address on the link.
Useful label:

MROUTE Table.
After this first look at the PIM protocol, we need to understand how to work with the multicast routing table. The mroute table stores information about which streams clients have requested and which streams are flowing from multicast servers.
For example, when IGMP Membership Report or PIM Join is received on an interface, an entry of the type (*, G) is added to the routing table:

This entry shows that a request was received for traffic to the group address 238.38.38.38. In the DC flags, D means that the multicast operates in Dense mode and C means that the receiver is directly connected to the router, that is, the router received an IGMP Membership Report rather than a PIM Join.
If there is an entry of type (S, G), it means that we have a multicast stream:

The S field, 192.168.1.11, holds the IP address of the multicast source; this is the address checked by the RPF rule. When troubleshooting, the first step is to check the unicast table for the route to the source. The Incoming Interface field shows the interface on which the multicast arrives; in the unicast routing table, the route to the source must point to this same interface. The Outgoing Interface list shows where the multicast will be forwarded. If it is empty, the router has not received any requests for this traffic. More information on all the flags can be found here.
PIM Sparse-mode.
The Sparse-mode strategy is the opposite of Dense mode. When a Sparse-mode router receives multicast traffic, it forwards it only out of the interfaces on which the stream was requested, for example by PIM Join or IGMP Report messages asking for this traffic.
Similar elements for SM and DM:

Neighbor relationships are built in the same way as in PIM-DM.
The RPF rule applies.
The DR election is the same.
The Prune Override mechanism and Assert messages work the same way.

To keep track of who needs which multicast traffic and where on the network, a common information center is required. This center is the Rendezvous Point (RP). Any router that wants some multicast traffic, or that has started receiving multicast traffic from a source, reports this to the RP.
When the RP receives multicast traffic, it forwards it to the routers that requested this traffic earlier.

Imagine a topology where the RP is R3. As soon as R1 receives traffic from S1, it encapsulates the multicast packet in a unicast PIM Register message and sends it to the RP. How does it know which router is the RP? In this case the RP is configured statically; dynamic RP configuration is discussed later.

ip pim rp-address 3.3.3.3

The RP checks whether anyone has asked to receive this traffic. Suppose nobody has. Then the RP sends R1 a PIM Register-Stop message, which means that nobody needs this multicast and the registration is refused. R1 stops sending the multicast to the RP. The source host keeps transmitting, however, so after receiving the Register-Stop, R1 starts a Register-Suppression timer of 60 seconds. Five seconds before this timer expires, R1 sends the RP an empty Register message with the Null-Register bit set (that is, without an encapsulated multicast packet). The RP then acts as follows:

If there are still no receivers, it replies with a Register-Stop message.
If receivers have appeared, it does not reply at all. R1, having received no refusal within 5 seconds, happily sends the RP a Register message with the multicast encapsulated.
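The register suppression cycle is easier to follow as a timeline. The sketch below makes several simplifying assumptions that are not in the text: the first Register-Stop arrives at t = 0, the RP's replies arrive instantly, and the function and event labels are invented for the example.

```python
REGISTER_SUPPRESSION = 60  # seconds, Register-Suppression timer from the text
NULL_REGISTER_LEAD = 5     # seconds before expiry when the Null-Register is sent


def register_events(receivers_appear_at, horizon):
    """List (time, event) pairs for R1 after a Register-Stop received at t = 0.

    receivers_appear_at -- time at which the RP first has receivers, or None
    horizon             -- stop simulating Null-Registers after this time
    """
    events = []
    stop_at = 0  # time the most recent Register-Stop was received
    while True:
        null_at = stop_at + REGISTER_SUPPRESSION - NULL_REGISTER_LEAD
        if null_at > horizon:
            break
        events.append((null_at, "R1 -> RP: Null-Register"))
        if receivers_appear_at is not None and receivers_appear_at <= null_at:
            # The RP stays silent, so the suppression timer runs out on R1.
            resume_at = stop_at + REGISTER_SUPPRESSION
            events.append((resume_at, "R1 -> RP: Register with encapsulated multicast"))
            break
        # Still no receivers: the RP refuses again and the timer restarts.
        events.append((null_at, "RP -> R1: Register-Stop"))
        stop_at = null_at
    return events


for when, what in register_events(receivers_appear_at=100, horizon=300):
    print(when, what)
```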

We seem to have sorted out how the multicast reaches the RP; now let's answer the question of how the RP delivers traffic to the receivers. This requires a new concept: the root-path tree (RPT). The RPT is a tree rooted at the RP that grows toward the receivers, branching at each PIM-SM router. The RP builds it by receiving PIM Join messages, each of which adds a new branch to the tree, and every downstream router does the same. The general rule is:

When a PIM-SM router receives a PIM Join message on any interface other than the one that leads toward the RP, it adds a new branch to the tree.
A branch is also added when the PIM-SM router receives an IGMP Membership Report from a directly connected host.

Imagine that a multicast client for the group 228.8.8.8 sits behind router R5. As soon as R5 receives the IGMP Membership Report from the host, it sends a PIM Join toward the RP and adds the interface facing the host to the tree. Next, R4 receives the PIM Join from R5, adds Gi0/1 to the tree, and sends a PIM Join toward the RP. Finally, the RP (R3) receives the PIM Join and adds Gi0/0 to the tree. This is how a multicast receiver gets registered. The resulting tree is rooted at R3: R3-Gi0/0 → R4-Gi0/1 → R5-Gi0/0.
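The hop-by-hop Join propagation can be sketched as a walk from the last-hop router up to the RP. The data layout here is an assumption for illustration: a dictionary that tells each router which neighbor is upstream toward the RP and on which of that neighbor's interfaces the Join arrives.

```python
# Hypothetical map: router -> (upstream router toward the RP,
#                              upstream router's interface facing this router)
TOWARD_RP = {
    "R5": ("R4", "Gi0/1"),
    "R4": ("R3", "Gi0/0"),
}


def join_group(last_hop, host_interface, group, toward_rp, rp):
    """Build the outgoing interface lists created by one IGMP report.

    Returns a dict mapping (router, group) to the set of outgoing
    interfaces for the (*, G) entry on that router.
    """
    oil = {}
    # The last-hop router adds the interface facing the host.
    oil.setdefault((last_hop, group), set()).add(host_interface)
    router = last_hop
    while router != rp:
        upstream, iface = toward_rp[router]
        # The upstream router adds the interface on which the Join arrived.
        oil.setdefault((upstream, group), set()).add(iface)
        router = upstream
    return oil


tree = join_group("R5", "Gi0/0", "228.8.8.8", TOWARD_RP, rp="R3")
for (router, group), interfaces in tree.items():
    print(router, group, sorted(interfaces))
```

A second receiver joining through a different router would reuse the same walk and simply add interfaces to entries that already exist.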
After that, a PIM Join is sent toward R1 and R1 starts sending the multicast traffic. Note that if a host requests the traffic before the multicast source starts transmitting, the RP does not send a PIM Join and sends nothing to R1 at all.
If the host stops wanting the stream while the multicast is flowing, then as soon as the RP receives a PIM Prune on interface Gi0/0, it immediately sends a PIM Register-Stop directly to R1, followed by a PIM Prune out of interface Gi0/1. The PIM Register-Stop is sent by unicast to the address from which the PIM Register arrived.
As we said before, as soon as one router sends a PIM Join to another, for example R5 to R4, an entry is added on R4:

A timer is also started. To keep resetting it, R5 must keep sending PIM Join messages; otherwise R4 removes the interface from the outgoing list. R5 sends a PIM Join every 60 seconds.
Shortest-Path Tree Switchover.
Let's add a link between R1 and R5 and see how traffic flows in this topology.

Suppose traffic was being sent and received along the old path R1-R2-R3-R4-R5, and then we connected and configured the link between R1 and R5.
First, the unicast routing table on R5 reconverges, and the network 192.168.1.0/24 is now reached through R5's interface Gi0/2. R5, still receiving the multicast on Gi0/1, sees that the RPF rule is no longer satisfied and that it would make more sense to receive the multicast on Gi0/2. It should leave the RPT and build a shorter tree, the Shortest-Path Tree (SPT). To do so, it sends a PIM Join to R1 through Gi0/2, and R1 starts sending the multicast through Gi0/2. Now R5 needs to unsubscribe from the RPT so as not to receive two copies. It sends a Prune message that specifies the source IP address and sets a special bit, the RPT-bit. This means: do not send me this traffic, I have a better tree. The RP also sends a PIM Prune toward R1, but it does not send a Register-Stop. One more detail: R5 now keeps sending PIM Prune messages to the RP, because R1 continues to send a PIM Register to the RP every minute. Until a new receiver for this traffic appears, the RP keeps refusing it. In this way R5 tells the RP that it is still receiving the multicast over the SPT.
RP dynamic search.
Auto-RP.
This technology is Cisco proprietary and not especially popular, but it is still alive. Auto-RP works in two main stages:
1) The RP sends RP-Announce messages to the reserved address 224.0.1.39, declaring itself the RP for all groups or for specific groups. This message is sent every minute.
2) An RP mapping agent is required; it sends RP-Discovery messages stating which RP should be used for which groups. Regular PIM routers learn the RP from this message. The mapping agent can be the RP router itself or a separate PIM router. RP-Discovery is sent to 224.0.1.40, also on a one-minute timer.
Let's look at the process in more detail:
Configure R3 as RP:

ip pim send-rp-announce loopback 0 scope 10

R2 as mapping agent:

ip pim send-rp-discovery loopback 0 scope 10

On all the other routers, we listen for the RP through Auto-RP:

ip pim autorp listener

As soon as we configure R3, it will start sending RP-Announce:

R2, once configured as the mapping agent, starts waiting for RP-Announce messages. Only when it finds at least one RP does it start sending RP-Discovery:

Thus, as soon as the regular routers (PIM RP Listener) receive this message, they will know where to look for the RP.
One of the main problems with Auto-RP is that to receive RP-Announce and RP-Discovery messages, a router must send PIM Joins for the groups 224.0.1.39 and 224.0.1.40, and to send those Joins it must already know where the RP is. The classic chicken-and-egg problem. PIM Sparse-Dense mode was invented to solve it: if the router does not know the RP for a group, it works in Dense mode; if it does, it works in Sparse mode. When PIM Sparse mode is configured on the interfaces of regular routers together with the ip pim autorp listener command, the router works in Dense mode only for the multicast of the Auto-RP protocol itself (224.0.1.39 and 224.0.1.40).
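The per-group decision described above can be summarized in a small function. This is a sketch of the rule as stated in the text, not of any router's implementation: the function name, the mode strings, and the representation of known RP mappings as a set of group addresses are all assumptions.

```python
AUTO_RP_GROUPS = {"224.0.1.39", "224.0.1.40"}  # RP-Announce and RP-Discovery


def forwarding_mode(group, groups_with_known_rp, interface_mode, autorp_listener=False):
    """Return "sparse" or "dense" for one multicast group.

    groups_with_known_rp -- groups for which the router already knows an RP
    interface_mode       -- "sparse-dense" or "sparse" (hypothetical labels)
    autorp_listener      -- whether ip pim autorp listener is configured
    """
    if interface_mode == "sparse-dense":
        # Fall back to Dense mode only while the RP is unknown.
        return "sparse" if group in groups_with_known_rp else "dense"
    if interface_mode == "sparse":
        if autorp_listener and group in AUTO_RP_GROUPS:
            # Only the Auto-RP groups themselves are flooded.
            return "dense"
        return "sparse"
    raise ValueError(f"unknown interface mode: {interface_mode}")


print(forwarding_mode("224.0.1.40", set(), "sparse", autorp_listener=True))
print(forwarding_mode("238.38.38.38", set(), "sparse", autorp_listener=True))
```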
BootStrap Router (BSR).
This feature works much like Auto-RP. Each RP sends a message to the mapping agent, which collects the mapping information and then distributes it to the other routers. We describe the process in the same way as for Auto-RP:
1) First we configure R3 as a candidate RP with the command:

ip pim rp-candidate loopback 0

After that, R3 does nothing yet: before it can start sending its announcements, it has to find the mapping agent. So we move on to the second step.
2) Configure R2 as a mapping agent:

ip pim bsr-candidate loopback 0

R2 starts sending out PIM Bootstrap messages, where it indicates itself as a mapping agent:

This message is sent to 224.0.0.13, the address the PIM protocol also uses for its other messages. R2 sends it in all directions, so there is no chicken-and-egg problem as there was with Auto-RP.
3) As soon as the candidate RP receives a message from the BSR, it immediately sends a unicast message to the BSR's address:

The BSR, having collected the information about the RPs, then sends it by multicast to 224.0.0.13, which all PIM routers listen to. This is why BSR has no equivalent of the ip pim autorp listener command for regular routers.
Anycast RP with Multicast Source Discovery Protocol (MSDP).
Auto-RP and BSR let us spread load across RPs only in the following way: each multicast group has exactly one active RP. Load sharing across several RPs for a single multicast group is not possible. Anycast RP with MSDP makes it possible by assigning the same IP address, with mask 255.255.255.255, to all the routers acting as RPs. The RP address itself is learned by one of the usual methods: static configuration, Auto-RP, or BSR.

The picture shows an Auto-RP configuration with MSDP. Both RPs are configured with the IP address 172.16.1.1/32 on interface Loopback 1, and this address is used for all groups. In their RP-Announce messages both routers advertise themselves using this address. After receiving this information, the Auto-RP mapping agent sends out RP-Discovery messages naming 172.16.1.1/32 as the RP. The network 172.16.1.1/32 is advertised to the routers through the IGP. As a result, each PIM router requests or registers streams with whichever RP is the next hop on its route to 172.16.1.1/32. The MSDP protocol itself exists so that the RPs can exchange information about multicast sources with each other.
Consider this topology:

Switch6 sends traffic to 238.38.38.38, and so far only RP-R1 knows about it. Then Switch7 and Switch8 request this group. Routers R5 and R4 send PIM Joins to R1 and R3, respectively. Why? By IGP metric, R5's route to 13.13.13.13 points to R1, and R4's route points to R3.
RP-R1 knows about the stream and starts forwarding it toward R5, but R4 knows nothing about it, because R1 simply does not send it there. This is why MSDP is needed. We configure it on R1 and R3:

On R1:

ip msdp peer 3.3.3.3 connect-source Loopback1

On R3:

ip msdp peer 1.1.1.1 connect-source Loopback3

They bring up a session with each other, and whenever one of them learns of a stream, it reports it to its neighboring RP.
As soon as RP-R1 receives the stream from Switch6, it sends a unicast MSDP Source-Active (SA) message containing (S, G) information, that is, the source and the destination group of the multicast. Now RP-R3 knows about the source Switch6, so when it receives a request for this stream from R4, it sends a PIM Join toward Switch6, guided by its routing table. R1, on receiving this PIM Join, starts sending the traffic toward RP-R3.
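The exchange of Source-Active information can be modeled as two objects sharing what they learn. This is a toy sketch, not an MSDP implementation: the RendezvousPoint class, its method names, and the source address 10.0.6.6 used for Switch6 are assumptions made for the example, and the TCP session is replaced by a direct method call.

```python
class RendezvousPoint:
    """Toy model of an RP that shares active sources with its MSDP peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.sa_cache = set()  # (source, group) pairs known to this RP

    def peer_with(self, other):
        """Stand-in for bringing up an MSDP session in both directions."""
        self.peers.append(other)
        other.peers.append(self)

    def register_source(self, source, group):
        """Called when a PIM Register for (source, group) reaches this RP."""
        self.sa_cache.add((source, group))
        for peer in self.peers:
            peer.receive_sa(source, group)  # stands in for a Source-Active message

    def receive_sa(self, source, group):
        self.sa_cache.add((source, group))

    def sources_for(self, group):
        """Sources this RP could send a PIM Join toward for the group."""
        return sorted(s for s, g in self.sa_cache if g == group)


rp_r1 = RendezvousPoint("R1")
rp_r3 = RendezvousPoint("R3")
rp_r1.peer_with(rp_r3)
rp_r1.register_source("10.0.6.6", "238.38.38.38")  # hypothetical Switch6 address
print(rp_r3.sources_for("238.38.38.38"))
```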
MSDP runs over TCP, and the RPs send each other keepalive messages to check liveness. The timer is 60 seconds.
The purpose of splitting MSDP peers into different domains remains unclear to me, since Keepalive and SA messages carry no indication of domain membership. A configuration with different domains was also tested in this topology, and it made no difference to operation.
If anyone can clarify this, I would be glad to read it in the comments.

I think this is a good place to end the article. Below are the useful materials and links that were used:

CCIE Routing and Switching v5.0 Official Cert Guide, Volume 2, Fifth Edition, Narbik Kocharians, Terry Vinson.
Networks for the smallest. Part nine. Multicast

Source: https://habr.com/ru/post/450582/
