IEEE 1588 Precision Time Protocol (PTP)

Many articles have been written about the well-known Network Time Protocol (NTP), and some of them mention the Precision Time Protocol, which supposedly achieves time synchronization accuracy on the order of nanoseconds (for example, here and here). Let's see what this protocol is and how such accuracy is achieved. We will also look at the results of my own work with it.

Introduction

The Precision Time Protocol is described by the IEEE 1588 standard. There are 2 versions of the standard. The first was released in 2002; the standard was then revised in 2008, which produced PTPv2. Backward compatibility was not maintained.
I work with the second version of the protocol, which has many improvements over the first (accuracy and stability, as the wiki tells us). I will not compare it with NTP: the synchronization accuracy alone, which with hardware support really does reach tens of nanoseconds, speaks for its advantage over NTP.
Hardware support for the protocol can be implemented differently in different devices. In fact, the minimum required to implement PTP is the ability of the hardware to timestamp the moment a message is received on a port. This timestamp is then used to calculate the error.

Why do clocks drift apart?

Errors can come from anywhere. To begin with, the oscillators in different devices are not identical, and it is very unlikely that two devices will tick in perfect step. On top of that, constantly changing environmental conditions affect the generated frequency.

What do we want?

Suppose we have a device that works in ideal conditions, an atomic clock that will not drift at all until the end of the world (the real one, of course, not the one predicted by the Mayan calendar), and we are given the task of obtaining a second clock that shows at least approximately (to within 10^-9 s) the same time. We need to synchronize our clock to it. To do this, we can implement PTP.

The difference between a pure software implementation and an implementation with hardware support

A pure software implementation will not achieve the promised accuracy. The time that elapses between the arrival of a message (more precisely, of the signal that the device has received a message) and entry into the interrupt handler or callback cannot be strictly bounded. "Smart" hardware with PTP support can take these timestamps by itself (for example, Micrel chips; I am writing a driver for the KSZ8463MLI).
Besides timestamping, hardware support can also include the ability to tune the crystal oscillator (to match its frequency to the master's) or to adjust the clock (to increase the clock value by X ns on every tick). More on this below.

Turning to the IEEE 1588 standard

The standard itself runs to 289 pages. Let's consider the minimum needed to implement the protocol. PTP is a master-slave (client-server) synchronization protocol, so implementing it requires at least 2 devices. Here the Master device is the atomic clock, and the Slave device is the clock that has to be made to run accurately.

Exchange language

Announce: an announcement message carrying information that the Master sends to all Slave devices. Using this message, a Slave can choose the best master with the BMC (Best Master Clock) algorithm. BMC itself is not that interesting and is easy to find in the standard; the selection is based on message fields such as accuracy, variance, class, and priority. Let's move on to the other messages.
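The selection by Announce fields can be illustrated with a simplified sketch. It assumes the comparison order priority1, clockClass, clockAccuracy, offsetScaledLogVariance, priority2, clockIdentity, with lower values winning, and it leaves out the topology part of the algorithm from the standard. The `AnnounceInfo` class and the function names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnounceInfo:
    """Subset of Announce fields used for the comparison (names are illustrative)."""
    priority1: int
    clock_class: int
    clock_accuracy: int
    offset_scaled_log_variance: int
    priority2: int
    clock_identity: bytes  # 8-byte identity, used only as the final tie-breaker


def rank(info: AnnounceInfo) -> tuple:
    # Lower values are better at every step, so a plain tuple comparison works.
    return (
        info.priority1,
        info.clock_class,
        info.clock_accuracy,
        info.offset_scaled_log_variance,
        info.priority2,
        info.clock_identity,
    )


def best_master(candidates: list) -> AnnounceInfo:
    """Pick the candidate with the smallest rank (simplified, no topology checks)."""
    return min(candidates, key=rank)


# Two masters with equal priority1: the one with the better (lower) class wins.
gps_clock = AnnounceInfo(128, 6, 0x21, 0x4E5D, 128, bytes.fromhex("0011223344556677"))
free_running = AnnounceInfo(128, 248, 0xFE, 0xFFFF, 128, bytes.fromhex("0011223344550000"))
winner = best_master([free_running, gps_clock])
```

Packing the fields into a tuple keeps the tie-breaking order in one place, which makes it easy to check against the standard.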

Sync / FollowUp, DelayResp, PDelayResp / PDelayRespFollowUp: sent by the master; we look at them in more detail below.

DelayReq, PDelayReq: requests sent by Slave devices.

As you can see, the Slave device is not talkative; the Master provides almost all the information itself. Messages are sent by multicast (unicast mode can be used if desired) to addresses strictly defined in the standard. There is a separate address for PDelay messages (01-80-C2-00-00-0E for Ethernet and 224.0.0.107 for UDP). The remaining messages are sent to 01-1B-19-00-00-00 or 224.0.1.129. Packets are distinguished by the ClockIdentity (clock ID) and SequenceId (packet ID) fields.
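The addressing rule above can be written down as a small lookup. This is only a sketch: the message names follow the spelling used in this article, and `destination` and `same_exchange` are hypothetical helpers, not part of any PTP library.

```python
# Destination addresses quoted in the text above.
PTP_PRIMARY = {"mac": "01-1B-19-00-00-00", "ipv4": "224.0.1.129"}
PTP_PDELAY = {"mac": "01-80-C2-00-00-0E", "ipv4": "224.0.0.107"}

# Peer-delay messages go to their own address.
PDELAY_MESSAGES = {"PDelayReq", "PDelayResp", "PDelayRespFollowUp"}


def destination(message: str, transport: str = "ipv4") -> str:
    """Return the multicast destination for a message name ('mac' or 'ipv4')."""
    group = PTP_PDELAY if message in PDELAY_MESSAGES else PTP_PRIMARY
    return group[transport]


def same_exchange(a: dict, b: dict) -> bool:
    """Match two packets (for example Sync and its FollowUp) by clock and sequence ID."""
    return (a["clock_identity"] == b["clock_identity"]
            and a["sequence_id"] == b["sequence_id"])


sync = {"clock_identity": "00:11:22:ff:fe:33:44:55", "sequence_id": 42}
follow_up = {"clock_identity": "00:11:22:ff:fe:33:44:55", "sequence_id": 42}
```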

Work session

Suppose the master has been selected using the BMC algorithm, or there is only one master on the network. The figure shows the exchange between the master and the device being synchronized.

It all starts with the Master sending a Sync message and simultaneously recording the sending time t1. There are one-step and two-step modes of operation. They are easy to tell apart: if a FollowUp message is present, we are dealing with a two-step implementation. Optional messages are shown with a dotted arrow.
The FollowUp message is sent after Sync and contains the time t1. If transmission is one-step, Sync carries t1 in the message body. Either way, our device receives t1. When the Sync message arrives at the Slave, the timestamp t2 is generated. So we have t1 and t2.
The Slave generates a DelayReq message and at the same time generates t3.
The Master receives the DelayReq message and at the same time generates t4.
t4 is sent to the Slave device in a DelayResp message.

^{Network messages}
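From the four timestamps the Slave can estimate both the delivery time and its own offset. The sketch below uses the usual End-to-End calculation and assumes a symmetric path (the same delay in both directions); the function names are chosen for this example only.

```python
def mean_path_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """One-way delivery time, assuming the delay is the same in both directions."""
    return ((t2 - t1) + (t4 - t3)) / 2


def offset_from_master(t1: float, t2: float, t3: float, t4: float) -> float:
    """How far the Slave clock is ahead of the Master (negative if it is behind)."""
    return (t2 - t1) - mean_path_delay(t1, t2, t3, t4)


# Example in nanoseconds: true delay 500 ns, Slave clock 100 ns ahead of the Master.
t1 = 1000   # Master sends Sync (Master time)
t2 = 1600   # Slave receives Sync (Slave time = 1000 + 500 + 100)
t3 = 2000   # Slave sends DelayReq (Slave time, i.e. 1900 in Master time)
t4 = 2400   # Master receives DelayReq (Master time = 1900 + 500)

delay = mean_path_delay(t1, t2, t3, t4)
offset = offset_from_master(t1, t2, t3, t4)
```

If the path is not symmetric, half of the asymmetry shows up as an error in the offset, which is one reason the delivery time needs filtering.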

The exchange shown above is sufficient only if the crystal oscillators of the synchronized devices generate perfectly identical frequencies. In practice the clock frequencies differ: over 1 second the clock value on one device increases by 1 second, and on another by, say, 1.000001 second. This is where the clocks diverge.
The standard gives an example of calculating the ratio of the time elapsed on the Master to the time elapsed on the Slave over a certain interval. This ratio serves as the coefficient for the Slave's frequency. The standard also notes that the adjustment can be carried out in various ways. Consider two of them:

Change the clock frequency of the Slave device (the example in the standard)
Leave the clock frequency unchanged, but on each clock cycle of duration T increase the clock value by T + ∆t instead of T (used in my implementation)

In both methods you need to calculate the difference between time values on the Master over a certain interval, as well as the difference between time values on the Slave over the same interval. The coefficient in the first method:

The second method requires calculating ∆t, the value that is added to the clock value at every defined interval. In the figure you can see that while 22 - 15 = 7 seconds passed on the Master, 75 + (87 - 75) / 2 - (30 + (37 - 30) / 2) = 47.5 seconds passed on the Slave.

Frequency is the processor frequency; for example, at 25 MHz one processor cycle lasts 1 / (25 × 10⁶) s = 40 ns.
Depending on the capabilities of the device, the most appropriate method is chosen.
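One possible way to compute both quantities is sketched below. It assumes the coefficient is the Master interval divided by the Slave interval, and that ∆t is the correction applied on every clock cycle of duration T; the function names are hypothetical and the formulas are an illustration rather than the exact ones from my driver.

```python
def rate_ratio(master_elapsed: float, slave_elapsed: float) -> float:
    """Ratio of the time elapsed on the Master to the time elapsed on the Slave."""
    return master_elapsed / slave_elapsed


def per_tick_correction(ratio: float, tick: float) -> float:
    """Delta t for the second method: each tick adds tick + delta instead of tick."""
    return tick * (ratio - 1.0)


# Example from the text, in nanoseconds: while 1 s passes on the Master,
# the Slave clock advances by 1.000001 s. The tick is 40 ns (25 MHz).
master_ns = 1_000_000_000
slave_ns = 1_000_001_000
tick_ns = 40.0

ratio = rate_ratio(master_ns, slave_ns)          # slightly below 1: Slave runs fast
delta_ns = per_tick_correction(ratio, tick_ns)   # small negative value per tick

# Check: the same number of ticks now adds up to exactly the Master interval.
ticks = slave_ns / tick_ns
corrected_ns = ticks * (tick_ns + delta_ns)
```

The per-tick value is tiny (tens of microns of a nanosecond here), which is why hardware that can add sub-nanosecond amounts on every cycle is useful.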
To move on to the next section, let's express the offset a bit differently:

PTP modes of operation

Looking into the standard, you will find more than one way to calculate the delivery time. PTPv2 has 2 modes of operation: E2E (End-to-End), which was discussed above, and P2P (Peer-to-Peer). Let's see where each one applies and how they differ.
In principle you can use either mode as you wish, but they cannot be combined on the same network.

In E2E mode, the delivery time is calculated from messages that pass through multiple devices, each of which adds to the correction field of the Sync message (or of FollowUp, in two-step mode) the time the packet was held on that device. If the devices are connected directly, no correction is written, so we will not consider this in detail. Messages used: Sync / FollowUp, DelayReq / DelayResp.
In P2P mode, the correction field accumulates not only the time the packet was held on the device: (t2 - t1) is added to it as well (see the standard for details). Messages used: Sync / FollowUp, PDelayReq / PDelayResp / PDelayRespFollowUp.
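How the correction field accumulates along the path can be sketched as follows. The sketch assumes the field is carried as nanoseconds multiplied by 2^16, and that the Slave subtracts the accumulated correction when computing the offset; the helper names are made up for this example.

```python
SCALE = 1 << 16  # correction field unit: nanoseconds scaled by 2**16


def add_residence_time(correction_field: int, ingress_ns: int, egress_ns: int) -> int:
    """What an intermediate device does: add the time the packet spent inside it."""
    return correction_field + (egress_ns - ingress_ns) * SCALE


def correction_ns(correction_field: int) -> float:
    """Convert the scaled field back to nanoseconds."""
    return correction_field / SCALE


def corrected_offset_ns(t1: float, t2: float, path_delay_ns: float,
                        correction_field: int) -> float:
    """Slave offset with the accumulated residence time removed."""
    return (t2 - t1) - path_delay_ns - correction_ns(correction_field)


# Sync crosses two devices that hold it for 300 ns and 200 ns.
field = 0
field = add_residence_time(field, ingress_ns=10_000, egress_ns=10_300)
field = add_residence_time(field, ingress_ns=20_000, egress_ns=20_200)

# Cable delay 500 ns in total, residence 500 ns, Slave 100 ns ahead of the Master.
offset = corrected_offset_ns(t1=1000, t2=2100, path_delay_ns=500, correction_field=field)
```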

In the standard, clocks that PTP messages pass through, and that update the correction field as they do so, are called Transparent Clocks (TC). Let's look at the pictures to see how messages are transmitted in these two modes. The blue arrows indicate the Sync and FollowUp messages.

^{End-to-End mode}

^{Peer-to-Peer Mode}
We see that red arrows have appeared in P2P mode. These are the remaining messages that we have not yet considered, namely PDelayReq, PDelayResp and PDelayRespFollowUp. Here is the session of these messages:

Delivery time error

The standard describes how to implement the protocol on various types of networks. I used an Ethernet network and received messages at the Ethernet level. In such networks the packet delivery time changes constantly (this is especially noticeable when working with nanosecond precision). Various filters are applied to smooth these values.

What needs to be filtered:

Delivery time
∆t
Offset

My driver uses roughly the same filtering system as the PTPd Linux daemon, whose source code can be found here; there is also some information here. I will give only the scheme:

The LP IIR filter (low-pass filter with infinite impulse response) is described by the formula:

where s is the coefficient that sets the filter cutoff.
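As an illustration of such a filter, here is a minimal first-order sketch. It assumes the common form y[n] = y[n-1] + (x[n] - y[n-1]) / 2^s, which is not necessarily the exact formula used in PTPd or in my driver; the class name is hypothetical.

```python
class LowPassIIR:
    """First-order low-pass IIR filter; a larger s means stronger smoothing."""

    def __init__(self, s: int):
        self.weight = 2 ** s
        self.y = None  # no output until the first sample arrives

    def update(self, x: float) -> float:
        if self.y is None:
            # Start from the first measurement instead of from zero.
            self.y = float(x)
        else:
            # Move the output a fraction 1 / 2**s of the way toward the new sample.
            self.y += (x - self.y) / self.weight
        return self.y


# Delivery time jumps from 0 to 8 ns; with s = 1 the output follows gradually.
delay_filter = LowPassIIR(s=1)
outputs = [delay_filter.update(x) for x in (0, 8, 8, 8)]
```

Using a power of two for the weight means an integer implementation can replace the division with a shift, which is convenient in a driver.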

Adjustment calculation

Let us turn to the adjustment, that is, to the delta that should be added to the value of a second. The calculation scheme used in my system:

I used a Kalman filter to remove the strong jitter caused by network interference; I really liked this article. In general you can use any filter you like, as long as it smooths the curve. In PTPd, for example, the filtering is simpler: the average of the current and previous values is calculated. The graph shows the results of the Kalman filter in my driver (the adjustment error is shown, expressed in sub-nanoseconds on a 25 MHz chip):
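For a single noisy value the Kalman filter reduces to a few lines. The sketch below is a generic scalar version with assumed noise parameters, shown next to the two-point average mentioned above; it is not the code from my driver, and the names are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter for a slowly changing value."""

    def __init__(self, process_var: float, measurement_var: float):
        self.q = process_var       # how much the true value may change per step
        self.r = measurement_var   # how noisy each measurement is
        self.x = None              # current estimate
        self.p = 0.0               # variance of the estimate

    def update(self, z: float) -> float:
        if self.x is None:
            # Initialize from the first measurement.
            self.x = float(z)
            self.p = self.r
            return self.x
        p_pred = self.p + self.q            # predict: uncertainty grows
        gain = p_pred / (p_pred + self.r)   # weight given to the new measurement
        self.x += gain * (z - self.x)       # correct the estimate
        self.p = (1.0 - gain) * p_pred
        return self.x


def two_point_average(previous: float, current: float) -> float:
    """The simpler alternative: average of the current and previous values."""
    return (previous + current) / 2


kalman = ScalarKalman(process_var=0.0, measurement_var=1.0)
first = kalman.update(0.0)
second = kalman.update(2.0)   # gain is 0.5 here, so the estimate moves halfway
```

A small `process_var` relative to `measurement_var` gives heavy smoothing; raising it makes the estimate follow real changes faster at the cost of passing more jitter through.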

Next we regulate the adjustment itself: it should converge to a constant, and a PI controller is used for this. PTPd uses the controller to correct the clock offset, whereas I use it to tune the adjustment value (a feature of the KSZ8463MLI). We can see that the controller is not perfectly tuned, but in my case this adjustment is sufficient:
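The behaviour of a PI controller in this role can be shown on a toy model. The gains, the drift value and the one-update-per-step clock model below are assumptions made for the example, not the settings of my driver.

```python
class PIController:
    """Proportional-integral controller: output = kp * error + accumulated ki * error."""

    def __init__(self, kp: float, ki: float):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def update(self, error: float) -> float:
        self.integral += self.ki * error
        return self.kp * error + self.integral


def simulate(drift_ns: float, initial_offset_ns: float, steps: int):
    """Toy clock: every step it gains drift_ns and loses the applied adjustment."""
    controller = PIController(kp=0.5, ki=0.25)
    offset = initial_offset_ns
    adjustment = 0.0
    for _ in range(steps):
        adjustment = controller.update(offset)
        offset = offset + drift_ns - adjustment
    return offset, adjustment


# Slave gains 100 ns per step and starts 1000 ns off.
final_offset, final_adjustment = simulate(drift_ns=100.0, initial_offset_ns=1000.0, steps=200)
```

In this model the offset goes to zero while the adjustment settles at the drift value, which is the constant the controller is expected to find. The integral term is what holds that constant once the error itself has vanished.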

Work result

The result is shown in the graph. The clock offset stays within -50 ns to 50 ns. Thus I achieved the accuracy described in numerous articles. Of course, many minor implementation details remained behind the scenes, but the necessary minimum has been shown.

Source: https://habr.com/ru/post/163253/
