Understanding the NTP Protocol
https://www.dazhuanlan.com/yukkiai/topics/1124557
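NTP is used to calibrate server clocks. This article explains how it works and describes the protocol format in detail.
The clock synchronization process
- A sends an NTP message to B; the message carries the send timestamp T1.
- When B receives the message, it writes the receive time T2 into the message body.
- When B sends the NTP response back to A, it also writes the send time T3 into the message body.
- A receives the response at time T4.
Then:
Round-trip delay: (T4 - T1) - (T3 - T2)
Clock offset: ((T2 - T1) + (T3 - T4)) / 2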
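The two formulas above can be derived in a few lines. The symbols below are assumptions introduced only for this sketch: \theta is B's clock minus A's clock, d_1 is the one-way delay from A to B, and d_2 is the one-way delay from B to A.

```latex
\begin{aligned}
T_2 &= T_1 + \theta + d_1 \\
T_4 &= T_3 - \theta + d_2 \\
(T_4 - T_1) - (T_3 - T_2) &= d_1 + d_2 \\
\frac{(T_2 - T_1) + (T_3 - T_4)}{2} &= \theta + \frac{d_1 - d_2}{2}
\end{aligned}
```

So the first expression is the total network delay with B's processing time removed, and the second equals the true offset only when the two directions are equally slow (d_1 = d_2). Any asymmetry shows up as an error of half the difference.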
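NTP request and response messages share exactly the same format and are carried over UDP. By default an NTP server listens on port 123.
Below, chronyd (the default NTP software on CentOS) is listening on port 123: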
$ lsof -i:123
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 2439 chrony 3u IPv4 23270 0t0 UDP *:ntp
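NTP message format
An NTP message consists of a header, extension fields, and an optional authentication code (MAC). In practice, messages usually carry only the header. The header layout is as follows: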
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|LI | VN  |Mode |    Stratum    |     Poll      |   Precision   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Root Delay                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Root Dispersion                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Reference ID                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   Reference Timestamp (64)                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     Origin Timestamp (64)                     +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                    Receive Timestamp (64)                     +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                    Transmit Timestamp (64)                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The header uses two time formats. Timestamps count seconds since 00:00 UTC on 1 January 1900; the short format uses the same units to express intervals such as Root Delay and Root Dispersion.
- NTP Timestamp Format
8 bytes: the first 32 bits hold whole seconds and the last 32 bits hold the fraction, in units of 1/2^32 second.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Seconds                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Fraction                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- NTP Short Format
4 bytes: the first 16 bits hold whole seconds and the last 16 bits hold the fraction, in units of 1/2^16 second.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Seconds            |           Fraction            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
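As a minimal sketch of the two formats, the Python below converts each to ordinary units. The function names and the `NTP_UNIX_DELTA` constant name are illustrative choices, not part of any NTP library, and the sketch assumes the timestamp belongs to the current NTP era (before the 32-bit seconds field wraps in 2036).

```python
from datetime import datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2_208_988_800


def ntp_timestamp_to_datetime(seconds: int, fraction: int) -> datetime:
    """Convert the two 32-bit words of an NTP timestamp to a UTC datetime."""
    unix_seconds = seconds - NTP_UNIX_DELTA + fraction / 2**32
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def ntp_short_to_seconds(value: int) -> float:
    """Convert a 32-bit NTP short-format value to seconds."""
    return (value >> 16) + (value & 0xFFFF) / 2**16


if __name__ == "__main__":
    # 3726119958 is the seconds part of the transmit timestamp in the
    # tcpdump capture later in this article.
    print(ntp_timestamp_to_datetime(3726119958, 0))  # 2018-01-28 09:19:18+00:00
    print(ntp_short_to_seconds(0x00010000))          # 1.0
```

The printed UTC time is 09:19:18, which matches the 17:19:18 shown by tcpdump on a machine running at UTC+8.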
- Header fields explained:
LI Leap Indicator (leap): 2 bits, warning of an impending leap second or of a clock that is not synchronized to its upstream source. The values and their meanings are:
+-------+----------------------------------------+
| Value | Meaning                                |
+-------+----------------------------------------+
| 0     | no warning                             |
| 1     | last minute of the day has 61 seconds  |
| 2     | last minute of the day has 59 seconds  |
| 3     | unknown (clock unsynchronized)         |
+-------+----------------------------------------+
VN Version Number: 3 bits, the NTP version.
Mode (mode): 3 bits, the operating mode. Modes 3 and 4 are the ones usually seen; they correspond to client/server operation.
+-------+--------------------------+
| Value | Meaning                  |
+-------+--------------------------+
| 0     | reserved                 |
| 1     | symmetric active         |
| 2     | symmetric passive        |
| 3     | client                   |
| 4     | server                   |
| 5     | broadcast                |
| 6     | NTP control message      |
| 7     | reserved for private use |
+-------+--------------------------+
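Stratum (stratum): 8 bits giving the stratum. This field is normally filled in by the server.
The top level is assigned the number 0. A server synchronized to a stratum n source runs at stratum n + 1. Stratum 0 devices are high-precision timekeeping devices such as atomic clocks (cesium, rubidium), GPS clocks, or other radio clocks. They generate a very accurate pulse-per-second signal that triggers an interrupt and a timestamp on the attached computer. Stratum 0 devices are also called reference clocks. Stratum 1 servers are attached directly to stratum 0 devices and are also called primary time servers.
The values seen in the packet mean:
+--------+-----------------------------------------------------+
| Value  | Meaning                                             |
+--------+-----------------------------------------------------+
| 0      | unspecified or invalid                              |
| 1      | primary server (e.g., equipped with a GPS receiver) |
| 2-15   | secondary server (via NTP)                          |
| 16     | unsynchronized                                      |
| 17-255 | reserved                                            |
+--------+-----------------------------------------------------+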
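Poll: 8-bit signed integer giving the maximum interval between successive messages, in log2 seconds. A value of 4 means 16 (2 to the power 4) seconds.
Precision: 8-bit signed integer giving the precision of the system clock, in log2 seconds. A value of -18 corresponds to a few microseconds (2^-18 s is about 3.8 microseconds).
Root Delay: total round-trip delay to the primary server, in NTP Short Format.
Root Dispersion: total dispersion (accumulated error bound) to the primary server, in NTP Short Format.
Reference ID: 32 bits identifying the server's reference clock, that is, its upstream time source. At stratum 1 the upstream source is a device such as an atomic clock, which has no IP address, so the field holds an ASCII string. From stratum 2 onward it holds the upstream server's IP address.
Reference Timestamp: the time at which the server's own system clock was last set or corrected. It is typically updated once per poll.
Origin Timestamp: the time at the client when the request was sent.
Receive Timestamp: the time at the server when the request arrived.
Transmit Timestamp: the time at the server when the response was sent.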
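The field descriptions above map directly onto a fixed 48-byte layout, so a header can be decoded with Python's `struct` module. This is a sketch under stated assumptions: the function name `parse_ntp_header` and the dictionary keys are made up for illustration, extension fields and the MAC are ignored, and the Reference ID is treated as an IPv4 address for stratum 2 and above.

```python
import socket
import struct


def parse_ntp_header(packet: bytes) -> dict:
    """Decode the fixed 48-byte NTP header into a dictionary."""
    if len(packet) < 48:
        raise ValueError("an NTP header is 48 bytes long")
    # B: LI/VN/Mode, B: stratum, b: poll, b: precision (both signed),
    # I: root delay, I: root dispersion, 4s: reference ID, Q x4: timestamps.
    (first, stratum, poll, precision, root_delay, root_dispersion,
     ref_id, ref_ts, org_ts, rec_ts, xmt_ts) = struct.unpack(
        "!BBbbII4sQQQQ", packet[:48])
    if stratum <= 1:
        # Stratum 0/1: up to four ASCII characters, padded with NUL bytes.
        reference = ref_id.rstrip(b"\x00").decode("ascii", errors="replace")
    else:
        # Stratum 2+: assumed here to be the upstream server's IPv4 address.
        reference = socket.inet_ntoa(ref_id)
    return {
        "leap": first >> 6,               # top 2 bits
        "version": (first >> 3) & 0x7,    # next 3 bits
        "mode": first & 0x7,              # low 3 bits
        "stratum": stratum,
        "poll_seconds": 2.0 ** poll,
        "precision_seconds": 2.0 ** precision,
        "root_delay": root_delay / 2**16,            # NTP short format
        "root_dispersion": root_dispersion / 2**16,  # NTP short format
        "reference_id": reference,
        "reference_ts": ref_ts / 2**32,   # seconds since 1900, as a float
        "origin_ts": org_ts / 2**32,
        "receive_ts": rec_ts / 2**32,
        "transmit_ts": xmt_ts / 2**32,
    }


if __name__ == "__main__":
    # A hand-built reply: LI 0, version 4, mode 4 (0x24), stratum 3, poll 3.
    sample = struct.pack("!BBbbII4sQQQQ", 0x24, 3, 3, -24, 350, 138,
                         socket.inet_aton("182.92.12.11"), 0, 0, 0, 0)
    header = parse_ntp_header(sample)
    print(header["mode"], header["stratum"], header["reference_id"])
```

Converting the 64-bit timestamps to floats loses some of the 32-bit fraction; code that needs full precision would keep the integer seconds and fraction separately.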
Packet capture example
tcpdump -i eth0 port 123 -nnv
This captures NTP packets in detail. Sample output:
17:19:18.566860 IP (tos 0x0, ttl 45, id 33978, offset 0, flags [none], proto UDP (17), length 76)
1.80.235.52.31681 > 192.168.1.247.123: NTPv4, length 48
Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 3 (8s), precision -6
Root Delay: 1.000000, Root dispersion: 1.000000, Reference-ID: (unspec)
Reference Timestamp: 0.000000000
Originator Timestamp: 0.000000000
Receive Timestamp: 0.000000000
Transmit Timestamp: 3726119958.485983999 (2018/01/28 17:19:18)
Originator - Receive Timestamp: 0.000000000
Originator - Transmit Timestamp: 3726119958.485983999 (2018/01/28 17:19:18)
17:19:18.566899 IP (tos 0x0, ttl 64, id 38707, offset 0, flags [DF], proto UDP (17), length 76)
192.168.1.247.123 > 1.80.235.52.31681: NTPv4, length 48
Server, Leap indicator: (0), Stratum 3 (secondary reference), poll 3 (8s), precision -24
Root Delay: 0.005340, Root dispersion: 0.002105, Reference-ID: 182.92.12.11
Reference Timestamp: 3726119511.971055101 (2018/01/28 17:11:51)
Originator Timestamp: 3726119958.485983999 (2018/01/28 17:19:18)
Receive Timestamp: 3726119958.566745462 (2018/01/28 17:19:18)
Transmit Timestamp: 3726119958.566777029 (2018/01/28 17:19:18)
Originator - Receive Timestamp: +0.080761462
Originator - Transmit Timestamp: +0.080793029
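The capture above was taken on the server side, so the client's T4 is not visible and the full delay and offset cannot be computed from it. What can be read off is the server's turnaround time, T3 - T2. The sketch below uses `Decimal` because a 64-bit float cannot hold these values to the nanosecond; the function name is illustrative.

```python
from decimal import Decimal


def server_turnaround(receive: str, transmit: str) -> Decimal:
    """Return T3 - T2 in seconds from the decimal timestamps tcpdump prints."""
    return Decimal(transmit) - Decimal(receive)


if __name__ == "__main__":
    # Receive and Transmit timestamps of the server reply shown above.
    print(server_turnaround("3726119958.566745462", "3726119958.566777029"))
    # 0.000031567
```

So this server spent roughly 32 microseconds between receiving the request and sending the reply.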
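Using related software
chronyd is the NTP software introduced in CentOS 7, replacing the older ntpd.
The following command checks the current synchronization status with the upstream NTP servers: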
$ chronyc sources -v
210 Number of sources = 2
.-- Source mode '^' = server, '=' = peer, '#' = local clock.
/ .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| / '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
|| .- xxxx [ yyyy ] +/- zzzz
|| Reachability register (octal) -. | xxxx = adjusted offset,
|| Log2(Polling interval) --. | | yyyy = measured offset,
|| | | zzzz = estimated error.
|| | |
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* time5.aliyun.com 2 10 377 733 +165us[ +206us] +/- 5534us
^- 120.25.115.19 2 10 377 633 -1155us[-1155us] +/- 65ms
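Checking NTP status when using the ntpd package
when: time since the last packet was received from this server, in seconds unless a unit suffix is shown
poll: interval between polls, in seconds
reach: reachability register shown in octal; it records the outcome of the last eight polls (377 means all eight succeeded)
delay: round-trip time (RTT) to the server, in milliseconds
offset: clock offset relative to the server, in milliseconds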
ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
+time5.aliyun.co 10.137.38.86 2 u 11 64 1 25.353 -113.04 65.276
*203.107.6.88 10.137.55.181 2 u 9 64 1 45.441 -148.44 89.070
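The reach column is easy to misread because it is an 8-bit shift register printed in octal, not a counter. A small sketch of decoding it follows; the function name `decode_reach` is hypothetical.

```python
def decode_reach(reach_octal: str) -> list:
    """Return the last eight poll outcomes, most recent first.

    Each poll shifts the register left by one bit; a successful poll
    sets the lowest bit, so bit 0 is the most recent attempt.
    """
    register = int(reach_octal, 8) & 0xFF
    return [bool((register >> i) & 1) for i in range(8)]


if __name__ == "__main__":
    print(decode_reach("377"))  # eight True values: all recent polls answered
    print(decode_reach("1"))    # only the most recent poll answered
```

In the ntpq output above both servers show reach 1, which suggests ntpd had only just started and had completed a single successful poll of each.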
ntpdate -q XX.XX.XX.XX
Queries the time offset from the upstream NTP server without updating the local clock. The -d option turns on debug mode.
$ ntpdate -q ntp3.aliyun.com
server 203.107.6.88, stratum 2, offset 0.797972, delay 0.06859
11 Feb 23:02:36 ntpdate[21763]: step time server 203.107.6.88 offset 0.797972 sec
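NTP configuration best practices:
Assuming the NTP server is ntp1.aliyun.com, configure ntpd as follows:
# Upstream server
server ntp1.aliyun.com iburst
restrict ntp1.aliyun.com nomodify notrap nopeer noquery
# "default" matches every IP address; start by denying modification, traps, peering, and queries to everyone
restrict default kod nomodify notrap nopeer noquery
# The same for IPv6
restrict -6 default kod nomodify notrap nopeer noquery
# Let internal addresses query the server but not modify the time
restrict xx.xx.xx.xx mask xx.xx.xx.xx nomodify notrap nopeer
man ntp_acc: describes the restrict options
man ntp_clock: describes the server options
References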
- NTP4 RFC: https://tools.ietf.org/html/rfc5905
- NTP Best Practice: https://tools.ietf.org/id/draft-reilly-ntp-bcp-01.html
===========
https://zhuanlan.zhihu.com/p/106069365
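NTP (Network Time Protocol) is a time synchronization protocol defined by RFC 1305 (NTPv3; the current NTPv4 is specified in RFC 5905). It synchronizes time between distributed time servers and clients. NTP is carried in UDP datagrams and uses UDP port 123.
The purpose of NTP is to synchronize the clocks of all clock-equipped devices in a network so that they agree, which lets the devices offer applications that depend on a common time.
A local system running NTP can be synchronized by other time sources, can act as a time source for other clocks, and can synchronize mutually with other devices.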
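How NTP works
The basic working principle of NTP is illustrated by the following example. Device A and Device B are connected through a network. Each has its own independent system clock, and the two clocks are to be synchronized automatically through NTP. To keep the example simple, assume the following:
- Before synchronization, Device A's clock is set to 10:00:00 am and Device B's clock is set to 11:00:00 am.
- Device B acts as the NTP time server, so Device A will synchronize its clock to Device B's.
- An NTP message takes 1 second to travel one way between Device A and Device B.
The clocks are synchronized as follows:
- Device A sends an NTP message to Device B. The message carries the timestamp at which it left Device A: 10:00:00 am (T1).
- When the message reaches Device B, Device B adds its own timestamp: 11:00:01 am (T2).
- When the message leaves Device B, Device B adds another timestamp: 11:00:02 am (T3).
- When Device A receives the response, its local time is 10:00:03 am (T4).
Device A now has enough information to compute two important parameters:
- Round-trip delay of the NTP message: Delay = (T4 - T1) - (T3 - T2) = 2 seconds.
- Offset of Device A relative to Device B: offset = ((T2 - T1) + (T3 - T4)) / 2 = 1 hour.
With these values, Device A can set its own clock to match Device B's.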
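The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the function name is illustrative and the calendar date is arbitrary, since only the times of day matter.

```python
from datetime import datetime, timedelta


def delay_and_offset(t1, t2, t3, t4):
    """Apply the two NTP formulas to four datetime timestamps."""
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return delay, offset


if __name__ == "__main__":
    day = datetime(2020, 1, 1)  # arbitrary date, assumed for illustration
    t1 = day.replace(hour=10, minute=0, second=0)  # leaves Device A
    t2 = day.replace(hour=11, minute=0, second=1)  # arrives at Device B
    t3 = day.replace(hour=11, minute=0, second=2)  # leaves Device B
    t4 = day.replace(hour=10, minute=0, second=3)  # arrives at Device A
    delay, offset = delay_and_offset(t1, t2, t3, t4)
    print(delay)   # 0:00:02
    print(offset)  # 1:00:00
```

Note that T2 - T1 is 1:00:01 and T3 - T4 is 0:59:59: the one-second network delay appears with opposite signs in the two terms and cancels when they are averaged.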
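NTP message format
NTP has two types of messages: clock synchronization messages and control messages. Control messages are used only where network management is needed; they are not required for clock synchronization and are not covered here.
The main fields are explained below:
- LI (Leap Indicator): 2 bits. The value binary 11 signals an alarm condition: the clock is not synchronized. According to this source, NTP itself takes no action on the other values.
- VN (Version Number): 3 bits, the NTP version number. This description follows version 3 (RFC 1305); the current version is 4 (RFC 5905).
- Mode: 3 bits, the NTP operating mode. The values mean: 0 undefined; 1 active peer (symmetric active); 2 passive peer (symmetric passive); 3 client; 4 server; 5 broadcast or multicast; 6 NTP control message; 7 reserved for internal use.
- Stratum: the stratum of the system clock, ranging from 1 to 16; it defines the clock's accuracy. A stratum 1 clock is the most accurate, and accuracy decreases from 1 to 16. A stratum 16 clock is unsynchronized and cannot serve as a reference clock.
- Poll: the poll interval, that is, the time between two successive NTP messages.
- Precision: the precision of the system clock.
- Root Delay: the round-trip time from the local system to the primary reference source.
- Root Dispersion: the maximum error of the system clock relative to the primary reference source.
- Reference Identifier: the identifier of the reference source.
- Reference Timestamp: the time at which the system clock was last set or updated.
- Originate Timestamp: the sender's local time when the NTP request left the sender.
- Receive Timestamp: the receiver's local time when the NTP request arrived.
- Transmit Timestamp: the responder's local time when the reply left the responder.
- Authenticator: authentication information.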
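NTP operating modes
A device can synchronize time using several NTP operating modes:
- Client/server mode
- Peer mode
- Broadcast mode
- Multicast mode
Choose whichever mode fits your needs. When the IP address of the server or peer is unknown, or when many devices in the network need synchronizing, broadcast or multicast mode can be used. In client/server and peer modes, a device synchronizes to a specified server or peer, which makes the clock more reliable.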
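1. Client/server mode
In client/server mode, the client sends a clock synchronization message to the server with the Mode field set to 3 (client). On receiving it, the server automatically operates in server mode and sends a reply with the Mode field set to 4 (server). After receiving the reply, the client runs clock filtering and selection and synchronizes to the preferred server.
In this mode the client can synchronize to the server, but the server cannot synchronize to the client.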
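To make the client side of this exchange concrete, here is a minimal sketch of building a Mode 3 request. The function names are illustrative, the request fills in only the first byte and the transmit timestamp (as simple SNTP-style clients commonly do), and `query` needs network access to a reachable NTP server, so it is defined but not called here.

```python
import socket
import struct
import time

NTP_UNIX_DELTA = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01


def build_client_request(now_unix: float, version: int = 4) -> bytes:
    """Build a 48-byte Mode 3 request carrying only a transmit timestamp."""
    first = (0 << 6) | (version << 3) | 3  # LI = 0, VN = version, Mode = 3
    seconds = (int(now_unix) + NTP_UNIX_DELTA) & 0xFFFFFFFF
    fraction = int((now_unix % 1) * 2**32)
    # Bytes 1..39 (stratum through receive timestamp) stay zero in this sketch.
    return struct.pack("!B39xII", first, seconds, fraction)


def query(server: str, timeout: float = 2.0) -> bytes:
    """Send one request to UDP port 123 and return the raw reply bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_client_request(time.time()), (server, 123))
        reply, _ = sock.recvfrom(512)
    return reply


if __name__ == "__main__":
    packet = build_client_request(time.time())
    print(len(packet), hex(packet[0]))  # 48 0x23
```

A well-behaved server copies the request's transmit timestamp into the Origin Timestamp of its Mode 4 reply, which is how the client matches replies to requests.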
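2. Peer mode
In peer mode, the active and passive peers first exchange NTP messages with the Mode field set to 3 (client) and 4 (server). The active peer then sends a clock synchronization message with the Mode field set to 1 (active peer). On receiving it, the passive peer automatically operates in passive peer mode and replies with the Mode field set to 2 (passive peer). After this exchange the peer relationship is established, and the two peers can synchronize each other. If both clocks are already synchronized, the clock with the lower stratum takes precedence.
3. Broadcast mode
In broadcast mode, the server periodically sends clock synchronization messages to the broadcast address 255.255.255.255 with the Mode field set to 5 (broadcast). Clients listen for these broadcasts. On receiving the first broadcast message, a client exchanges Mode 3 (client) and Mode 4 (server) messages with the server to measure the network delay between them. The client then enters broadcast client mode, keeps listening for broadcasts, and synchronizes its system clock from the messages that arrive.
4. Multicast mode
In multicast mode, the server periodically sends clock synchronization messages to the configured multicast address (or to the default NTP multicast address 224.0.1.1 if none is configured) with the Mode field set to 5 (multicast). Clients listen for these multicast messages. On receiving the first one, a client exchanges Mode 3 (client) and Mode 4 (server) messages with the server to measure the network delay between them. The client then enters multicast client mode, keeps listening for multicast messages, and synchronizes its system clock from the messages that arrive.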
===========
https://weberblog.net/packet-capture-network-time-protocol-ntp/
What’s the first step in a networker’s life if he wants to work with an unknown protocol: he captures and wiresharks it. ;) Following is a downloadable pcap in which I am showing the most common NTP packets such as basic client-server messages, as well as control and authenticated packets. I am also showing how to analyze the delta time with Wireshark, that is: how long an NTP server needs to respond to a request.
As always in my “packet capture” blogposts you are invited to download the following pcap (zipped, 16 KB) and to open it with Wireshark to have a look at it by yourself:
This file consists of many different NTP packet types. Hence I am using display filters within Wireshark to have a look at specific scenarios. The standard UDP destination port for NTP is 123, while the source port *might* be 123 as well.
Have a look at the current NTPv4 RFC 5905 “Network Time Protocol Version 4: Protocol and Algorithms Specification” in order to understand the packets and protocol details. Looking on the wire you should understand the packet header (section 7.3 in the RFC). Note that I am NOT explaining the NTP algorithm at all, but only the packets and their fields that are present on the network. The most important fields are:
- leap indicator: “2-bit integer warning of an impending leap second to be inserted or deleted in the last minute of the current month […].”
- version: “3-bit integer representing the NTP version number, currently 4.”
- mode: The most common modes are client (3) and server (4). This is the basic client-server unicast request which you’ll see all over your network. Other modes are “symmetric active” (1) between NTP peers and “NTP control message” (6) for controlling/polling NTP servers.
- stratum: The stratum value gives the distance to the reference clock. While the reference clock (if one is used) internally has a stratum value of 0, the NTP server that syncs to that clock has a stratum value of 1. That is: When a server replies with stratum 1, it is directly connected to a reference clock. An NTP server that receives its time from a stratum 1 server increases the value by 1, that is: 2. ;) You won’t see values greater than 4 on the Internet that often. Supported are values up to 15, while 16 means unsynchronized.
- reference ID: “32-bit code identifying the particular server or reference clock.” For stratum 1 servers this is an ASCII string telling you the reference clock such as GPS, PPS or DCFa/DCFp. Above stratum 1 this is either the IPv4 address of the reference NTP server or for IPv6 “it is the first four octets of the MD5 hash of the IPv6 address.” <- D’oh! This looks quite strange. ;( Since I am merely using IPv6 for this NTP blog post series you’ll always see these curious-looking “refid” values in the ntpq output.
- transmit timestamp: “Time at the server when the response left for the client.” This is the most interesting timestamp in those NTP packets since it shows the time the NTP client/server had as it sent the NTP packet. If you roughly want to know the time by looking at an NTP packet, look at this transmit timestamp.
- key ID & MAC: Only present when you’re using NTP authentication. The key ID is the number of the key while the MAC is the message digest (currently MD5 or SHA-1, not to be confused with the Ethernet MAC address).
These variables are seen on the wire for NTP packets. Note that on any NTP server or client you have a couple of columns that are listed in many documentation and are NOT part of the packets but of calculations by the NTP algorithms. Those are when, poll, reach, delay, offset, and jitter. Have a look at the blogpost from Aaron Toponce “Real Life NTP” in which he describes these columns of ntpq (among other things). Or, of course, at the official ntpq documentation.
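Coming back to the reference ID field listed above, the curious-looking refid for an IPv6 upstream can be reproduced in a few lines. This is a sketch under an assumption: that the MD5 hash is taken over the 16-byte binary form of the address. The function name is hypothetical.

```python
import hashlib
import ipaddress


def refid_for_upstream(address: str) -> str:
    """Return the refid, in dotted-quad form, for an upstream server address."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # IPv4 upstream: the refid is simply the address itself.
        return str(ip)
    # IPv6 upstream: first four octets of the MD5 hash of the address.
    # Assumption: the hash input is the packed 16-byte address.
    digest = hashlib.md5(ip.packed).digest()
    return str(ipaddress.IPv4Address(digest[:4]))


if __name__ == "__main__":
    print(refid_for_upstream("192.0.2.1"))
    print(refid_for_upstream("2001:db8::1"))
```

The second result looks like an IPv4 address but is not one, which is why the refid column becomes hard to interpret once IPv6 servers are involved.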
Basic Client-Server
In my pcap, udp.stream eq 21 shows a basic client to server communication. An NTP client asks a server for the time. In the answer of the server, you can see its stratum (1) and reference clock (DCFa). Normally an NTP communication is ongoing over the lifetime of the ntp service running; it queries the server at the “poll” interval. You can see this behaviour in udp.stream eq 2 where my NTP server asks (as a client) another NTP server on the Internet. The polling interval, in this case, was 64 seconds, the stratum of the server was 2, while the reference ID shows the IPv4 address (or the first bytes of the MD5 hash of the IPv6 address) of the reference from the queried NTP server.
Symmetric Active
When you’re running multiple NTP servers connected as “peers” rather than “server” (refer to the ntp.conf manpage) in order to sync their clocks against each other, you’ll see symmetric active (mode 1) packets on the wire. udp.stream eq 1 shows the peering between two of my stratum 1 NTP servers.
Control
You can send control packets to NTP servers for setting and getting specific information. I am using queries via ntpq from my monitoring server to poll some stats from the NTP servers. (Details are covered in an upcoming blog post.) An example is udp.stream eq 15 in which my monitoring server polled the peers from the NTP server via “ntpq -p ntp1.weberlab.de”. All active connections were sent back to this monitoring server, one by one. Hence a couple of NTP packets within a few milliseconds.
Authentication: MD5, SHA-1, & NAK
For NTP authentication there are two extension fields added to the packets: the key ID and the message authentication code MAC. (I am covering NTP authentication in a couple of other posts in detail as well.) Depending on the authentication method, MD5 or SHA-1, the length of the MAC differs. udp.stream eq 33 shows an MD5 authentication, udp.stream eq 9 a SHA-1, and udp.stream eq 0 a failure in the authentication, namely a crypto-NAK. Refer to RFC 7822 (Network Time Protocol Version 4 (NTPv4) Extension Fields): “If a MAC is used, it resides at the end of the packet. This field can be either 24 octets long, 20 octets long, or a 4-octet crypto-NAK.”
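The MAC lengths quoted from RFC 7822 give a quick way to guess what trails the 48-byte header in a capture. The sketch below assumes the packet carries no extension fields and that the quoted lengths include the 4-octet key ID; the function name and the returned labels are illustrative.

```python
HEADER_LEN = 48


def classify_authenticator(packet: bytes) -> str:
    """Guess the authenticator type from the bytes following the header."""
    if len(packet) < HEADER_LEN:
        raise ValueError("shorter than an NTP header")
    trailer = len(packet) - HEADER_LEN
    if trailer == 0:
        return "none"
    if trailer == 4:
        return "crypto-NAK"  # key ID only, no digest
    if trailer == 20:
        return "MD5 (4-octet key ID + 16-octet digest)"
    if trailer == 24:
        return "SHA-1 (4-octet key ID + 20-octet digest)"
    return "unknown (extension fields or another digest)"


if __name__ == "__main__":
    for size in (48, 52, 68, 72):
        print(size, classify_authenticator(bytes(size)))
```

These sizes correspond to the "length" value that tcpdump or Wireshark reports for the NTP payload, so 48 means an unauthenticated packet.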
NTP Delta Time
Due to my Wireshark bug report aka feature request “NTP Analysis: Delta time between Client-Server“, one of the core developers, Pascal Quantin, added the field ntp.delta_time in which Wireshark calculates the time between the client’s request and the corresponding server’s response (similar to the dns.time or http.time fields). You can see this calculated value in square brackets [as always for Wireshark-added fields]. Additionally, I have added a column in my Wireshark GUI to show these values, as you can see in this screenshot for udp.stream eq 2:
Furthermore, you can use the “IO Graphs” from Wireshark to display the ntp.delta_time for certain connections. In the following graph you can see the analysis of udp.stream eq 2 again, while the Y-axis shows the ntp.delta_time field. Since this particular NTP client sent an NTP request every 64 seconds, you can see those ticks in the graph, as well as one spike near 1040 seconds of the trace:
Yeah, that’s it for now. Have a look at your own network and verify the kinds of used NTP versions/servers/stratums/reference clocks/delta_times and so on. ;)
============= End