A friend recently reminded me of the existence of chrony, a "versatile implementation of the Network Time Protocol (NTP)". The excellent introduction is worth quoting in full:

It can synchronise the system clock with NTP servers, reference clocks (e.g. GPS receiver), and manual input using wristwatch and keyboard. It can also operate as an NTPv4 (RFC 5905) server and peer to provide a time service to other computers in the network.

It is designed to perform well in a wide range of conditions, including intermittent network connections, heavily congested networks, changing temperatures (ordinary computer clocks are sensitive to temperature), and systems that do not run continuosly, or run on a virtual machine.

Typical accuracy between two machines synchronised over the Internet is within a few milliseconds; on a LAN, accuracy is typically in tens of microseconds. With hardware timestamping, or a hardware reference clock, sub-microsecond accuracy may be possible.

Now that's already great documentation right there. What it is, why it's good, and what to expect from it. I want more. They have a very handy comparison table between chrony, ntp and openntpd.

  1. My problem with OpenNTPd
  2. Compared to chrony
  3. Extras
  4. Caveats
  5. Switching to chrony
  6. Update: Debian defaults

My problem with OpenNTPd

Following concerns surrounding the security (and complexity) of the venerable ntp program, I have, a long time ago, switched to using openntpd on all my computers. I hadn't thought about it until I recently noticed a lot of noise on one of my servers:

jan 18 10:09:49 curie ntpd[1069]: adjusting local clock by -1.604366s
jan 18 10:08:18 curie ntpd[1069]: adjusting local clock by -1.577608s
jan 18 10:05:02 curie ntpd[1069]: adjusting local clock by -1.574683s
jan 18 10:04:00 curie ntpd[1069]: adjusting local clock by -1.573240s
jan 18 10:02:26 curie ntpd[1069]: adjusting local clock by -1.569592s

You read that right, openntpd was constantly rewinding the clock, sometimes in less than two minutes. The above log was taken while doing diagnostics, looking at the last 30 minutes of logs. So, on average, one 1.5 seconds rewind per 6 minutes!

That might be due to a dying real time clock (RTC) or some other hardware problem. I know for a fact that the CMOS battery on that computer (curie) died and I wasn't able to replace it (!). So that's partly garbage-in, garbage-out here. But still, I was curious to see how chrony would behave... (Spoiler: much better.)

But I also had trouble on another workstation, that one a much more recent machine (angela). First, it seems OpenNTPd would just fail at boot time:

anarcat@angela:~(main)$ sudo systemctl status openntpd
● openntpd.service - OpenNTPd Network Time Protocol
     Loaded: loaded (/lib/systemd/system/openntpd.service; enabled; vendor pres>
     Active: inactive (dead) since Sun 2022-01-23 09:54:03 EST; 6h ago
       Docs: man:openntpd(8)
    Process: 3291 ExecStartPre=/usr/sbin/ntpd -n $DAEMON_OPTS (code=exited, sta>
    Process: 3294 ExecStart=/usr/sbin/ntpd $DAEMON_OPTS (code=exited, status=0/>
   Main PID: 3298 (code=exited, status=0/SUCCESS)
        CPU: 34ms

jan 23 09:54:03 angela systemd[1]: Starting OpenNTPd Network Time Protocol...
jan 23 09:54:03 angela ntpd[3291]: configuration OK
jan 23 09:54:03 angela ntpd[3297]: ntp engine ready
jan 23 09:54:03 angela ntpd[3297]: ntp: recvfrom: Permission denied
jan 23 09:54:03 angela ntpd[3294]: Terminating
jan 23 09:54:03 angela systemd[1]: Started OpenNTPd Network Time Protocol.
jan 23 09:54:03 angela systemd[1]: openntpd.service: Succeeded.

After a restart, somehow it worked, but it took a long time to sync the clock. At first, it would just not consider any peer at all:

anarcat@angela:~(main)$ sudo ntpctl -s all
0/20 peers valid, clock unsynced

peer
   wt tl st  next  poll          offset       delay      jitter
159.203.8.72 from pool 0.debian.pool.ntp.org
    1  5  2    6s    6s             ---- peer not valid ----
138.197.135.239 from pool 0.debian.pool.ntp.org
    1  5  2    6s    7s             ---- peer not valid ----
216.197.156.83 from pool 0.debian.pool.ntp.org
    1  4  1    2s    9s             ---- peer not valid ----
142.114.187.107 from pool 0.debian.pool.ntp.org
    1  5  2    5s    6s             ---- peer not valid ----
216.6.2.70 from pool 1.debian.pool.ntp.org
    1  4  2    2s    8s             ---- peer not valid ----
207.34.49.172 from pool 1.debian.pool.ntp.org
    1  4  2    0s    5s             ---- peer not valid ----
198.27.76.102 from pool 1.debian.pool.ntp.org
    1  5  2    5s    5s             ---- peer not valid ----
158.69.254.196 from pool 1.debian.pool.ntp.org
    1  4  3    1s    6s             ---- peer not valid ----
149.56.121.16 from pool 2.debian.pool.ntp.org
    1  4  2    5s    9s             ---- peer not valid ----
162.159.200.123 from pool 2.debian.pool.ntp.org
    1  4  3    1s    6s             ---- peer not valid ----
206.108.0.131 from pool 2.debian.pool.ntp.org
    1  4  1    6s    9s             ---- peer not valid ----
205.206.70.40 from pool 2.debian.pool.ntp.org
    1  5  2    8s    9s             ---- peer not valid ----
2001:678:8::123 from pool 2.debian.pool.ntp.org
    1  4  2    5s    9s             ---- peer not valid ----
2606:4700:f1::1 from pool 2.debian.pool.ntp.org
    1  4  3    2s    6s             ---- peer not valid ----
2607:5300:205:200::1991 from pool 2.debian.pool.ntp.org
    1  4  2    5s    9s             ---- peer not valid ----
2607:5300:201:3100::345c from pool 2.debian.pool.ntp.org
    1  4  4    1s    6s             ---- peer not valid ----
209.115.181.110 from pool 3.debian.pool.ntp.org
    1  5  2    5s    6s             ---- peer not valid ----
205.206.70.42 from pool 3.debian.pool.ntp.org
    1  4  2    0s    6s             ---- peer not valid ----
68.69.221.61 from pool 3.debian.pool.ntp.org
    1  4  1    2s    9s             ---- peer not valid ----
162.159.200.1 from pool 3.debian.pool.ntp.org
    1  4  3    4s    7s             ---- peer not valid ----

Then it would accept them, but still wouldn't sync the clock:

anarcat@angela:~(main)$ sudo ntpctl -s all
20/20 peers valid, clock unsynced

peer
   wt tl st  next  poll          offset       delay      jitter
159.203.8.72 from pool 0.debian.pool.ntp.org
    1  8  2    5s    6s         0.672ms    13.507ms     0.442ms
138.197.135.239 from pool 0.debian.pool.ntp.org
    1  7  2    4s    8s         1.260ms    13.388ms     0.494ms
216.197.156.83 from pool 0.debian.pool.ntp.org
    1  7  1    3s    5s        -0.390ms    47.641ms     1.537ms
142.114.187.107 from pool 0.debian.pool.ntp.org
    1  7  2    1s    6s        -0.573ms    15.012ms     1.845ms
216.6.2.70 from pool 1.debian.pool.ntp.org
    1  7  2    3s    8s        -0.178ms    21.691ms     1.807ms
207.34.49.172 from pool 1.debian.pool.ntp.org
    1  7  2    4s    8s        -5.742ms    70.040ms     1.656ms
198.27.76.102 from pool 1.debian.pool.ntp.org
    1  7  2    0s    7s         0.170ms    21.035ms     1.914ms
158.69.254.196 from pool 1.debian.pool.ntp.org
    1  7  3    5s    8s        -2.626ms    20.862ms     2.032ms
149.56.121.16 from pool 2.debian.pool.ntp.org
    1  7  2    6s    8s         0.123ms    20.758ms     2.248ms
162.159.200.123 from pool 2.debian.pool.ntp.org
    1  8  3    4s    5s         2.043ms    14.138ms     1.675ms
206.108.0.131 from pool 2.debian.pool.ntp.org
    1  6  1    0s    7s        -0.027ms    14.189ms     2.206ms
205.206.70.40 from pool 2.debian.pool.ntp.org
    1  7  2    1s    5s        -1.777ms    53.459ms     1.865ms
2001:678:8::123 from pool 2.debian.pool.ntp.org
    1  6  2    1s    8s         0.195ms    14.572ms     2.624ms
2606:4700:f1::1 from pool 2.debian.pool.ntp.org
    1  7  3    6s    9s         2.068ms    14.102ms     1.767ms
2607:5300:205:200::1991 from pool 2.debian.pool.ntp.org
    1  6  2    4s    9s         0.254ms    21.471ms     2.120ms
2607:5300:201:3100::345c from pool 2.debian.pool.ntp.org
    1  7  4    5s    9s        -1.706ms    21.030ms     1.849ms
209.115.181.110 from pool 3.debian.pool.ntp.org
    1  7  2    0s    7s         8.907ms    75.070ms     2.095ms
205.206.70.42 from pool 3.debian.pool.ntp.org
    1  7  2    6s    9s        -1.729ms    53.823ms     2.193ms
68.69.221.61 from pool 3.debian.pool.ntp.org
    1  7  1    1s    7s        -1.265ms    46.355ms     4.171ms
162.159.200.1 from pool 3.debian.pool.ntp.org
    1  7  3    4s    8s         1.732ms    35.792ms     2.228ms

It took a solid five minutes to sync the clock, even though the peers were considered valid within a few seconds:

jan 23 15:58:41 angela systemd[1]: Started OpenNTPd Network Time Protocol.
jan 23 15:58:58 angela ntpd[84086]: peer 142.114.187.107 now valid
jan 23 15:58:58 angela ntpd[84086]: peer 198.27.76.102 now valid
jan 23 15:58:58 angela ntpd[84086]: peer 207.34.49.172 now valid
jan 23 15:58:58 angela ntpd[84086]: peer 209.115.181.110 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 159.203.8.72 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 138.197.135.239 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 162.159.200.123 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 2607:5300:201:3100::345c now valid
jan 23 15:59:00 angela ntpd[84086]: peer 2606:4700:f1::1 now valid
jan 23 15:59:00 angela ntpd[84086]: peer 158.69.254.196 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 216.6.2.70 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 68.69.221.61 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 205.206.70.40 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 205.206.70.42 now valid
jan 23 15:59:02 angela ntpd[84086]: peer 162.159.200.1 now valid
jan 23 15:59:04 angela ntpd[84086]: peer 216.197.156.83 now valid
jan 23 15:59:05 angela ntpd[84086]: peer 206.108.0.131 now valid
jan 23 15:59:05 angela ntpd[84086]: peer 2001:678:8::123 now valid
jan 23 15:59:05 angela ntpd[84086]: peer 149.56.121.16 now valid
jan 23 15:59:07 angela ntpd[84086]: peer 2607:5300:205:200::1991 now valid
jan 23 16:03:47 angela ntpd[84086]: clock is now synced

That seems kind of odd. It was also frustrating to have very little information from ntpctl about the state of the daemon. I understand it's designed to be minimal, but it could inform me on his known offset, for example. It does tell me about the offset with the different peers, but not as clearly as one would expect. It's also unclear how it disciplines the RTC at all.

Compared to chrony

Now compare with chrony:

jan 23 16:07:16 angela systemd[1]: Starting chrony, an NTP client/server...
jan 23 16:07:16 angela chronyd[87765]: chronyd version 4.0 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)
jan 23 16:07:16 angela chronyd[87765]: Initial frequency 3.814 ppm
jan 23 16:07:16 angela chronyd[87765]: Using right/UTC timezone to obtain leap second data
jan 23 16:07:16 angela chronyd[87765]: Loaded seccomp filter
jan 23 16:07:16 angela systemd[1]: Started chrony, an NTP client/server.
jan 23 16:07:21 angela chronyd[87765]: Selected source 206.108.0.131 (2.debian.pool.ntp.org)
jan 23 16:07:21 angela chronyd[87765]: System clock TAI offset set to 37 seconds

First, you'll notice there's none of that "clock synced" nonsense, it picks a source, and then... it's just done. Because the clock on this computer is not drifting that much, and openntpd had (presumably) just sync'd it anyways. And indeed, if we look at detailed stats from the powerful chronyc client:

anarcat@angela:~(main)$ sudo chronyc tracking
Reference ID    : CE6C0083 (ntp1.torix.ca)
Stratum         : 2
Ref time (UTC)  : Sun Jan 23 21:07:21 2022
System time     : 0.000000311 seconds slow of NTP time
Last offset     : +0.000807989 seconds
RMS offset      : 0.000807989 seconds
Frequency       : 3.814 ppm fast
Residual freq   : -24.434 ppm
Skew            : 1000000.000 ppm
Root delay      : 0.013200894 seconds
Root dispersion : 65.357254028 seconds
Update interval : 1.4 seconds
Leap status     : Normal

We see that we are nanoseconds away from NTP time. That was ran very quickly after starting the server (literally in the same second as chrony picked a source), so stats are a bit weird (e.g. the Skew is huge). After a minute or two, it looks more reasonable:

Reference ID    : CE6C0083 (ntp1.torix.ca)
Stratum         : 2
Ref time (UTC)  : Sun Jan 23 21:09:32 2022
System time     : 0.000487002 seconds slow of NTP time
Last offset     : -0.000332960 seconds
RMS offset      : 0.000751204 seconds
Frequency       : 3.536 ppm fast
Residual freq   : +0.016 ppm
Skew            : 3.707 ppm
Root delay      : 0.013363549 seconds
Root dispersion : 0.000324015 seconds
Update interval : 65.0 seconds
Leap status     : Normal

Now it's learning how good or bad the RTC clock is ("Frequency"), and is smoothly adjusting the System time to follow the average offset (RMS offset, more or less). You'll also notice the Update interval has risen, and will keep expanding as chrony learns more about the internal clock, so it doesn't need to constantly poll the NTP servers to sync the clock. In the above, we're 487 micro seconds (less than a milisecond!) away from NTP time.

(People interested in the explanation of every single one of those fields can read the excellent chronyc manpage. That thing made me want to nerd out on NTP again!)

On the machine with the bad clock, chrony also did a 1.5 second adjustment, but just once, at startup:

jan 18 11:54:33 curie chronyd[2148399]: Selected source 206.108.0.133 (2.debian.pool.ntp.org) 
jan 18 11:54:33 curie chronyd[2148399]: System clock wrong by -1.606546 seconds 
jan 18 11:54:31 curie chronyd[2148399]: System clock was stepped by -1.606546 seconds 
jan 18 11:54:31 curie chronyd[2148399]: System clock TAI offset set to 37 seconds 

Then it would still struggle to keep the clock in sync, but not as badly as openntpd. Here's the offset a few minutes after that above startup:

System time     : 0.000375352 seconds slow of NTP time

And again a few seconds later:

System time     : 0.001793046 seconds slow of NTP time

I don't currently have access to that machine, and will update this post with the latest status, but so far I've had a very good experience with chrony on that machine, which is a testament to its resilience, and it also just works on my other machines as well.

Extras

On top of "just working" (as demonstrated above), I feel that chrony's feature set is so much superior... Here's an excerpt of the extras in chrony, taken from the comparison table:

I even understand some of that stuff. I think.

So kudos to the chrony folks, I'm switching.

Caveats

One thing to keep in mind in the above, however is that it's quite possible chrony does as bad of a job as openntpd on that old machine, and just doesn't tell me about it. For example, here's another log sample from another server (marcos):

jan 23 11:13:25 marcos ntpd[1976694]: adjusting clock frequency by 0.451035 to -16.420273ppm

I get those basically every day, which seems to show that it's at least trying to keep track of the hardware clock.

In other words, it's quite possible I have no idea what I'm talking about and you definitely need to take this article with a grain of salt. I'm not an NTP expert.

Update: I should also mentioned that I haven't evaluated systemd-timesyncd, for a few reasons:

  1. I have enough things running under systemd
  2. I wasn't aware of it when I started writing this
  3. I couldn't find good documentation on it... later I found the above manpage and of course the Arch Wiki but that is very minimal
  4. therefore I can't tell how it compares with chrony or (open)ntpd, so I don't see an enticing reason to switch

It has a few things going for it though:

So I'm reserving judgement over it, but I'd certainly note that I'm always a little weary in trusting systemd daemons with the network, and would prefer to keep that attack surface to a minimum. Diversity is a good thing, in general, so I'll keep chrony for now.

It would certainly nice to see it added to chrony's comparison table.

Switching to chrony

Because the default configuration in chrony (at least as shipped in Debian) is sane (good default peers, no open network by default), installing it is as simple as:

apt install chrony

And because it somehow conflicts with openntpd, that also takes care of removing that cruft as well.

Update: Debian defaults

So it seems like I managed to write this entire blog post without putting it in relation with the original reason I had to think about this in the first place, which is odd and should be corrected.

This conversation came about on an IRC channel that mentioned that the ntp package (and upstream) is in bad shape in Debian. In that discussion, chrony and ntpsec were discussed as possible replacements, but when we had the discussion on chat, I mentioned I was using openntpd, and promptly realized I was actually unhappy with it. A friend suggested chrony, I tried it, and it worked amazingly, I switched, wrote this blog post, end of story.

Except today (2022-02-07, two weeks later), I actually read that thread and realized that something happened in Debian I wasn't actually aware of. In bookworm, systemd-timesyncd was not only shipped, but it was installed by default, as it was marked as a hard dependency of systemd. That was "fixed" in systemd-247.9-2 (see bug 986651), but only by making the dependency a Recommends and marking it as Priority: important.

So in effect, systemd-timesyncd became the default NTP daemon in Debian in bookworm, which I find somewhat surprising. timesyncd has many things going for it (as mentioned above), but I do find it a bit annoying that systemd is replacing all those utilities in such a way. I also wonder what is going to happen on upgrades. This is all a little frustrating too because there is no good comparison between the other NTP daemons and timesyncd anywhere. The chrony comparison table doesn't mention it, and an audit by the Core Infrastructure Initiative from 2017 doesn't mention it either, even though timesyncd was announced in 2014. (Same with this blog post from Facebook.)

Created . Edited .