It's quite interesting - this isn't ethernet as we know it. Instead of each NIC using its own free-running clock, all the physical layers are sync'ed to each other at layer 1. (note that gigabit ethernet, which is what it uses, sends data at all times - when idle it sends the idle symbol)
Haven't looked into this in depth but sub-nanosecond sync for systems up to 10km apart is interesting since 10km is about 33 light microseconds. There is some trickery going on.
In our lab tests phase lock jitter between WR client and master is about 10ps (picoseconds) over 50km fiber (with temperature change of the fiber, so WR actively compensates elongations), so relative clock of one system can be transmitted with about that accuracy to another.
P.S. There is a WR workshop this week, with some talks publicly available on CERN's Indico website.
Even though you're commenting on a White Rabbit post, it took me some time to understand that "WR" is White Rabbit, esp. since you spell out picoseconds in brackets.
It's totally possible to achieve synchronization better than light transmission time. For the purposes of synchronization, the speed of light delay, and any other delay are indistinguishable, and need not be distinguished.
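To make that concrete, here is a minimal sketch of the generic two-way timestamp exchange that PTP-style protocols use. It is not White Rabbit's actual implementation; the function names, the picosecond units, and the simulated numbers are all my own assumptions. The point is that a symmetric path delay cancels out of the offset estimate no matter how large it is, and only the asymmetry between the two directions is left over as error.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and one-way delay from four timestamps.

    t1: master sends   (master clock)
    t2: client receives (client clock)
    t3: client replies  (client clock)
    t4: master receives (master clock)
    """
    round_trip = (t4 - t1) - (t3 - t2)      # time spent on the wire, both ways
    one_way_delay = round_trip / 2          # assumes the path is symmetric
    offset = ((t2 - t1) - (t4 - t3)) / 2    # client clock minus master clock
    return offset, one_way_delay


def simulate(true_offset, delay_to_client, delay_to_master, turnaround=1_000_000):
    """Hypothetical exchange, all values in picoseconds."""
    t1 = 0
    t2 = t1 + delay_to_client + true_offset   # read on the client clock
    t3 = t2 + turnaround                      # client replies a bit later
    t4 = (t3 - true_offset) + delay_to_master # read on the master clock
    return offset_and_delay(t1, t2, t3, t4)


# Roughly 10 km of fiber is ~49 us one way; the 1234 ps offset is recovered exactly.
symmetric = simulate(1234, 49_000_000, 49_000_000)

# With 100 ps of asymmetry, the offset estimate is wrong by half of it (50 ps).
asymmetric = simulate(1234, 49_000_000, 49_000_100)
```

So the tens of microseconds of propagation delay never limit the sync accuracy by themselves; what matters is how well the two directions match, or how well any mismatch is calibrated out.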
Hmm one would expect heat expansion to change the length of fiber over tens of kilometers. Does it also affect light speed in the fiber? I think consumer fiber is not buried very deep on average, but maybe for these use cases you use something hefty anyway.
The gravity well time dilation is about 3.5 nanoseconds per meter per year near the surface of the earth. (time changes rate with altitude in a gravity well)
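For anyone who wants to check that figure, here is a quick back-of-the-envelope sketch. It assumes the weak-field approximation (fractional rate difference of g*h/c^2), standard gravity, and a 365.25-day year; the function name is just mine.

```python
G_SURFACE = 9.80665                  # m/s^2, standard gravity
C = 299_792_458.0                    # m/s, speed of light
SECONDS_PER_YEAR = 365.25 * 86_400   # Julian year


def dilation_ns_per_year(height_m):
    """Accumulated clock difference, in ns per year, for a height difference."""
    fractional_rate = G_SURFACE * height_m / C**2   # weak-field approximation
    return fractional_rate * SECONDS_PER_YEAR * 1e9


per_meter = dilation_ns_per_year(1.0)       # roughly 3.4 ns per year
ten_floors = dilation_ns_per_year(30.0)     # roughly 100 ns per year
```

That comes out at roughly 3.4 ns per meter per year, in line with the number above. Note this is a slow rate difference that accumulates, so it only shows up between free-running clocks; two clocks that are continuously locked to each other would have it corrected away.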
Sub-nanosecond synchronization is getting into the realm where relativity is measurable.
Haven't dug in on the technicals, but this is coming out of CERN, it looks like - and in that light, the links to "We're hiring" on that page almost feel like a flex...
Distributed systems spend most of their effort on one problem: agreeing on the order of events across machines. Without synchronized physical clocks you have two options. Logical clocks (Lamport, vector) give you causal order but not wall-clock truth, so you can’t answer “did A really happen before B” for events that don’t have a happens-before relationship. Or you run consensus, which gives total order but costs round trips. At geographic scale that’s tens of milliseconds per decision, and the floor is set by the speed of light.
Tight clock sync collapses this. If clock uncertainty ε is small and bounded, you can timestamp a write, wait ε, and trust the global order without talking to anyone. Spanner’s external consistency works because TrueTime’s ε was a few milliseconds, so commit-wait was tolerable. The latency cost of planet-scale serializability stops depending on how far apart your replicas are and starts depending on how good your clocks are.
That’s the real significance. Time sync converts a coordination problem (bounded by physics) into a local computation (bounded by clock quality). Spanner proved this is possible but required GPS receivers and atomic clocks in every datacenter, which kept the capability inside Google for years. White Rabbit-class sync pushes ε from milliseconds toward sub-nanoseconds over commodity Ethernet hardware, and it’s now in IEEE 1588 as a standard PTP profile. If sub-nanosecond sync becomes baseline network infrastructure, the long-held assumption that strong consistency has to be slow at geographic scale stops holding, and a meaningful chunk of what databases currently work around (HLCs, weak isolation defaults, application-level reconciliation) becomes unnecessary.
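A minimal sketch of the commit-wait idea, to show how the wait scales with ε. This is not Spanner's or TrueTime's real API; `BoundedClock`, `commit_wait`, and the fake time source are hypothetical names made up for illustration, and time is kept in integer nanoseconds so the demo is deterministic.

```python
import time


class BoundedClock:
    """Hypothetical clock: true time lies within [now - eps, now + eps]."""

    def __init__(self, epsilon_ns, now_ns=time.monotonic_ns):
        self.epsilon_ns = epsilon_ns
        self.now_ns = now_ns

    def interval(self):
        t = self.now_ns()
        return t - self.epsilon_ns, t + self.epsilon_ns


def commit_wait(clock, sleep_ns):
    """Pick a timestamp, then wait until it is certainly in the past."""
    _, commit_ts = clock.interval()            # latest possible "now"
    step = max(1, clock.epsilon_ns // 4)       # polling granularity
    waited = 0
    while clock.interval()[0] <= commit_ts:    # earliest possible "now"
        sleep_ns(step)
        waited += step
    return commit_ts, waited


class FakeTime:
    """Deterministic stand-in for a real clock and a real sleep."""

    def __init__(self):
        self.t = 0

    def now(self):
        return self.t

    def sleep(self, ns):
        self.t += ns


def demo(epsilon_ns):
    fake = FakeTime()
    clock = BoundedClock(epsilon_ns, now_ns=fake.now)
    return commit_wait(clock, fake.sleep)


millisecond_class = demo(4_000_000)   # eps = 4 ms: waits about 9 ms here
sub_ns_class = demo(1)                # eps = 1 ns: waits about 3 ns here
```

In this interval formulation the wait comes out at a bit over 2ε rather than ε, because the uncertainty counts once when picking the timestamp and once more when checking that it has passed. Either way the cost is proportional to ε and independent of how far apart the replicas are, which is the point being made above.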
If I remember correctly, everything is open hardware, and CERN should have those repos accessible. Last time I used it, it was still very much in development, especially their PCIe cards with a custom kernel. That said, I haven't touched it in ~6 years...
Since cm precision is often not possible, is roundtrip-length an estimated average from prior roundtrips?
jitter kills
In short, it's about giving PTP and SyncE some extra smarts.
Note that this is also, in large part, a hardware-based technology.