NTP at NIST Boulder Has Lost Power

(lists.nanog.org)

273 points | by lpage 8 hours ago

17 comments

  • arn3n 6 hours ago
    Wind gusts were reaching 125 MPH in Boulder county, if anyone’s curious. A lot of power was shut off preemptively to prevent downed power lines from starting wildfires. Energy providers gave warning to locals in advance. Shame that NIST’s backup generator failed, though.
    • Maxion 5 hours ago
      Somewhat interesting that they themselves don't have access to the site. You'd think there would have been some disaster plans put in place?
      • cornholio 2 hours ago
        The disaster plan is to have a few dozen stratum 1 servers spread around the world, each connected to a distinct primary atomic clock, so that a catastrophic disaster needs to take down the global internet itself for all servers to become unreachable.

        The failure of a single such server is far from a disaster.

      • ssl-3 4 hours ago
        Maybe this is the disaster plan: There's not a smouldering hole where NIST's Boulder facility used to be, and it will be operational again soon enough.

        There's no present need for important hard-to-replace sciencey-dudes to go into the shop (which is probably both cold and dark, and may have other problems that make it unsafe: it's deliberately closed) to futz around with the time machines.

        We still have other NTP clocks. Spooky-accurate clocks that the public can get to, even, like just up the road at NIST in Fort Collins (where WWVB lives, and which is currently up), and in Maryland.

        This is just one set.

        And beyond that, we've also got clocks in GPS satellites orbiting, and a whole world of low-stratum NTP servers that distribute that time on the network. (I have one such GPS-backed NTP server on the shelf behind me; there's not much to it.)

        And the orbital GPS clocks are controlled by the US Navy, not NIST.

        So there's redundancy in distribution, and also control, and some of the clocks aren't even on the Earth.

        Some people may be bit by this if their systems rely on only one NTP server, or only on the subset of them that are down.

        And if we're following section 3.2 of RFC 8633 and using multiple diverse NTP sources for our important stuff, then this event (while certainly interesting!) is not presently an issue at all.

        • Balgair 2 hours ago
          There are many backup clocks/clusters that NIST uses as redundancies all around Boulder too, no need to even go up to Fort Collins. As in, NIST has fiber to a few at CU and a few commercial companies, last I checked. They're used in cases just like this one.

          Fun facts about The clock:

          You can't put anything in the room or take anything out. That's how sensitive the clock is.

          The room is just filled with asbestos.

          The actual port for the actual clock, the little metal thingy that is going buzz, buzz, buzz with voltage every second on the dot? Yeah, that little port isn't actually hooked up to anything, as again, it's so sensitive (impedance matching). So they use the other ports on the card for actual data transfer to the rest of the world. They do the adjustments so it's all fine in the end. But you have to define something as the second, and that little unused port is it.

          You can take a few pictures in the cramped little room, but you can't linger, as again, just your extra mass and gravity affects things fairly quickly.

          If there are more questions about time and timekeeping in general, go ahead and ask, though I'll probably get back to them a bit later today.

          • jrronimo 48 minutes ago
            I'm the Manager of the Computing group at JILA at CU, where utcnist*.colorado.edu used to be housed. Those machines were, for years, consistently the highest bandwidth usage computers on campus.

            Unfortunately, the HP cesium clock that backed the utcnist systems failed a few weeks ago, so they're offline. I believe the plan is to decommission those servers anyway - NIST doesn't even list them on the NTP status page anymore, and Judah Levine has retired (though he still comes in frequently). Judah told me in the past that the typical plan in this situation is that you reference a spare HP clock with the clock at NIST, then drive it over to JILA backed by some sort of battery and put it in the rack, then send in the broken one for refurb (~$20k-$40k; new box is closer to $75k). The same is true for the WWVB station, should its clocks fail.

            There is fiber that connects NIST to CU (it's part of the BRAN - Boulder Research and Administration Network). Typically that's used when comparing some of the new clocks at JILA (like Jun Ye's strontium clock) to NIST's reference. Fun fact: Some years back the group was noticing loss due to the fiber couplers in various closets between JILA & NIST... so they went to the closets and directly spliced the fibers to each other. It's now one single strand of fiber between JILA & NIST Boulder.

            That fiber wasn't connected to the clock that backed utcnist though. utcnist's clock was a commercial cesium clock box from HP that was also fed by GPS. This setup was not particularly sensitive to people being in the room or anything.

            Another fun fact: utcnist3 was an FPGA developed in-house to respond to NTP traffic. Super cool project, though I didn't have anything to do with it, haha.

          • Workaccount2 1 hour ago
            >The actual port for the actual clock, the little metal thingy that is going buzz, buzz, buzz with voltage every second on the dot? Yeah, that little port isn't actually hooked up to anything, as again, it's so sensitive (impedance matching). So they use the other ports on the card for actual data transfer to the rest of the world.

            Can you restate this part in full technical jargon along with more detail? I'm having a hard time following it

          • ronjakoi 1 hour ago
            Will the time it takes you to answer depend on the mass of the person asking?
        • mcculley 39 minutes ago
          > And the orbital GPS clocks are controlled by the US Navy, not NIST.

          I thought it was US Space Force / Air Force. Was the Navy previously or currently involved?

      • TylerE 5 hours ago
        Step One of most disaster plans is not to create a second emergency.
        • ronjakoi 1 hour ago
          Or even just a microsecond emergency.
        • amelius 4 hours ago
          But can't NTP server downtime cause a disaster?
          • idiotsecant 2 hours ago
            If your application is so critical that NTP timing loss causes disaster and your holdover fails in less than a day and you aren't generating your own via gps, you are incompetent, full stop
          • Vosporos 4 hours ago
            One (amongst many) NTP server going down creates fewer issues than an NTP server spreading wrong time.
            • macintux 4 hours ago
              General rule of thumb: a misbehaving/slow server in any well-architected distributed system is vastly worse than a dead server.
            • PunchyHamster 3 hours ago
              technically if you have 3 or more sources that would be caught; NTP protocol was designed for that eventuality
              • throw0101c 1 hour ago
                > technically if you have 3 or more sources that would be caught; NTP protocol was designed for that eventuality

                Either go with one clock in your NTPd/Chrony configuration, or ≥4.

                Yes, if you have 3 they can triangulate, but if one goes offline now you have 2 with no tie-breaker. If you have (at least) 4 servers, then one can go away and triangulation / sanity-checking can still occur with the 3 remaining.
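
A rough way to see the "one or ≥4" argument: treat each source as an interval (its offset plus or minus an error bound) and look for the region that a strict majority of sources agree on. The sketch below uses a Marzullo-style sweep, which is the idea ntpd's selection step is derived from. It is not ntpd's or chrony's actual code, and the function names and microsecond values are made up for illustration.

```python
def best_overlap(intervals):
    """Marzullo-style sweep: return (count, lo, hi) for the region covered by the most intervals."""
    events = []
    for lo, hi in intervals:
        events.append((lo, -1))  # interval opens
        events.append((hi, +1))  # interval closes
    events.sort()  # at equal offsets, opens (-1) sort before closes (+1)

    count = best = 0
    best_lo = best_hi = None
    for i, (offset, kind) in enumerate(events):
        count -= kind  # an open adds one overlapping source, a close removes one
        if count > best:
            best, best_lo, best_hi = count, offset, events[i + 1][0]
    return best, best_lo, best_hi


def majority_interval(intervals):
    """Return (lo, hi) agreed on by a strict majority of sources, or None if there is none."""
    count, lo, hi = best_overlap(intervals)
    return (lo, hi) if count > len(intervals) / 2 else None


# Hypothetical offsets in microseconds, each with a +/-1000 us error bound.
good_a, good_b, good_c = (300, 2300), (100, 2100), (500, 2500)
falseticker = (249_000, 251_000)

print(majority_interval([good_a, good_b, good_c, falseticker]))  # three of four agree
print(majority_interval([good_a, falseticker]))                   # one against one: no majority
```

With four sources, losing any single one still leaves three, so a lone bad clock can still be outvoted. With two sources that disagree, nothing breaks the tie.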

                • ufocia 4 minutes ago
                  You probably meant trilaterate.
              • da_chicken 3 hours ago
                Sure, but not needing a failure to cascade to yet another failsafe is still a good idea. After all, all software has bugs, and all networks have configuration errors.
    • IncreasePosts 2 hours ago
      Notably, we had the Marshall Fire here 4 years ago, and Xcel recently settled for $680M for their role in the fire. So they're probably pretty keen not to be on the hook again.
      • tpoindex 1 hour ago
        For more background on the Marshall Fire of Dec. 2021: https://en.wikipedia.org/wiki/Marshall_Fire

        tl;dr - the fire destroyed over 1,000 homes and caused two deaths. The local electrical utility, Xcel, was found to be a contributing cause, through power lines sparking during a strong wind storm. As a result, electrical utilities now cut power to affected areas during strong winds.

  • themafia 5 hours ago
    > Facility operators anticipated needing to shutdown the heat-exchange infrastructure providing air cooling to many parts of the building, including some internal networking closets. As a result, many of these too were preemptively shutdown with the result that our group lacks much of the monitoring and control capabilities we ordinarily have

    Having a parallel low bandwidth, low power, low waste heat network infrastructure for this suddenly seems useful.

  • glkindlmann 5 hours ago
    Of the various internet .+P, NTP is one I never learned about as a student, so now I'm looking at its web page [1] by its creator David L. Mills (1938-2024). I've found one video of him giving a retrospective of his extensive internet work; he talks about NTP at 34:51 [2] and later at 56:26 [3].

    [1] https://www.eecis.udel.edu/~mills/ntp.html

    [2] https://youtu.be/08jBmCvxkv4?si=WXJCV_v0qlZQK3m4&t=2092

    [3] https://youtu.be/08jBmCvxkv4?si=K80ThtYZWcOAxUga&t=3386

    • ssl-3 5 hours ago
      HN discussion shortly after Dave Mills died, early in 2024: https://news.ycombinator.com/item?id=39051246
    • torcete 5 hours ago
      In [3] he mentions that one can use NTP to observe frequency deviations and use it as an early warning system for fire and AC failure. That really intrigues me. Can you actually? Has this ever been implemented?
      • magicalhippo 2 hours ago
        Oscillators of all kinds are temperature dependent.

        That's why the most stable ones are insulated and ovenized[1].

        So an AC failure which would lead to higher room temperatures would lead to stronger or more frequent correction by the NTP client, as the local oscillator would drift more.

        Not sure about the fire case though. I mean the same applies there but I'm not imaginative enough to think of a realistic scenario where NTP would be useful for averting a fire.

        [1]: https://blog.bliley.com/anatomy-of-an-ocxo-oven-controlled-c...
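
As a minimal sketch of the early-warning idea: log the frequency correction your NTP client reports (in ppm) at regular intervals and flag samples that jump away from a recent baseline. The window size, threshold, and sample values below are hypothetical, and how you collect the readings depends on your client.

```python
from statistics import median


def frequency_alarms(freq_ppm, window=6, threshold_ppm=0.5):
    """Return indices of samples that deviate from the median of the previous `window` samples."""
    alarms = []
    for i in range(window, len(freq_ppm)):
        baseline = median(freq_ppm[i - window:i])
        if abs(freq_ppm[i] - baseline) > threshold_ppm:
            alarms.append(i)
    return alarms


# Hypothetical readings: stable around 12 ppm, then the room starts warming up.
samples = [12.0, 12.1, 12.0, 11.9, 12.0, 12.1, 12.0, 12.1, 13.2, 14.0]
print(frequency_alarms(samples))  # indices of the samples that left the baseline
```

A real setup would need to tell a warming room apart from load changes or a source switch, so treat this as a hint to go look rather than an alarm in its own right.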

      • eichin 2 hours ago
        I knew of some experiments in this space back in the late 1980s or early 1990s - but it was specifically with DECstation hardware that had terrible clocks (not used for alerting, just "this graphs nicely against temperature".) https://groups.csail.mit.edu/ana/Publications/PubPDFs/Greg.T... (PDF) 4.2.1 does talk about explaining local clock frequency changes with office temperature changes (because they overwhelm a clock-aging model) but it doesn't have graphs so perhaps they weren't clear enough to include (or just not relevant enough to Time Surveying.)
    • OptionOfT 1 hour ago
      What do you mean by +P?
      • black_knight 1 hour ago
        Presumably, .+P is a regex referring to all the acronyms ending with P, that is, Protocol. SMTP, HTTP, FTP, IMAP, XMPP…
      • rexreed 1 hour ago
        Looks like .+P is just a regex way of saying any set of characters / protocol ending in P.
      • juped 1 hour ago
        Nothing, as they said .+P (note the dot)
  • Animats 6 hours ago
    NIST campus status: Due to elevated fire risk and a power outage for the Boulder area, the DOC Boulder Labs campus is CLOSED on December 19 for onsite business and no public access is permitted; previously approved accesses are revoked.[1]

    WWV still seems to be up, including voice phone access.

    NIST Boulder has a recorded phone number for site status, and it says that as of December 20, the site is closed with no access.

    NIST's main web site says they put status info on various social media accounts, but there's no announcement about this.

    [1] https://www.nist.gov/campus-status

    • Scaevolus 48 minutes ago
      Note that WWV is a good 60 miles NNE of Boulder on the outskirts of Fort Collins.
  • sgnelson 2 hours ago
    One question I have is did DOGE decisions have anything to do with this? Because I know they took knives to NIST.
    • grepfru_it 1 hour ago
      Residents and some businesses of Boulder have been without power since Tuesday. There was an issue about 10 years ago which caused 1000 homes to burn down and the power company was found liable. They changed their practices. Then during the next high wind event, the power company preemptively cut power and businesses sued them for loss of revenue. Now the power company is playing it safe and turning off power to residents and keeping downtown businesses powered.

      Maybe their generator failing was DOGE related, but wouldn’t have happened if state level shenanigans were better handled

      • _se 22 minutes ago
        The Marshall fire was 4 years ago. Almost to the day.
    • cramcgrab 2 hours ago
      Actually DOGE involvement at the highest level would have resulted in Tesla solar and Tesla powerwall battery backups.
      • throw0101c 1 hour ago
        > […] Tesla solar and Tesla powerwall battery backups.

        Don't forget Solar Roof.

      • the_gipsy 1 hour ago
        Any day now
      • redbush237 1 hour ago
        Lol, lmao.

        Some relevant DOGE’s effects:

        -time and frequency division director quit

        -NIST emergency management staff at least 50% vacant

        -NIST director of safety retired, and NIST safety was already understaffed when compared to DOE labs

        -NOAA emergency manager on the same Boulder campus laid off

        etc

  • cdfuller 6 hours ago
    Can anybody expand on the implications of this?

    Being unfamiliar with it, it's hard to tell if this is a minor blip that happens all the time, or if it's potentially a major issue that could cause cascading errors equal to the hype of Y2K.

    • autarch 6 hours ago
      Time travel is extremely dangerous right now. I highly recommend deferring time travel plans except for extreme temporal emergencies.
      • verzali 2 hours ago
        Uhh, here's the problem, I'm sort of stuck travelling into the future at a more or less constant rate. I don't know how to stop doing that...
        • pbhjpbhj 2 hours ago
          Unless you can stop, I'm afraid this will cause almost certain death.
        • lxgr 1 hour ago
          Have you tried asymptotically approaching the speed of light?
        • autarch 2 hours ago
          Just go to your local shop and buy some time brakes. That's the safest course of action until this is repaired.
        • layer8 2 hours ago
          I regret to inform you that as a consequence of that sustained time travel, your mind and body will be slowly deteriorating and you’ll sooner or later end up dead.
      • jeffrallen 5 hours ago
        Would traveling to the past in order to put in place a preemptive fix for this outage be wise or dangerous?

        Asking for a friend.

        • __MatrixMan__ 52 minutes ago
          I couldn't comment on the causal hazards but since time is currently having an outage they've got an improved shot at getting away with it. I say go for it.
        • ExoticPearTree 4 hours ago
          Tell your friend that this course of action failed, as us in the present are still experiencing issues.
          • autarch 4 hours ago
            Well, that's _this_ timeline. Other timelines never had an outage.
            • throwup238 3 hours ago
              Not with Terminator rules…
        • JadeNB 2 hours ago
          Safety not guaranteed.
      • fuzztester 5 hours ago
        Same for database transaction roll back and roll forward actions.

        And most enterprises, including banks, use databases.

        So by bad luck, you may get a couple of transactions reversed in order of time, such as a $20 debit incorrectly happening before a $10 credit, when your bank balance was only $10 prior to both those transactions. So your balance temporarily goes negative.

        Now imagine if all those amounts were ten thousand times higher ...

      • yawpitch 5 hours ago
        Define “extreme”?
    • jhart99 4 hours ago
      NIST maintains several time standards. Gaithersburg MD is still up and I assume Hawaii is as well. Other than potential damage to equipment from loss of power (turbo molecular vacuum pumps and oil diffusion pumps might end up failing in interesting ways if not shut down properly) it will just take some time for the clocks to be recalibrated against the other NIST standards.
      • joncrane 57 minutes ago
        Gaithersburg went down on December 8th, is there confirmation that it's fully functional again?
    • Animats 6 hours ago
      Google has their own fleet of atomic clocks and time servers. So does AWS. So does Microsoft. So does Ubuntu. They're not going to drift enough for months to cause trouble. So the Internet can ride through this, mostly.

      The main problem will be services that assume at least one of the NIST time servers is up. Somewhere, there's going to be something that won't work right when all the NIST NTP servers are down. But what?

      • guenthert 5 hours ago
        Ubuntu using atomic clocks would surprise me. Sure they could, but it's not obvious to me why they would spend $$$$ on such. More plausible to me seems that they would be using GPSDO as reference clocks (in this context, about as good as your own atomic clock), iff they were running their own time servers. Google finds only that they are using servers from the NTP Pool Project, which will be using a variety of reference clocks.

        If you have information on what they actually are using internally, please share.

        • puzzlingcaptcha 5 hours ago
          I think people have a wrong idea of what a modern atomic clock looks like. These are readily available commercially, Microchip for example will happily sell you hydrogen, cesium or rubidium atomic clocks. Hydrogen masers are rather unwieldy, but you can get a rubidium clock in a 1U format and cesium ones are not much bigger. I think their cesium freq standards are formerly a HP business they acquired.

          Example: https://www.microchip.com/en-us/products/clock-and-timing/co...

          • xorcist 4 hours ago
            It is also important to realize that an atomic clock will only give you a steady pulse. It will count seconds for you, and do so very accurately, but that is not the same as knowing what time it is.

            If you get a rubidium clock for your garage, you can sync it up with GPS to get an accurate-enough clock for your hobby NTP project, but large research institutions and their expensive contraptions are more elaborate to set up.

      • genidoi 6 hours ago
        Atomic clock non-expert here, what does having a fleet of atomic clocks entail and why would the hyperscalers bother?
        • Gabrys1 6 hours ago
          Having clocks synchronized between your servers is extremely useful. For example, having a guarantee that the timestamp of arrival of a packet (measured by the clock on the destination) is ALWAYS bigger than the timestamp recorded by the sender is a huge win, especially for things like database scaling.

          For this though you need to go beyond NTP into PTP which is still usually based on GPS time and atomic clocks

          • riedel 4 hours ago
            Actually interesting to think about what UTC actually means, and there seems to be no absolute source of truth [0]. I guess the worry is not that much about the NTP servers (for which people should configure failovers anyway) but the clocks themselves.

            [0] https://www.septentrio.com/en/learn-more/insights/how-gps-br...

            • pbhjpbhj 1 hour ago
              Could you define an absolute source of truth based on extrinsic features? Something like taking an intrinsic time from atomic sources, pegged to an astronomic or celestial event; then a predicted astronomic event that would allow us to reconcile time in the future.

              It might be difficult to generate enough resolution in measurable events that we can predict accurately enough? Like, I'm guessing the start of a transit or alignment event? Maybe something like predicting the time at which a laser pulse will be returnable from a lunar reflector -- if we can do the prediction accurately enough then we can re-establish time back to the current fixed scale.

              I think I'm addressing an event that won't ever happen (all precise and accurate time sources are lost/perturbed), and if it does it won't be important to re-sync in this way. But you know...

        • synack 5 hours ago
          Spanner depends on having a time source with bounded error to maintain consistency. Google accomplishes this by having GPS and atomic clocks in several datacenters.

          https://static.googleusercontent.com/media/research.google.c...

          https://static.googleusercontent.com/media/research.google.c...

          • londons_explore 5 hours ago
            And more importantly, the tighter the time bound, the higher the performance, so more accurate clocks easily pay for themselves in other saved infrastructure costs to service the same number of users.
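
A toy illustration of why a tighter bound buys performance, loosely following the commit-wait rule described in the Spanner paper: pick the commit timestamp at the top of the current uncertainty interval, then wait until that timestamp is guaranteed to be in the past. The class and function names here are hypothetical, the clock is simulated with integer microseconds, and this is not Google's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: int  # microseconds
    latest: int    # microseconds


class SimulatedTrueTime:
    """Toy clock: 'true' time is a counter advanced by hand; epsilon_us is the error bound."""

    def __init__(self, epsilon_us):
        self.epsilon_us = epsilon_us
        self.true_now_us = 0

    def now(self):
        return TTInterval(self.true_now_us - self.epsilon_us,
                          self.true_now_us + self.epsilon_us)

    def sleep(self, micros):
        self.true_now_us += micros


def commit_wait(tt, step_us=100):
    """Choose a commit timestamp, then wait until every clock agrees it has passed.

    Returns (commit_timestamp_us, simulated_wait_us).
    """
    commit_ts = tt.now().latest
    waited = 0
    while tt.now().earliest <= commit_ts:
        tt.sleep(step_us)
        waited += step_us
    return commit_ts, waited


for epsilon in (4000, 1000):  # hypothetical error bounds: 4 ms and 1 ms
    _, waited = commit_wait(SimulatedTrueTime(epsilon))
    print(f"epsilon = {epsilon} us -> commit wait about {waited} us")
```

In this model the wait is roughly twice the error bound, so every transaction pays for clock uncertainty directly.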
        • Youden 2 hours ago
          There's a lot of focus in this thread on the atomic clocks but in most datacenters, they're not actually that important and I'm dubious that the hyperscalers actually maintain a "fleet" of them, in the sense that there are hundreds or thousands of these clocks in their datacenters.

          The ultimate goal is usually to have a bunch of computers all around the world run synchronised to one clock, within some very small error bound. This enables fancy things like [0].

          Usually, this is achieved by having some master clock(s) for each datacenter, which distribute time to other servers using something like NTP or PTP. These clocks, like any other clock, need two things to be useful: an oscillator, to provide ticks, and something by which to set the clock.

          In standard off-the-shelf hardware, like the Intel E810 network card, you'll have an OCXO, like [1], with a GPS module. The OCXO provides the ticks, the GPS module provides a timestamp to set the clock with and a pulse for when to set it.

          As long as you have GPS reception, even this hardware is extremely accurate. The GPS module provides a new timestamp, potentially accurate to within single-digit nanoseconds ([2] datasheet), every second. These timestamps can be used to adjust the oscillator and/or how its ticks are interpreted, such that you maintain accuracy between the timestamps from GPS.

          The problem comes when you lose GPS. Once this happens, you become dependent on the accuracy of the oscillator. An OCXO like [1] can hold to within 1µs accuracy over 4 hours without any corrections but if you need better than that (either more time below 1µs, or more accurate than 1µs over the same time), you need a better oscillator.

          The best oscillators are atomic oscillators. [3] for example can maintain better than 200ns accuracy over 24h.

          So for a datacenter application, I think the main reason for an atomic clock is simply for retaining extreme accuracy in the event of an outage. For quite reasonable accuracy, a more affordable OCXO works perfectly well.

          [0]: https://docs.cloud.google.com/spanner/docs/true-time-externa...

          [1]: https://www.microchip.com/en-us/product/OX-221

          [2]: https://www.u-blox.com/en/product/zed-f9t-module

          [3]: https://www.microchip.com/en-us/products/clock-and-timing/co...
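
To put the two holdover figures quoted above on the same scale, here is a back-of-the-envelope conversion into an average fractional frequency error. It assumes the error grows linearly, which real oscillators do not strictly do (aging and temperature both matter), so read the results as orders of magnitude only. The function name is made up for this sketch.

```python
def implied_frequency_error(time_error_s, holdover_s):
    """Average fractional frequency error that accumulates time_error_s over holdover_s."""
    return time_error_s / holdover_s


oven = implied_frequency_error(1e-6, 4 * 3600)       # 1 µs over 4 hours
atomic = implied_frequency_error(200e-9, 24 * 3600)  # 200 ns over 24 hours

print(f"oven-controlled oscillator: {oven:.1e}")
print(f"atomic oscillator:          {atomic:.1e}")
print(f"ratio: about {oven / atomic:.0f}x")
```

Under that linear assumption the atomic reference needs to be roughly 30 times more stable than the oven-controlled one to meet its spec.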

      • adastra22 6 hours ago
        I know this is HN, but the internet is pretty low on the list of things NIST time standards are important for.
        • willis936 6 hours ago
          But pretty high on the list that NIST NTP is important for (since it leaves the building through the internet).
          • adastra22 5 hours ago
            If NIST NTP goes down, the internet doesn’t go down. But atomic clocks drifting does upset many scientific experiments, which would effectively go down for the duration of the outage.
            • willis936 5 hours ago
              This is the reason GP listed out all the alternative robust NTP services that are GPS disciplined, freely available, and used as redundant sources by any responsible timekeeper.

              What atomic clocks are disciplined by NTP anyway? Local GPS disciplining is the standard. If you're using NTP you don't need precision or accuracy in your timekeeping.

        • 2snakes 2 hours ago
          In a past job I set up at least 5 domain dns servers pointing at nist ntp…
        • _zoltan_ 6 hours ago
          could you list 3 things that you think are more important than the internet? (I know the internet is going to be fine; I just want to understand what you think ranks higher globally...)
          • adastra22 6 hours ago
            Mostly scientific stuff like astronomical observations — e.g. did this event observed at one telescope coincide with neutrinos detected at this other observatory.

            Note I didn’t say they are more important than the Internet. That’s a value judgement in any case. I said that NIST level 0 NTP servers are more important to these use cases than they are to the Internet.

            • misnome 5 hours ago
              All these use at least GPS for timing
              • adastra22 2 hours ago
                No, they don’t. GPS is orders of magnitude less reliable than the most up to date metric time synchronization over fixed topology fiber links.
                • misnome 1 hour ago
                  I wonder why we bothered building GPS signal waveguides into the bottom of a mine then. Clearly we should have consulted the experts of hacker news first.

                  Losing NTP for a day is going to affect fuck-all.

          • Izmaki 5 hours ago
            The ability for humankind to communicate across the entire globe at nearly 1/4 of the speed of light has drastically accelerated our technological advancement. There is no doubt that the internet is a HUGE addition to society.

            It's not super important when compared to basic needs like plumbing, food, electricity, medical assistance and other silly things we take for granted but are heavily dependent on. We all saw what happened to hospitals during the early stages of the COVID pandemic; we had plenty of internet and electricity but were struggling on the medical part. That was quite bad... I'm not sure if it's any worse if an entire country/continent lost access to the Internet. Quite a lot of our core infrastructure components in society rely on this. And a fair bit of it relies on a common understanding of what time "now" is.

          • makeitdouble 5 hours ago
            I think it won't be affected by this, but off the top of my head:

            - GPS

            - industrial complex that synchronize operations (we could include trains)

            - telecoms in general (so a level higher than the internet)

      • axlee 2 hours ago
        Can't they point these dns records to working servers meanwhile to avoid degradation?
        • creatonez 2 hours ago
          My understanding is that people who connect specifically to the NIST ensemble in Boulder (often via a direct fiber hookup rather than using the internet) are doing so because they are running a scientific experiment that relies on that specific clock. When your use case is sensitive enough, it's not directly interchangeable with other clocks.

          Everyone else is already connecting to load balanced services that rotate through many servers, or have set up their own load balancing / fallbacks. The mistakenly hardcoded configurations should probably be shaken loose anyways.

    • franklyworks 6 hours ago
      Time engineers are very paranoid. I expect large problems can't occur due to a single provider misbehaving.
    • 1970-01-01 2 hours ago
      >Can anybody expand on the implications of this?

      The answer is no. Anyone claiming this will have an impact on infrastructure has no evidence backing it up. Table top exercises at best.

    • meindnoch 1 hour ago
      Unix timestamp resets to zero.
    • cramcgrab 2 hours ago
      I’d say everybody moving off NIST boulder NTP.
    • ThrowawayTestr 6 hours ago
      If your computer was using it as your time server and you didn't have alternatives configured, your clock may have drifted a few seconds.
      • Roark66 4 hours ago
        I never checked it, but how much does a typical PC/server's clock actually drift over a week or a month? I always thought it was well under a second.
        • bhouston 2 hours ago
          Clocks do drift. Seconds a week is definitely possible. I think there are varying quality of internal clocks in electronic devices, and the cheaper the quality the more drift there is. I think small cheap microcontrollers can drift seconds per day.
        • soared 53 minutes ago
          I have an extremely cheap and extremely low power automatic cat feeder - it’s been on 2 D batteries for 18 months. I just reset it after it had drifted 19 minutes, so about 1 minute a month, or 15 seconds a week!
        • 1970-01-01 2 hours ago
          I've seen some new ThinkPads lose a minute a month and others (the old ThinkPads) keep within a second of NTP over an entire year. It depends.
        • layer8 1 hour ago
          Several seconds per week is normal. Oscillator accuracy is roughly on the order of 10 PPM, which would correspond to 6 seconds per week.
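
The arithmetic behind that figure, as a small sketch: a constant frequency error in parts per million, multiplied by elapsed time. The function name is made up, and a constant offset is a simplification, since real crystals wander with temperature and age.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(ppm, days):
    """Accumulated clock error for a constant frequency offset of `ppm` over `days`."""
    return ppm * 1e-6 * days * SECONDS_PER_DAY


for label, days in (("week", 7), ("30-day month", 30), ("year", 365)):
    print(f"10 ppm over one {label}: {drift_seconds(10, days):.1f} s")
```

At 10 ppm, a week has 604,800 seconds, so the error comes to about 6 seconds, matching the estimate above.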
  • sgnelson 2 hours ago
    FYI, this was posted a month ago when discussing thermal effects of clock drift. I thought it was quite interesting view of what the WWVB location looks like:

    https://jila.colorado.edu/news-events/articles/spare-time

    Discussed here: https://news.ycombinator.com/item?id=46042946

  • 8organicbits 1 hour ago
    The referenced mailing list is this Google Group (https://groups.google.com/a/list.nist.gov/g/internet-time-se...) which has some other posts about this incident.
  • amelius 4 hours ago
    This makes me wonder, if you take the average time of all wristwatches on the planet, accounting for timezones and throwing out outliers, how close would you get to NTP time?

    And how many randomly chosen wristwatches would you need to get anything reasonable?

    • throwup238 3 hours ago
      You’re the person Douglas Adams warned us about.
    • nielsole 3 hours ago
      I have a hunch my casio wrist watch is designed to be running a bit too quick to make resetting the seconds easier. Your averaging assumes manufacturers try to make their watches as accurate as possible for average conditions
      • amelius 2 hours ago
        I think it runs quick to be on the safe side, so you never miss appointments, trains, etc. because of your watch.

        But yes, good point.

      • raverbashing 2 hours ago
        This is the kind of thing that Casio designers would probably come up with (second to have as much accuracy as possible within their budget)

        Given two time changes per year I guess something like 1 min per year is acceptable

    • varjag 3 hours ago
      Close, but unlikely to be precise in the metrology sense. There are unlikely to be even a billion wristwatches being worn.
    • b112 3 hours ago
      One. One watch. POTUS's watch. And in fact, that's why Boulder is currently shuttered... they disagreed.
  • DamonHD 4 hours ago
    So far I think I'm still seeing one of them in my peers list for my public-ish NTP server:

             remote           refid      st t when poll reach   delay   offset  jitter
        ==============================================================================
        +time-e-b.nist.g .NIST.           1 u  372 1024  377  125.260    1.314   0.280
    • DamonHD 4 hours ago
      ...and maybe it's gone:

          #time-e-b.nist.g .NIST.           1 u 1071 1024  377  125.260    1.314   0.280
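
      For anyone reading along, the `reach` column in that output is an 8-bit shift register printed in octal: each poll shifts it left by one, and a reply sets the lowest bit. So 377 means the last eight polls were all answered, and `when` (1071) exceeding `poll` (1024) suggests a reply is overdue. A rough sketch of how the register behaves (the function names here are made up for illustration, not part of ntpq):

```python
def decode_reach(reach_octal: str) -> list[bool]:
    """Return the last eight poll results, most recent first."""
    value = int(reach_octal, 8)
    return [bool((value >> i) & 1) for i in range(8)]


def shift_reach(reach_octal: str, got_reply: bool) -> str:
    """Simulate one poll: shift left, record the result in the low bit."""
    value = int(reach_octal, 8)
    value = ((value << 1) | int(got_reply)) & 0xFF  # keep only 8 bits
    return format(value, "o")


if __name__ == "__main__":
    print(decode_reach("377"))        # all eight recent polls answered
    print(shift_reach("377", False))  # one missed poll
```

      If the server really has gone away, the value should step down on successive polls (377, 376, 374, ...) until it reaches 0.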
  • jb1991 18 minutes ago
    This is terrible. It’s always sad to hear about things like this.
  • gilrain 2 hours ago
    It’d be a good idea to protect our infrastructure from the climate we created.

    It’s just a good idea, though, not a greedy one… so it won’t happen.

  • crazydoggers 5 hours ago
    Status of NIST time servers:

    https://tf.nist.gov/tf-cgi/servers.cgi

  • lovich 6 hours ago
    This was an NTP 0 server, right? What is the actual fallback mechanism when that level of NTP server fails?

    This is some level of eldritch magic that I am aware of but not familiar with, and I'm interested in learning about it.

    • Maxious 5 hours ago
      There are two other sites for the time.nist.gov service, so it'll be okay.

      Probably more interesting is how you get a tier 0 site back in sync - NIST rents out these cyberpunk-looking units you can use to get your local frequency standards up to scratch for ~$700/month https://www.nist.gov/programs-projects/frequency-measurement...

      • wpm 1 hour ago
        I must have one of those units oh my god
      • lovich 4 hours ago
        What happens in the event all the sites for time.nist.gov go down? Is it included in the spec?

        Also thank you for that link, this is exactly the kind of esoteric knowledge that I enjoy learning about

        • sdrmill 4 hours ago
          Most high-availability networks use pool.ntp.org or vendor-specific pools (e.g., time.cloudflare.com, time.google.com, time.windows.com). These systems would automatically switch to a surviving peer in the pool.

          Many data centers and telecom hubs use local GPS/GNSS-disciplined oscillators or atomic clocks and wouldn’t be affected.

          Most laptops, smartphones, tablets, etc. would stay accurate enough for days before drift affected anything.

          Kerberos typically requires clocks to be within 5 minutes of each other to prevent replay attacks, so those systems would probably be OK.

          Sysadmins would need to update hardcoded NTP configurations to point to secondary servers.

          If timestamps were REALLY off, TLS certificates might fail, but that’s highly unlikely.

          Databases could be corrupted due to failure of transaction ordering.

          Financial exchanges are often legally required to use time traceable to a national standard like UTC(NIST). A total failure of the NIST distribution layer could potentially trigger a suspension of electronic trading to maintain audit trail integrity.

          Modern power grids use Synchrophasors that require microsecond-level precision for frequency monitoring. Losing the NIST reference would degrade the grid's ability to respond to load fluctuations, increasing the risk of cascading outages.
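
          To put the Kerberos point in numbers, here is a back-of-the-envelope sketch of how long a free-running clock takes to drift past the 5-minute tolerance. The function name is hypothetical, the 10 PPM figure is the rough oscillator accuracy quoted elsewhere in this thread, and the calculation assumes a constant frequency error with no other time source stepping in.

```python
KERBEROS_MAX_SKEW_S = 5 * 60  # the typical 5-minute tolerance
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_skew(ppm: float, max_skew_s: float = KERBEROS_MAX_SKEW_S) -> float:
    """Days for a clock with a constant error of ppm to drift by max_skew_s."""
    if ppm <= 0:
        raise ValueError("ppm must be positive")
    seconds = max_skew_s / (ppm * 1e-6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for ppm in (10, 50, 100):
        print(f"{ppm:>3} PPM: about {days_until_skew(ppm):.0f} days")
```

          Even a fairly poor 100 PPM oscillator takes about a month to drift that far, which is why a short outage of one time source is unlikely to break authentication.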

          • neomantra 2 hours ago
            Great list! Just double-checked the CAT timekeeping requirements [1] and the requirement is NIST sync. So a subset of all UTC.

            You don’t need to actually sync to NIST. I think most people PTP/PPS to a GPS-connected Grandmaster with high quality crystals.

            But one must report deviations from NIST time, so CAT Reporters must track it.

            I think you are right — if there is no NIST time signal then there is no properly auditable trading and thus no trading. MiFID has similar stuff but I am unfamiliar.

            One of my favorite nerd possessions is my hand-signed letter from Judah Levine with my NIST Authenticated NTP key.

            [1] https://www.finra.org/rules-guidance/rulebooks/finra-rules/6...

    • lambdaone 5 hours ago
      There are lots of Stratum 0 servers out there; basically anything with an atomic clock will do. They all count seconds independently from one another, all slowly diverging over time, with offset intervals being measured by mutual synchronization using a number of means (how this is done is interesting all by itself). Some atomic clocks are more accurate than others, and an ensemble of these is typically regarded as 'the' master clock.

      To quote the ITU: "UTC is based on about 450 atomic clocks, which are maintained in 85 national time laboratories around the world." https://www.itu.int/hub/2023/07/coordinated-universal-time-a...

      Beyond this, as other commenters have said, anyone who is really dependent on having exact time (such as telcos, broadcasters, and those running global synchronized databases) should have their own atomic clock fleets. There are thousands and thousands of atomic clocks in these fleets worldwide. Moreover, GPS time, used by many to act as their time reference, is distributed by yet other means.

      Nothing bad will happen, except to those who have deliberately made these specific Stratum 0 clocks their only reference time. Anyone who has either left their computer at its factory settings or has set up their NTP configuration in accordance with recommended settings will be unaffected by this.
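
      As a toy illustration of the ensemble idea, here is an inverse-variance weighted mean of measured clock offsets, where more stable clocks get more say. This is a generic statistical sketch, not the actual algorithm used by any timing lab; the function name and the sample numbers are invented.

```python
def ensemble_offset(offsets_ns: list[float], variances_ns2: list[float]) -> float:
    """Inverse-variance weighted mean of clock offsets, in nanoseconds."""
    if len(offsets_ns) != len(variances_ns2) or not offsets_ns:
        raise ValueError("need one variance per offset, and at least one clock")
    weights = [1.0 / v for v in variances_ns2]  # stable clocks weigh more
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, offsets_ns)) / total


if __name__ == "__main__":
    # Hypothetical offsets of three clocks from a common reference,
    # with a hypothetical variance for each.
    print(ensemble_offset([12.0, -3.0, 40.0], [1.0, 1.0, 25.0]))
```

      Losing one clock from such an ensemble just removes one term from the sum, which is the redundancy argument in miniature.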

  • keepamovin 3 hours ago
    For future reference of civilization: if a facility is critical, it must have an SMR.
  • qmarchi 6 hours ago
    Man, they're having a hell of a time up in Boulder.
  • renewiltord 7 hours ago
    Well, where did NTP at NIST last put it? Did they look there?
    • Y_Y 6 hours ago
      You misunderstand, there's been a coup
      • adastra22 6 hours ago
        Of course there is. Where else would they put the reference standard chickens?
      • renewiltord 6 hours ago
        We have to stop those knaves pushing PTP! NTP must prevail!