Let's Encrypt stopped its certificate expiration email notification service a while ago, and I hadn't found a replacement yet. As a result, I didn't receive an expiration notice this time and failed to renew my certificate in advance. The certificate expired today, making my website inaccessible. I logged into my VPS to renew it manually, but the process failed every time. I then checked my cloud provider's platform and saw a notification at the top, which made me realize the problem was with the certificate provider. A quick look at Hacker News confirmed it: Let's Encrypt was having an outage. I want to post this news on my website, but I can't, because my site is down due to the expired certificate.
They had been communicating the end of the email notices for quite a while, telling users to put some other monitoring in place to avoid exactly this situation.
> Let's Encrypt stopped its certificate expiration email notification service a while ago, and I hadn't found a replacement yet.
This sounds like an easy problem to identify root cause for.
I think I received about 15 'we're disabling email notifications soon' emails over the past several months - one of which was interesting, but none were needed, as I'd originally set this up, per documentation, to auto-renew every 30 days.
Perhaps create a calendar reminder for the short term?
Haven't they always, from day one, insisted that their primary goal was to encourage (force) automation of certificate maintenance, as a mechanism to make tls ubiquitous (mandatory everywhere)?
Oof, you're right, that's rough that it's so soon after they discontinued their email service!
I wrote this blog post a few weeks ago: "Minimal, cron-ready scripts in Bash, Python, Ruby, Node.js (JavaScript), Go, and Powershell to check when your website's SSL certificate expires." https://heiioncall.com/blog/barebone-scripts-to-check-ssl-ce... which may be helpful if you want to roll your own.
(Disclosure: at Heii On-Call we also offer free SSL certificate expiration monitoring, among other things.)
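In the same spirit, here's a minimal cron-ready sketch for checking a cert you already have on disk (requires openssl; the path and day threshold are placeholders you'd adjust):

```shell
# Warn when a local certificate file is within N days of expiry.
# check_cert_expiry FILE [DAYS] -> exit 0 if FILE is valid for more than DAYS days
check_cert_expiry() {
    cert="$1"
    days="${2:-14}"
    if openssl x509 -checkend $(( days * 86400 )) -noout -in "$cert" >/dev/null 2>&1; then
        echo "OK: $cert valid for more than $days days"
    else
        echo "ALERT: $cert expires within $days days (or is unreadable)" >&2
        return 1
    fi
}
```

Run it from cron and wire the non-zero exit status into whatever alerting you already have.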
Due to some legacy reasons, my service runs using a docker + nginx setup. However, certbot was initially used in its native nginx mode to generate the certificate, which prevented it from auto-renewing. I later switched it to standalone mode, but I'm not sure if I configured the auto-renewal correctly. In any case, the certificate happened to expire today, and it didn't renew automatically. On a side note, I was actually planning to see what an expired website certificate looked like first and then deal with the auto-renewal issue. After all, it's just a small hobby website, so it's not that big of a deal.
That sounds like a "you're holding it wrong" type of situation to me. A major point of Let's Encrypt (besides the obvious "free") is that it deliberately keeps cert lifetimes short to avoid the "someone who no longer works here set this up two years ago" situation, with certbot checking twice a day and renewing when necessary. So breaking what Let's Encrypt is doing by not using certbot definitely feels like holding it wrong.
If it's a personal website you should consider HTTP+HTTPS. It offers the best of both worlds and your website would always be accessible even if some third party CA is not (or if there's some local issue, or if the HTTP client connecting has cert issues). MITM attacks on personal websites are extremely, extremely rare.
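With nginx, serving both is just a matter of listening on both ports without a forced redirect; a minimal sketch (names and paths are placeholders):

```nginx
server {
    listen 80;
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/ssl/example.com.pem;
    ssl_certificate_key /etc/ssl/private/example.com.key;

    root /var/www/example.com;
}
```

The key point is the absence of a `return 301 https://...` block, so plain-HTTP visitors are served directly even when the certificate is broken.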
Good time to note that Buypass offers free certificates over ACME. I have a few of my domains configured to use them instead of LetsEncrypt, just for redundancy and to ensure I have a working non-LE cert source in case LE suffers problems like this over a longer time period.
Example OpenBSD /etc/acme-client.conf:
    authority buypass {
            api url "https://api.buypass.com/acme/directory"
            account key "/etc/acme/buypass-privkey.pem"
            contact "mailto:youremail@example.com"
    }

    domain example.com {
            domain key "/etc/ssl/private/example.com.key"
            domain full chain certificate "/etc/ssl/example.com.pem"
            sign with buypass
    }
Cheers! They look like decent chaps, and being outside the US adds some certificate diversity. Are there other trustworthy ACME issuers out there?
A pity that acme-client(1) does not allow for fallbacks, but I will add a mental note about it being an easy enough patch to contribute if I ever find the time.
This is neat. Does cert-manager have facilities to automatically use a fallback ACME provider, so I could automate using this? I'd also accept a pool of ACME providers, but a priority ordering seems ideal. I don't see the functionality listed anywhere, maybe there's some security argument that this is a bad idea?
LOL. I just went through this the other day. My site was intermittently inaccessible. DNS was the last thing I thought it was, until I ran a crawler on my site and spotted some 404 errors. Found that my non-www URL was pointed at the wrong IP, and I had forgotten to update it when I transferred my domain to a new host.
Whoa whoa whoa, slow down! You don't just leap to "it's DNS"... you have to try to blame everything else first before you get to DNS. It's like foreplay!
(Or, as I recently encountered, it can also be a McAfee corporate firewall trying to be helpful by showing a download progress bar in place of an HTTP SSE stream. I was sure that was being caused by MTU, but alas no.)
Mostly this should be a non-event, since renewal happens long before expiration? Although it's a huge deal, I suppose, for services that need to issue new certificates constantly; a Let's Encrypt outage would be a major failure mode for them.
I encountered this while trying to issue a new certificate for a service. As a temporary fix, I started using ZeroSSL, which conveniently also supports the ACME protocol. While not a big problem, if you have something like `cert-manager` on Kubernetes, it requires quite a bit of reconfiguration, and you may spend a couple of hours trying to figure out why a certificate hasn't been issued yet.
That said, I'm unbelievably grateful for the great product (and work!) LetsEncrypt has provided for free. Hope they're able to get their infrastructure back up soon.
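For reference, the cert-manager reconfiguration amounts to a ClusterIssuer pointed at ZeroSSL's ACME endpoint, which additionally needs external account binding (EAB) credentials; a sketch where every name and secret ref is a placeholder:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: zerossl
spec:
  acme:
    server: https://acme.zerossl.com/v2/DV90
    email: you@example.com
    privateKeySecretRef:
      name: zerossl-account-key
    externalAccountBinding:
      keyID: your-eab-key-id        # issued by the ZeroSSL dashboard
      keySecretRef:
        name: zerossl-eab           # Secret holding the EAB HMAC key
        key: secret
    solvers:
      - http01:
          ingress:
            class: nginx
```

After that, it's a matter of switching each Certificate's `issuerRef` (or ingress annotation) from your Let's Encrypt issuer to `zerossl`.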
I think I am going to become a fan of shorter certificate lifetimes, because the sooner the chuckleheads in the CA/Browser Forum truly break the Internet on the level they are pushing for, the sooner we get to discard the entire PKI dumpster fire.
We're seeing a lot of downstream effects of this at StatusGator. Of course any provider that relies on LetsEncrypt to issue certs (such as Heroku) is affected.
One notable exception is Cloudflare: They famously no longer rely solely on LetsEncrypt.
Hopefully the thundering herd when service is restored doesn't knock things offline again. I know LE designs for huge throughput (something like 3X total outstanding certificates in 24 hours, at one point) and the automated client recommendations for backoff are pretty good, but there will be a lot of manual applications/renewals I'm sure.
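For the manual-renewal crowd, even crude client-side backoff helps avoid piling onto a recovering service. A generic sketch (attempt count and delays are arbitrary choices):

```shell
# Retry a command with exponential backoff: retry_with_backoff cmd [args...]
retry_with_backoff() {
    delay="${RETRY_BASE:-60}"   # initial delay in seconds
    for attempt in 1 2 3 4 5; do
        "$@" && return 0
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$(( delay * 2 ))
    done
    return 1
}
```

Usage would be something like `retry_with_backoff certbot renew --quiet`.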
Long expiration times = compromised certs that hang around longer than they should. It's bad.
Note that you can make your own self-signed CA certificate, create any server and client certificates you want signed with that CA cert, and deploy them whenever and wherever you want. Of course you want the root CA private key securely put somewhere and all that stuff.
The only reason it won't work at large scale without some friction is that your CA cert isn't in the default trusted root store of major browsers (phone and PC). It's easy enough to add it - it does pop up warnings and such on Windows, Android, iOS and presumably macOS, but those warnings exist for good reason.
No, it's not going to let the whole world do TLS with you warning-free without doing some sort of work, but for small scales (the type that Let's Encrypt is often used for anyway) it's fine.
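A minimal sketch of that workflow with the openssl CLI (names and lifetimes are examples only):

```shell
# 1. Create the CA key and a self-signed CA certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout ca.key -out ca.crt -subj "/CN=My Private CA"

# 2. Create a server key and a certificate signing request.
openssl req -newkey rsa:2048 -nodes \
    -keyout server.key -out server.csr -subj "/CN=myapp.internal"

# 3. Sign the CSR with the CA.
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
    -CAcreateserial -days 90 -out server.crt

# 4. Verify the chain.
openssl verify -CAfile ca.crt server.crt
```

ca.crt is what you'd import into each device's trust store; ca.key is the file to lock away. Note that modern browsers also insist on a subjectAltName extension on server certs, which this bare sketch omits (an `-addext` on the req plus extension copying at signing time, depending on your openssl version).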
The browsers and security people have been pushing towards shorter certs, not longer ones. Knowing how to rotate a cert every year, if not shorter, helps when your certificate or any of your parent certs are compromised and require an emergency rotation.
Last I knew, AWS would issue a free certificate to people using certain AWS services, but, as you say, only if Amazon is managing the private key. You can also use ACM APIs to import keys and certificates from other CAs.
The bigger question that's going unasked: what the hell is the point of an expiration date if it keeps getting shorter? At some point we will refresh the cert every second.
The whole point of the expiration is in case a hacker gets the private key to the cert and can then MITM, they can keep MITMing successfully until the cert the hacker gives to the clients expires (or was revoked by something like OCSP, assuming the client verifies OCSP). A very long expiration is very bad because it means the hacker could keep MITMing for years.
The way things like this work with modern security is ephemeral security tokens. Your program starts and it requests a security token, and it refreshes the token over X time (within 24 hrs). If a hacker gets the token, they can attack using it until 1) you notice and revoke the existing tokens AND sessions, or 2) the token expires (and we assume they for some reason don't have an advanced persistent threat in place).
Nobody puts any emphasis on the fact that 1) you have to NOTICE THE ATTACK AND REVOKE SHIT for any of these expirations to have any impact on security whatsoever, and 2) if they got the private key once, they can probably get it again after it expires, UNLESS YOU NOTICE AND PLUG THE HOLE. If you have nothing in place to notice a hacker has your private key, and if revocation isn't effective, the impact is exactly the same whether expiration is 1 second or 1 year.
How many people are running security scans on their whole stack every day? How many are patching security holes within a week? How many have advanced software designed to find rootkits and other exploits? Or any other measure to detect active attacks? My guess is maybe 0.0001% of you do. So you will never know when they gain access to your certs, so the fast expiration is mostly pointless.
We should be completely reinventing the whole protocol to be a token-based authorization service, because that's where it's headed. And we should be focusing more on mitigating active exploits rather than just hoping nobody ever exploits anything. But that would scare people, or require additional work. So instead we let like 3 companies slowly do whatever they want with the entire web in an uncoordinated way. And because we let them do whatever they want with the web, they keep introducing more failure modes and things get shittier. We are enabling the enshittification happening in front of our eyes.
The other benefit of expiration dates in a PKI is in case the subject information is no longer accurate.
In old-school X.509 PKI this might be "in case this person is no longer affiliated with the issuer" (for organizational PKI) or "in case this contact information for this person is otherwise no longer accurate".
In web PKI this might be "in case this person no longer controls this domain name" or "in case this person no longer controls this IP address".
The key-compromise issue you mention was more urgent for the web PKI before TLS routinely used ciphersuites providing forward secrecy. In that case, a private key compromise would allow the attacker to passively decrypt all TLS sessions during the lifetime of that private key. With more modern ciphersuites, a private key compromise allows the attacker to actively impersonate an endpoint for future sessions during the lifetime of that private key. This is comparatively much less catastrophic.
You've always been able to do this. Whether it's useful to your clients has always been the problem.
In a practical sense you likely wouldn't like the alternatives, because for most people's usage of the internet there's exactly one authority which matters: the local government and its legal system - i.e. most of my necessary use of TLS is for ecommerce. Which means the ultimate authority is "are you a trusted business entity in the local jurisdiction?"
Very few people would have any reason to ever expand the definition beyond this, and fewer would have the knowledge to do so safely even if we provided the interfaces - i.e. no one knows what safety numbers in Signal mean, if I can even get them to use Signal.
Maybe I'm misinterpreting this, but local government's legal system is not the "one authority which matters." What local government is able to keep up to date on TLS certificates?
Your users who visit your website and get a TLS warning are the authority to worry about, if you're running a business that needs security. Depending on what you're selling, that one user could be a gigantic chunk of your business. Showing your local government that you have a process in place to renew your TLS certificates and that your provider was down is most likely going to be more than enough to clear you of any claim of maliciousness or negligence (ignorantia juris non excusat). Obviously, different countries and jurisdictions have varying laws, but I highly doubt you'd be held liable during a major outage of a provider in such heavy use. Honestly, if you were held liable, or think you would be for this type of event, I'd think twice about operating from that location.
That ship has sailed. DNSSEC is not liked even a little bit.
Given that control over DNS is how domain validated certs are handed out, it would make a lot of sense to cut out the middle man.
But DNS does not have a good reliable authenticated transport mechanism. I wonder if there was a way to build this that would have worked.
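DANE (RFC 6698) was the attempt at exactly this: publish the certificate or public-key hash in DNS, protected by DNSSEC. A TLSA record looks like this (the digest is a placeholder):

```
_443._tcp.example.com. IN TLSA 3 1 1 <sha256-of-spki-in-hex>
```

Here "3 1 1" means DANE-EE (trust this end-entity certificate directly), matched on the SubjectPublicKeyInfo, by SHA-256 digest. Browsers never adopted it, largely because it depends on DNSSEC being deployed and validated end to end.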
My biggest problem is how centralized issuance is.
Half the year I live on an island that relies on submarine cables and has historically had weeks- and months-long outages, and with a changing world I suspect that might become reality once again. Locally this wasn't much of an issue: the ccTLD continues to function, and most services (though now only about 35%) are hosted locally. Then HTTPS comes along. Zero certificates could be (re-)issued during an outage. A locally run CA isn't really an option (standalone simply isn't feasible, and getting into root stores takes time and money), so you are left teaching users to ignore certificate errors a few weeks into an extended outage.
I could see someone like LE working with TLD registrars to enable local issuance (with delegated/sub-CA certificates restricted to the TLD), that could also mitigate problems like today (decentralize issuance) and the registrars are already the primary source of truth for DV validation.
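Technically, X.509 already supports that kind of scoping via name constraints; a hypothetical openssl extensions stanza for a sub-CA restricted to one TLD might look like:

```
[ tld_sub_ca ]
basicConstraints = critical, CA:TRUE, pathlen:0
keyUsage        = critical, keyCertSign, cRLSign
nameConstraints = critical, permitted;DNS:.tld
```

The hard part has never been the cryptography; it's getting such a CA accepted into root programs and keeping it audited.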
Realistically there's no reason except Google retaining centralized control of the Internet for there to be a specific group of trusted CAs that meet Google's arcane specifications which can issue certificates the entire world trusts.
Your registrar should be able to validate your ownership of the domain, ergo your registrar should be your CA. Instead of a bunch of arbitrary and capricious rules to be trusted, a CA should not be "trusted" by the browser, but only able to sign certificates for domains registered to it.
So, what gives?
Which is disappointing because you should be able to recreate the service they had nearly exactly with certificate transparency logs.
Shouldn't that happen automatically a bit beforehand?
It can send alerts to multiple alerting providers.
https://github.com/TwiN/gatus
1. https://github.com/louislam/uptime-kuma
2. https://github.com/caronc/apprise
There's no way it's DNS
It was DNS
1. Denial: It’s not DNS.
2. Anger: What the fuck is it!
3. Bargaining: Maybe it’s a firewall, or Cloudflare!
4. Depression: We’ve checked everything…
5. Acceptance: It’s DNS.
As they move to shorter-lifetime certs (6 days now https://letsencrypt.org/2025/01/16/6-day-and-ip-certs/?utm_s...) this puts it in the realm of possibility that an incident could impact long-running services.
Here is the HN announcement: https://news.ycombinator.com/item?id=8624160
Announcement "animated" https://hn.unlurker.com/replay?item=8624160
https://zerossl.com/ (90 days)
https://www.buypass.com/ (180 days)
Especially something that needs to be renewed every 90 days (or is it 40 now?). How about issuing 100-year certificates as the default?
https://cloud.google.com/certificate-manager/docs/public-ca-... (EDIT: Google is their own CA, with https://pki.goog/ )
Caddy uses ZeroSSL as a fallback if Let’s Encrypt fails!
I'm using Caddy here and it's not falling back on ZeroSSL. Thanks for your help
EDIT: hmm, it should be automatic...! https://caddyserver.com/docs/automatic-https#issuer-fallback interesting, I'll double check my config
woah... it's probably related to this! https://github.com/caddyserver/caddy/issues/7084 TLDR: "Caddy doesn't fall back to ZeroSSL for domains added using API" (which is my case)
Also, you have days to weeks of slack time for renewals. The only real impact is trying to issue new certs if you are solely dependent on LE.
Nope. So all that happened here is that you were wrong.