Let's Encrypt stopped its certificate expiration email notification service a while ago, and I hadn't found a replacement yet. As a result, I didn't receive an expiration notice this time and failed to renew my certificate in advance. The certificate expired today, making my website inaccessible. I logged into my VPS to renew it manually, but the process failed every time. I then checked my cloud provider's platform and saw a notification at the top, which made me realize the problem was with the certificate provider. A quick look at Hacker News confirmed it: Let's Encrypt was having an outage. I want to post this news on my website, but I can't, because my site is down due to the expired certificate.
They had been communicating the end of the email notices for quite a while, telling users to put some other monitoring in place to avoid exactly this situation.
> Let's Encrypt stopped its certificate expiration email notification service a while ago, and I hadn't found a replacement yet.
This sounds like an easy problem to identify root cause for.
I think I received about 15 'we're disabling email notifications soon' emails over the past several months - one of which was interesting, but none were needed, as I'd originally set this up, per documentation, to auto-renew every 30 days.
Perhaps create a calendar reminder for the short term?
Haven't they always, from day one, insisted that their primary goal was to encourage (force) automation of certificate maintenance, as a mechanism to make tls ubiquitous (mandatory everywhere)?
Oof, you're right, that's rough that it's so soon after they discontinued their email service!
I wrote this blog post a few weeks ago: "Minimal, cron-ready scripts in Bash, Python, Ruby, Node.js (JavaScript), Go, and Powershell to check when your website's SSL certificate expires." https://heiioncall.com/blog/barebone-scripts-to-check-ssl-ce... which may be helpful if you want to roll your own.
(Disclosure: at Heii On-Call we also offer free SSL certificate expiration monitoring, among other things.)
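In the same spirit, here's a minimal cron-ready sketch for checking a cert you already have on disk (requires openssl; the path and day threshold are placeholders you'd adjust):

```shell
# Warn when a local certificate file is within N days of expiry.
# check_cert_expiry FILE [DAYS] -> exit 0 if FILE is valid for more than DAYS days
check_cert_expiry() {
    cert="$1"
    days="${2:-14}"
    if openssl x509 -checkend $(( days * 86400 )) -noout -in "$cert" >/dev/null 2>&1; then
        echo "OK: $cert valid for more than $days days"
    else
        echo "ALERT: $cert expires within $days days (or is unreadable)" >&2
        return 1
    fi
}
```

Run it from cron and wire the non-zero exit status into whatever alerting you already have.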
Due to some legacy reasons, my service runs using a docker + nginx setup. However, certbot was initially used in its native nginx mode to generate the certificate, which prevented it from auto-renewing. I later switched it to standalone mode, but I'm not sure if I configured the auto-renewal correctly. In any case, the certificate happened to expire today, and it didn't renew automatically. On a side note, I was actually planning to see what an expired website certificate looked like first and then deal with the auto-renewal issue. After all, it's just a small hobby website, so it's not that big of a deal.
That sounds like a "you're holding it wrong" type of situation to me. A major point of Let's Encrypt (besides the obvious "free") is that it deliberately keeps cert lifetimes short to avoid the "someone who no longer works here set this up two years ago" situation, with certbot checking twice a day and renewing when necessary. So breaking what Let's Encrypt is doing by not using certbot definitely feels like holding it wrong.
If it's a personal website you should consider HTTP+HTTPS. It offers the best of both worlds and your website would always be accessible even if some third party CA is not (or if there's some local issue, or if the HTTP client connecting has cert issues). MITM attacks on personal websites are extremely, extremely rare.
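With nginx, serving both is just a matter of listening on both ports without a forced redirect; a minimal sketch (names and paths are placeholders):

```nginx
server {
    listen 80;
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/ssl/example.com.pem;
    ssl_certificate_key /etc/ssl/private/example.com.key;

    root /var/www/example.com;
}
```

The key point is the absence of a `return 301 https://...` block, so plain-HTTP visitors are served directly even when the certificate is broken.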
Good time to note that Buypass offers free certificates over ACME. I have a few of my domains configured to use them instead of LetsEncrypt, just for redundancy and to ensure I have a working non-LE cert source in case LE suffers problems like this over a longer time period.
Example OpenBSD /etc/acme-client.conf:
    authority buypass {
            api url "https://api.buypass.com/acme/directory"
            account key "/etc/acme/buypass-privkey.pem"
            contact "mailto:youremail@example.com"
    }

    domain example.com {
            domain key "/etc/ssl/private/example.com.key"
            domain full chain certificate "/etc/ssl/example.com.pem"
            sign with buypass
    }
Cheers! They look like decent chaps, and being outside the US adds some certificate diversity. Are there other trustworthy ACME issuers out there?
A pity that acme-client(1) does not allow for fallbacks, but I will add a mental note about it being an easy enough patch to contribute if I ever find the time.
This is neat. Does cert-manager have facilities to automatically use a fallback ACME provider, so I could automate using this? I'd also accept a pool of ACME providers, but a priority ordering seems ideal. I don't see the functionality listed anywhere, maybe there's some security argument that this is a bad idea?
LOL. I just went through this the other day. My site was intermittently inaccessible. DNS was the last thing I thought it was, until I ran a crawler on my site and spotted some 404 errors. Found that my non-www URL was pointed at the wrong IP, and I had forgotten to update it when I transferred my domain to a new host.
Whoa whoa whoa, slow down! You don't just leap to "it's DNS"... you have to try to blame everything else first before you get to DNS. It's like foreplay!
(Or, as I recently encountered, it can also be a McAfee corporate firewall trying to be helpful by showing a download progress bar in place of an HTTP SSE stream. I was sure that was being caused by MTU, but alas no.)
Mostly this should be a non-event, since renewal happens long before expiration? Although it's a huge deal, I suppose, for services that need to issue new certificates constantly; a Let's Encrypt outage would be a major failure mode for them.
I encountered this while trying to issue a new certificate for a service. As a temporary fix, I started using ZeroSSL, which conveniently also supports the ACME protocol. While not a big problem, if you have something like `cert-manager` on Kubernetes, it requires quite a bit of reconfiguration, and you may spend a couple of hours trying to figure out why a certificate hasn't been issued yet.
That said, I'm unbelievably grateful for the great product (and work!) LetsEncrypt has provided for free. Hope they're able to get their infrastructure back up soon.
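For reference, the cert-manager reconfiguration amounts to a ClusterIssuer pointed at ZeroSSL's ACME endpoint, which additionally needs external account binding (EAB) credentials; a sketch where every name and secret ref is a placeholder:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: zerossl
spec:
  acme:
    server: https://acme.zerossl.com/v2/DV90
    email: you@example.com
    privateKeySecretRef:
      name: zerossl-account-key
    externalAccountBinding:
      keyID: your-eab-key-id        # issued by the ZeroSSL dashboard
      keySecretRef:
        name: zerossl-eab           # Secret holding the EAB HMAC key
        key: secret
    solvers:
      - http01:
          ingress:
            class: nginx
```

After that, it's a matter of switching each Certificate's `issuerRef` (or ingress annotation) from your Let's Encrypt issuer to `zerossl`.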
I think I am going to become a fan of shorter certificate lifetimes, because the sooner the chuckleheads in the CA/Browser Forum truly break the Internet on the level they are pushing for, the sooner we get to discard the entire PKI dumpster fire.
We're seeing a lot of downstream effects of this at StatusGator. Of course any provider that relies on LetsEncrypt to issue certs (such as Heroku) is affected.
One notable exception is Cloudflare: They famously no longer rely solely on LetsEncrypt.
Hopefully the thundering herd when service is restored doesn't knock things offline again. I know LE designs for huge throughput (something like 3X total outstanding certificates in 24 hours, at one point) and the automated client recommendations for backoff are pretty good, but there will be a lot of manual applications/renewals I'm sure.
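For the manual-renewal crowd, even crude client-side backoff helps avoid piling onto a recovering service. A generic sketch (attempt count and delays are arbitrary choices):

```shell
# Retry a command with exponential backoff: retry_with_backoff cmd [args...]
retry_with_backoff() {
    delay="${RETRY_BASE:-60}"   # initial delay in seconds
    for attempt in 1 2 3 4 5; do
        "$@" && return 0
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$(( delay * 2 ))
    done
    return 1
}
```

Usage would be something like `retry_with_backoff certbot renew --quiet`.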
Long expiration times = compromised certs that hang around longer than they should. It's bad.
Note that you can make your own self-signed CA certificate, create any server and client certificates you want signed with that CA cert, and deploy them whenever and wherever you want. Of course you want the root CA private key securely put somewhere and all that stuff.
The only reason it won't work at large scale without some friction is that your CA cert isn't in the default trusted root store of major browsers (phone and PC). It's easy enough to add it - it does pop up warnings and such on Windows, Android, iOS and presumably macOS, but those warnings exist for good reason.
No, it's not going to let the whole world do TLS with you warning-free without doing some sort of work, but for small scales (the type that Let's Encrypt is often used for anyway) it's fine.
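A minimal sketch of that workflow with the openssl CLI (names and lifetimes are examples only):

```shell
# 1. Create the CA key and a self-signed CA certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout ca.key -out ca.crt -subj "/CN=My Private CA"

# 2. Create a server key and a certificate signing request.
openssl req -newkey rsa:2048 -nodes \
    -keyout server.key -out server.csr -subj "/CN=myapp.internal"

# 3. Sign the CSR with the CA.
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
    -CAcreateserial -days 90 -out server.crt

# 4. Verify the chain.
openssl verify -CAfile ca.crt server.crt
```

ca.crt is what you'd import into each device's trust store; ca.key is the file to lock away. Note that modern browsers also insist on a subjectAltName extension on server certs, which this bare sketch omits (an `-addext` on the req plus extension copying at signing time, depending on your openssl version).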
The browsers and security people have been pushing towards shorter certs, not longer ones. Knowing how to rotate a cert every year, if not shorter, helps when your certificate or any of your parent certs are compromised and require an emergency rotation.
Last I knew, AWS would issue a free certificate to people using certain AWS services, but, as you say, only if Amazon is managing the private key. You can also use ACM APIs to import keys and certificates from other CAs.
The bigger question that's going unasked: what the hell is the point of an expiration date if it keeps getting shorter? At some point we will refresh the cert every second.
The whole point of the expiration is in case a hacker gets the private key to the cert and can then MITM, they can keep MITMing successfully until the cert the hacker gives to the clients expires (or was revoked by something like OCSP, assuming the client verifies OCSP). A very long expiration is very bad because it means the hacker could keep MITMing for years.
The way things like this work with modern security is ephemeral security tokens. Your program starts and it requests a security token, and it refreshes the token over X time (within 24 hrs). If a hacker gets the token, they can attack using it until 1) you notice and revoke the existing tokens AND sessions, or 2) the token expires (and we assume they for some reason don't have an advanced persistent threat in place).
Nobody puts any emphasis on the fact that 1) you have to NOTICE THE ATTACK AND REVOKE SHIT for any of these expirations to have any impact on security whatsoever, and 2) if they got the private key once, they can probably get it again after it expires, UNLESS YOU NOTICE AND PLUG THE HOLE. If you have nothing in place to notice a hacker has your private key, and if revocation isn't effective, the impact is exactly the same whether expiration is 1 second or 1 year.
How many people are running security scans on their whole stack every day? How many are patching security holes within a week? How many have advanced software designed to find rootkits and other exploits? Or any other measure to detect active attacks? My guess is maybe 0.0001% of you do. So you will never know when they gain access to your certs, so the fast expiration is mostly pointless.
We should be completely reinventing the whole protocol to be a token-based authorization service, because that's where it's headed. And we should be focusing more on mitigating active exploits rather than just hoping nobody ever exploits anything. But that would scare people, or require additional work. So instead we let like 3 companies slowly do whatever they want with the entire web in an uncoordinated way. And because we let them do whatever they want with the web, they keep introducing more failure modes and things get shittier. We are enabling the enshittification happening in front of our eyes.
The other benefit of expiration dates in a PKI is in case the subject information is no longer accurate.
In old-school X.509 PKI this might be "in case this person is no longer affiliated with the issuer" (for organizational PKI) or "in case this contact information for this person is otherwise no longer accurate".
In web PKI this might be "in case this person no longer controls this domain name" or "in case this person no longer controls this IP address".
The key-compromise issue you mention was more urgent for the web PKI before TLS routinely used ciphersuites providing forward secrecy. In that case, a private key compromise would allow the attacker to passively decrypt all TLS sessions during the lifetime of that private key. With more modern ciphersuites, a private key compromise allows the attacker to actively impersonate an endpoint for future sessions during the lifetime of that private key. This is comparatively much less catastrophic.
You've always been able to do this. Whether it's useful to your clients has always been the problem.
In a practical sense you likely wouldn't like the alternatives, because for most people's usage of the internet there's exactly one authority which matters: the local government and its legal system - i.e. most of my necessary use of TLS is for ecommerce. Which means the ultimate authority is "are you a trusted business entity in the local jurisdiction?"
Very few people would have any reason to ever expand the definition beyond this, and fewer would have the knowledge to do so safely even if we provided the interfaces - i.e. no one knows what safety numbers in Signal mean, if I can even get them to use Signal.
Maybe I'm misinterpreting this, but local government's legal system is not the "one authority which matters." What local government is able to keep up to date on TLS certificates?
Your users who visit your website and get a TLS warning are the authority to worry about, if you're running a business that needs security. Depending on what you're selling, that one user could be a gigantic chunk of your business. Showing your local government that you have a process in place to renew your TLS certificates and that your provider was down is most likely going to be more than enough to clear you of any claim of maliciousness or negligence (ignorantia juris non excusat). Obviously, different countries and jurisdictions have varying laws, but I highly doubt you'd be held liable during a major outage of a provider in such heavy use. Honestly, if you were held liable, or think you would be for this type of event, I'd think twice about operating from that location.
That ship has sailed. DNSSEC is not liked even a little bit.
Given that control over DNS is how domain validated certs are handed out, it would make a lot of sense to cut out the middle man.
But DNS does not have a good reliable authenticated transport mechanism. I wonder if there was a way to build this that would have worked.
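DANE (RFC 6698) was the attempt at exactly this: publish the certificate or public-key hash in DNS, protected by DNSSEC. A TLSA record looks like this (the digest is a placeholder):

```
_443._tcp.example.com. IN TLSA 3 1 1 <sha256-of-spki-in-hex>
```

Here "3 1 1" means DANE-EE (trust this end-entity certificate directly), matched on the SubjectPublicKeyInfo, by SHA-256 digest. Browsers never adopted it, largely because it depends on DNSSEC being deployed and validated end to end.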
My biggest problem is how centralized issuance is.
Half the year I live on an island that relies on submarine cables and has historically had weeks- and months-long outages, and with a changing world I suspect that might become reality once again. Locally this wasn't much of an issue: the ccTLD continues to function, and most services (though now only about 35%) are hosted locally. Then HTTPS comes along. Zero certificates could be (re-)issued during an outage. A locally run CA isn't really an option (standalone simply isn't feasible, and getting into root stores takes time and money), so you are left teaching users to ignore certificate errors a few weeks into an extended outage.
I could see someone like LE working with TLD registrars to enable local issuance (with delegated/sub-CA certificates restricted to the TLD), that could also mitigate problems like today (decentralize issuance) and the registrars are already the primary source of truth for DV validation.
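Technically, X.509 already supports that kind of scoping via name constraints; a hypothetical openssl extensions stanza for a sub-CA restricted to one TLD might look like:

```
[ tld_sub_ca ]
basicConstraints = critical, CA:TRUE, pathlen:0
keyUsage        = critical, keyCertSign, cRLSign
nameConstraints = critical, permitted;DNS:.tld
```

The hard part has never been the cryptography; it's getting such a CA accepted into root programs and keeping it audited.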
Realistically there's no reason except Google retaining centralized control of the Internet for there to be a specific group of trusted CAs that meet Google's arcane specifications which can issue certificates the entire world trusts.
Your registrar should be able to validate your ownership of the domain, ergo your registrar should be your CA. Instead of a bunch of arbitrary and capricious rules to be trusted, a CA should not be "trusted" by the browser, but only able to sign certificates for domains registered to it.
So, what gives?
Which is disappointing because you should be able to recreate the service they had nearly exactly with certificate transparency logs.
Shouldn't that happen automatically a bit beforehand?
It can send alerts to multiple alerting providers.
https://github.com/TwiN/gatus
1. https://github.com/louislam/uptime-kuma
2. https://github.com/caronc/apprise
There's no way it's DNS
It was DNS
1. Denial: It’s not DNS.
2. Anger: What the fuck is it!
3. Bargaining: Maybe it’s a firewall, or Cloudflare!
4. Depression: We’ve checked everything…
5. Acceptance: It’s DNS.
As they move to shorter-lifetime certs (6 days now https://letsencrypt.org/2025/01/16/6-day-and-ip-certs/?utm_s...) this puts it in the realm of possibility that an incident could impact long-running services.
Here is the HN announcement: https://news.ycombinator.com/item?id=8624160
Announcement "animated" https://hn.unlurker.com/replay?item=8624160
https://zerossl.com/ (90 days)
https://www.buypass.com/ (180 days)
Especially something that needs to be renewed every 90 days (or is it 40 now?). How about issuing 100-year certificates as the default?
https://cloud.google.com/certificate-manager/docs/public-ca-... (EDIT: Google is their own CA, with https://pki.goog/ )
Caddy uses ZeroSSL as a fallback if Let’s Encrypt fails!
I'm using Caddy here and it's not falling back on ZeroSSL. Thanks for your help
EDIT: hmm, it should be automatic...! https://caddyserver.com/docs/automatic-https#issuer-fallback interesting, I'll double check my config
woah... it's probably related to this! https://github.com/caddyserver/caddy/issues/7084 TLDR: "Caddy doesn't fall back to ZeroSSL for domains added using API" (which is my case)
Also, you have days to weeks of slack time for renewals. The only real impact is trying to issue new certs if you are solely dependent on LE.
Nope. So all that happened here is that you were wrong.