nixos/acme: improve scalability - reduce superfluous unit activations

The previous setup caused all renewal units to be triggered upon
ever so slight changes in config. In larger setups (100+ certificates)
adding a new certificate caused high system load and/or large memory
consumption issues. The memory issues are already a alleviated with
the locking mechanism. However, this then causes long delays upwards
of multiple minutes depending on individual runs and also caused
superfluous activations.

In this change we streamline the overall setup of units:

1. The unit that other services can depend upon is 'acme-{cert}.service'.
We call this the 'base unit'. As this one as `RemainAfterExit` set
the `acme-finished-{cert}` targets are not required any longer.

2. We now always generate initial self-signed certificates to simplify
the dependency structure. This deprecates the `preliminarySelfsigned`
option.

3. The `acme-order-renew-{cert}` service gets activated after the base
unit and services using certificates have started and performs all acme
interactions. When it finishes others services (like web servers) will
be notified through the `reloadServices` option or they can use
`wantedBy` and `after` dependencies if they implement their own reload
units.

The renewal timer also triggers this unit.

4. The timer unit is explicitly blocked from being started by s-t-c.

5. Permission management has been cleaned up a bit: there was an
   inconsistency between having the .lego files set to 600 vs 640
   on the exposed side. This is unified to 640 now.

6. Exempt the account target from being restarted by s-t-c. This will
   happen automatically if something relevant to the account changes.
This commit is contained in:
Christian Theune
2025-08-08 16:28:42 +02:00
parent dfe6a41c36
commit 2d0a489125
14 changed files with 382 additions and 278 deletions

View File

@@ -48,8 +48,6 @@ let
) (filter (hostOpts: hostOpts.enableACME || hostOpts.useACMEHost != null) vhosts);
vhostCertNames = unique (map (hostOpts: hostOpts.certName) acmeEnabledVhosts);
dependentCertNames = filter (cert: certs.${cert}.dnsProvider == null) vhostCertNames; # those that might depend on the HTTP server
independentCertNames = filter (cert: certs.${cert}.dnsProvider != null) vhostCertNames; # those that don't depend on the HTTP server
mkListenInfo =
hostOpts:
@@ -914,13 +912,14 @@ in
systemd.services.httpd = {
description = "Apache HTTPD";
wantedBy = [ "multi-user.target" ];
wants = concatLists (map (certName: [ "acme-finished-${certName}.target" ]) vhostCertNames);
wants = concatLists (map (certName: [ "acme-${certName}.service" ]) vhostCertNames);
after = [
"network.target"
]
++ map (certName: "acme-selfsigned-${certName}.service") vhostCertNames
++ map (certName: "acme-${certName}.service") independentCertNames; # avoid loading self-signed key w/ real cert, or vice-versa
before = map (certName: "acme-${certName}.service") dependentCertNames;
# Ensure httpd runs with baseline certificates in place.
++ map (certName: "acme-${certName}.service") vhostCertNames;
# Ensure httpd runs (with current config) before the actual ACME jobs run
before = map (certName: "acme-order-renew-${certName}.service") vhostCertNames;
restartTriggers = [ cfg.configFile ];
path = [
@@ -960,19 +959,17 @@ in
# postRun hooks on cert renew can't be used to restart Apache since renewal
# runs as the unprivileged acme user. sslTargets are added to wantedBy + before
# which allows the acme-finished-$cert.target to signify the successful updating
# which allows the acme-order-renew-$cert.service to signify the successful updating
# of certs end-to-end.
systemd.services.httpd-config-reload =
let
sslServices = map (certName: "acme-${certName}.service") vhostCertNames;
sslTargets = map (certName: "acme-finished-${certName}.target") vhostCertNames;
sslServices = map (certName: "acme-order-renew-${certName}.service") vhostCertNames;
in
mkIf (vhostCertNames != [ ]) {
wantedBy = sslServices ++ [ "multi-user.target" ];
# Before the finished targets, after the renew services.
# This service might be needed for HTTP-01 challenges, but we only want to confirm
# certs are updated _after_ config has been reloaded.
before = sslTargets;
after = sslServices;
restartTriggers = [ cfg.configFile ];
# Block reloading if not all certs exist yet.