Skip to content

[DoH][TLS Client Authentication] Reload expired client certificates #2114

@mjdiffy

Description

@mjdiffy

Output of the following commands:

# Others are less relevant here
➜  dnscrypt-proxy-macos_x86_64-2.1.1 ./dnscrypt-proxy -version
2.1.1

What is affected by this bug?

I have a DoH client server which only uses short-lived client certificates (2 week validity) when connecting to a mutually authenticated DoH server. These certificates are renewed weekly, replacing the existing file(s). However, since dnscrypt-proxy does not refresh its client credentials, all DoH requests fail due to TLS errors once the client certificate has expired, provided the upstream is enforcing valid client certificates. This means that my client only has between 1-2 weeks of use before I need to restart the dnscrypt-proxy process to pick up the new certificate.

When does this occur?

Upon startup, dnscrypt-proxy loads the client certificate into memory when building the xtransport. However, they are never updated after that. Should the certificate in memory expire, all requests which use it will fail. There are a few places where the transport is rebuilt, but none trigger on this particular error (only on "handshake failure").

Where does it happen?

It happens when a DoH request is being performed while the client certificate loaded into memory has expired.
The relevant error handling will likely take place here:

if err != nil {
dlog.Debugf("[%s]: [%s]", req.URL, err)
if xTransport.tlsCipherSuite != nil && strings.Contains(err.Error(), "handshake failure") {
dlog.Warnf(
"TLS handshake failure - Try changing or deleting the tls_cipher_suite value in the configuration file",
)
xTransport.tlsCipherSuite = nil
xTransport.rebuildTransport()
}
return nil, statusCode, nil, rtt, err
}

How do we replicate the issue?

  1. Generate a short-lived certificate (as short as a few minutes, if you'd like)
  2. Start dnscrypt-proxy with this client certificate, targeting a DoH server which supports mutual authentication and enforces certificate validity
  3. Generate a replacement certificate, with a later expiration date
  4. Wait until the original certificate has expired
  5. All requests should now fail, with an error similar to the following found in the (verbose) logs:
[2022-05-13 16:47:19] [DEBUG] [https://<redacted>/dns-query?dns=<redacted>]: [Get "https://<redacted>/dns-query?dns=<redacted>": remote error: tls: bad certificate]

Expected behavior (i.e. solution)

There are likely (at least) two ways to go about solving this issue:

  1. (Harder) Monitor the certificate files, reloading them when they update
  2. (Easier) Rebuild the transport upon "bad certificate", similar to what is done today when "handshake failure" is encountered

Other Comments

As a stopgap, I have updated my DoH endpoints to accept expired client certificates for now, but this is obviously not the ideal end state here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions