| 1 | +--- |
| 2 | +layout: "@/layouts/global.astro" |
| 3 | +title: "TLS and QUIC: A Masochist's Guide" |
| 4 | +author: kixelated |
| 5 | +description: Setting up TLS is a pain, but it's a requirement for HTTPS. QUIC and WebTransport introduce even more pain. We like pain, right? |
| 6 | +cover: "/blog/tls-and-quic/warning.png" |
| 7 | +date: 2025-07-28 |
| 8 | +--- |
| 9 | + |
| 10 | + |
| 11 | +# TLS and QUIC: A Masochist's Guide |
| 12 | +I hope you're having a great day. |
| 13 | +The sun is about to shine brighter. |
| 14 | +I was inspired by Helios himself to write about the riveting topic of TLS and QUIC. |
| 15 | + |
| 16 | +In my opinion, the most difficult part about QUIC is setting up a different protocol. |
| 17 | +QUIC *requires* TLS. |
| 18 | +There's no way to disable encryption and only some clients let you circumvent certificate validation. |
| 19 | +If you screw it up, you'll get a scary WARNING screen and users won't be able to connect. |
| 20 | + |
| 21 | +<figure> |
| 22 | +  |
| 23 | + <figcaption>I'm sick, so this is the only art you're getting today.</figcaption> |
| 24 | +</figure> |
| 25 | + |
| 26 | +Most of this guide applies to HTTPS in general, which makes sense as HTTP/3 uses QUIC. |
| 27 | +The same goes for WebTransport, as it's layered on top of HTTP/3, but there are some important distinctions at the end... |
| 28 | + |
| 29 | +## tl;dr |
| 30 | +- **TLS authenticates** who is allowed to serve `example.com` via certificates. |
| 31 | +- **Root CAs** issue certificates to those who can prove they control the domain via DNS. |
| 32 | +- **Cloud providers** don't support (non-HTTP/3) QUIC yet, ruling out the easy options. |
| 33 | +- TLS is annoying for **local development**, _especially_ for WebTransport. |
| 34 | + |
| 35 | +## About Me |
| 36 | +I'm not a security engineer but I have dabbled in the low-level protocols. |
| 37 | +For reasons unbeknownst to me, I've implemented both DTLS 1.2 (for WebRTC) and TLS 1.3 (for QUIC)... in Go. |
| 38 | +Both were undoubtedly insecure but somehow passed the security audit and served production traffic at Twitch. |
| 39 | +But then I left and those servers rightfully got the `rm -rf` treatment. |
| 40 | + |
| 41 | +When in doubt, always refer to the nerds who take security seriously and use the correct terminology. |
| 42 | +I understand a lot of the security primitives but I don't exactly have the LinkedIn Professional Certificates to back it up. |
| 43 | + |
| 44 | + |
| 45 | +## Why TLS? |
| 46 | +*Some boring background, feel free to skip ahead.* |
| 47 | + |
| 48 | +TLS is a client-server protocol that is used to verify the identity of the server (and optionally the client: mTLS) before establishing an encrypted connection. |
| 49 | +When a client connects to `example.com`, the server transmits proof that it "owns" `example.com`. |
| 50 | +This proof is in the form of a TLS certificate (technically [X.509](https://en.wikipedia.org/wiki/X.509)) which can be used to "sign" stuff by solving a math problem that is super super difficult without access to a "private key". |
| 51 | +TLS certificates can be used to sign other TLS certificates creating a "chain of trust". |
| 52 | + |
| 53 | +Without TLS, an attacker could intercept your traffic and pretend to be `example.com` to harvest your credentials, called a "man-in-the-middle" (MITM) attack. |
| 54 | +Imagine if your router, ISP, or fellow coffee shop customer could pretend to be your bank. |
| 55 | +Bad times ahead. |
| 56 | +The fact that the private key is (virtually) unguessable is the only reason why I haven't taken out a second mortgage in your name. |
| 57 | + |
| 58 | +QUIC requires [TLS 1.3](https://datatracker.ietf.org/doc/html/rfc8446). |
| 59 | +It's good stuff, much better than TLS 1.2 in my opinion. |
| 60 | +There is no way to disable encryption but depending on the client, you can modify certificate validation. |
| 61 | +The IETF grey-beards decreed that the protocol can never be insecure, lest a lowly application developer shoot themselves in the foot. |
| 62 | + |
| 63 | +- HTTP/3, WebTransport, and of course, [Media over QUIC](https://quic.video) all use QUIC under the hood. |
| 64 | +- HTTP/2 *technically* does not require TLS but browsers require it as a forcing function. |
| 65 | +- HTTP/1 is the lone exception, allowing you to choose if you want to connect to `http://example.com` (insecure) or `https://example.com` (TLS). |
| 66 | + |
| 67 | +But even if you choose to use an insecure `http://` connection, it prevents you from using [newer browser APIs](https://developer.mozilla.org/en-US/docs/Web/Security/Secure_Contexts/features_restricted_to_secure_contexts). |
| 68 | +Want to notify a user when their Hot Pocket® has finished cooking? |
| 69 | +Then you need to use HTTPS, lest your Hot Pocket® Tracker™ get compromised by a state actor. |
| 70 | + |
| 71 | +So yeah, browser vendors don't want you to make a security oopsie whoopsie and thus, effectively mandate TLS. |
| 72 | +Suck it up and let some cloud provider handle TLS for you... unless you're one of the early birds using QUIC... |
| 73 | + |
| 74 | +*ominous foreshadowing* |
| 75 | + |
| 76 | + |
| 77 | +# Certificate Validation |
| 78 | +At a high level, a TLS connection is established via: |
| 79 | + |
| 80 | +1. The client connects to `example.com` and sends a `ClientHello`. |
| 81 | + - The client (usually) sends the domain via a [SNI extension](https://en.wikipedia.org/wiki/Server_Name_Indication). |
| 82 | +2. The server transmits a `ServerHello` along with a TLS certificate signed for `example.com`. |
| 83 | + - The server uses SNI to choose the certificate if it hosts multiple websites. |
| 84 | +3. The client verifies that the TLS certificate is for `example.com`. |
| 85 | + - It must have been signed by a "root CA", or a certificate signed by a "root CA", or a certificate signed by a certificate signed by a "root CA"... |
| 86 | + |
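To make step 3 concrete, here's a toy sketch of walking a chain of trust in Python. It's a mental model, not real validation: there's no cryptography, no expiry, and no revocation. The `Cert` class, the `verify` function, and every certificate name are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # who the certificate is for
    issuer: str   # who signed it


def verify(hostname: str, leaf: Cert, intermediates: list[Cert],
           trusted_roots: set[str], max_depth: int = 8) -> bool:
    """Check the leaf is for hostname, then follow issuers up to a trusted root."""
    if leaf.subject != hostname:
        return False  # right chain, wrong website
    by_subject = {cert.subject: cert for cert in intermediates}
    current = leaf
    for _ in range(max_depth):
        if current.issuer in trusted_roots:
            return True
        parent = by_subject.get(current.issuer)
        if parent is None:
            return False  # broken chain: nobody we know vouches for the issuer
        current = parent
    return False  # give up on suspiciously long (or looping) chains


leaf = Cert(subject="example.com", issuer="Example Intermediate")
intermediate = Cert(subject="Example Intermediate", issuer="Example Root CA")
roots = {"Example Root CA"}

print(verify("example.com", leaf, [intermediate], roots))  # True
print(verify("example.com", leaf, [], roots))              # False
print(verify("bank.com", leaf, [intermediate], roots))     # False
```

A real client also checks each signature, the validity dates, and what each certificate is allowed to sign.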
| 87 | + |
| 88 | +What is a root CA? |
| 89 | +Your browser and operating system ship with a list of trusted entities that are authorized to issue certificates. |
| 90 | +It's like having a list of approved locksmith companies that can make keys for any house. |
| 91 | +The list changes over time as these companies are audited or catastrophically compromised. |
| 92 | + |
| 93 | +*Fun fact*, this is why you may have to explicitly `apt update && apt install ca-certificates` when setting up a Docker image. |
| 94 | +Otherwise, if the base image contained certificate roots, it would get super stale. |
| 95 | + |
| 96 | +One of the interesting things about TLS is that the client can choose how to verify the provided TLS certificate. |
| 97 | +There's no requirement that you use these provided root CAs, or a root CA at all. |
| 98 | +You can make the protocol as secure/insecure as you want if you have enough control over the client. |
| 99 | + |
| 100 | +So in order to run a QUIC server on the internet, you need to get access to a certificate signed for a specific domain name (or IP address). |
| 101 | +If your server listens on `example.com`, then you need to prove to some root CA that you own `example.com`. |
| 102 | + |
| 103 | + |
| 104 | +## Cloud Offerings |
| 105 | +The easiest option unfortunately doesn't work for QUIC. |
| 106 | +Poop. |
| 107 | + |
| 108 | +Virtually every cloud provider offers HTTPS support, often via their own root CA. |
| 109 | +You point your domain name at their load balancer and they procure a TLS certificate. |
| 110 | +But I glossed over something: there are actually two parts to a "TLS certificate", the private key and the certificate. |
| 111 | + |
| 112 | +None of these services actually give you, the customer, the private key. |
| 113 | +Instead, they run an HTTPS or TLS load balancer that terminates TLS and proxies unencrypted traffic to your backend. |
| 114 | +It's done for simplicity and security: they don't want you having access to the keys. |
| 115 | + |
| 116 | +This load balancing approach won't work for QUIC until cloud providers start offering QUIC load balancers (one day). |
| 117 | +Welcome to UDP protocols; they're barely supported on cloud platforms because TCP and HTTP are so widespread. |
| 118 | +At least there's hope for QUIC support in the future because it powers HTTP/3. |
| 119 | + |
| 120 | + |
| 121 | +## LetsEncrypt |
| 122 | +The recommended path is to use the glorious [LetsEncrypt](https://letsencrypt.org/) to get a certificate. |
| 123 | +It's free, it's painless, and it's highly recommended by yours truly. |
| 124 | +There are paid offerings too, but there's really not much point using them since LetsEncrypt became a thing. |
| 125 | + |
| 126 | +How this works is that you need to somehow prove to LetsEncrypt that you own a specific domain and then they'll give you a certificate valid for 90 days. |
| 127 | +"Owning" a domain in this context means you control the ability to add DNS records. |
| 128 | +This could be in the form of an A record that points to an IP address you own or a TXT record with some token. |
| 129 | + |
| 130 | +There are a few different [challenge types](https://letsencrypt.org/docs/challenge-types/): |
| 131 | +- *HTTP-01*: Host an HTTP endpoint (insecure) that returns a specified token on the specified path. You own the domain if it points to your HTTP server. |
| 132 | +- *DNS-01*: Create a DNS record with a specified token. You own the domain if you can create this TXT record. The server doesn't even have to be running. |
| 133 | +- *TLS-ALPN-01*: Host a TLS endpoint that returns a specified token during the TLS handshake. Same idea as HTTP-01, but more convenient for TLS load balancers. |
| 134 | + |
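For the curious, here's roughly what the HTTP-01 and DNS-01 challenges ask you to publish, following the ACME spec (RFC 8555). This is a sketch, not a client: the function names are mine, and the `token` and `account_thumbprint` values are placeholders that a real ACME library would receive from the CA and derive from your account key.

```python
import base64
import hashlib


def key_authorization(token: str, account_thumbprint: str) -> str:
    # The challenge token joined to the thumbprint of your ACME account key.
    return f"{token}.{account_thumbprint}"


def http01_challenge(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return (URL the CA will fetch, body your HTTP server must return)."""
    url = f"http://{domain}/.well-known/acme-challenge/{token}"
    return url, key_authorization(token, account_thumbprint)


def dns01_challenge(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return (TXT record name, TXT record value)."""
    key_auth = key_authorization(token, account_thumbprint).encode("ascii")
    digest = hashlib.sha256(key_auth).digest()
    # base64url without padding, as ACME requires.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Placeholder values, purely for illustration.
print(http01_challenge("example.com", "some-token", "some-thumbprint"))
print(dns01_challenge("example.com", "some-token", "some-thumbprint"))
```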
| 135 | +This cert generation can be automated via [certbot](https://certbot.eff.org/) or any [ACME library](https://letsencrypt.org/docs/client-options/#libraries). |
| 136 | +The main downside of LetsEncrypt is the 90 day expiration. |
| 137 | +That means you actually have to be a good developer and worry about certificate rotations instead of punting it to future you. |
| 138 | +In practice, you need to add a way to reload certificates, or periodically hit the good old `restart` button. |
| 139 | + |
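The rotation logic itself is not scary. Here's a minimal sketch of the "is it time to renew?" check, assuming a 30-day renewal window. That threshold is my assumption for illustration, not a LetsEncrypt rule, and `should_renew` is a made-up helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: start renewing once 30 days or fewer remain on the certificate.
RENEW_BEFORE = timedelta(days=30)


def should_renew(not_after: datetime, now: datetime,
                 renew_before: timedelta = RENEW_BEFORE) -> bool:
    """True when the certificate is inside its renewal window or already expired."""
    return not_after - now <= renew_before


issued = datetime(2025, 7, 28, tzinfo=timezone.utc)
not_after = issued + timedelta(days=90)  # 90-day certificate

print(should_renew(not_after, issued + timedelta(days=10)))  # False
print(should_renew(not_after, issued + timedelta(days=61)))  # True
```

Run something like this on a timer and trigger your ACME client plus a certificate reload whenever it returns `True`.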
| 140 | +## ACME |
| 141 | +It would be remiss of me to not mention that LetsEncrypt uses [ACME v2](https://en.wikipedia.org/wiki/Automatic_Certificate_Management_Environment). |
| 142 | +You don't need to use `certbot` and in fact you could integrate ACME directly into your workflow. |
| 143 | + |
| 144 | +In fact, I'm using a [Terraform provider](https://github.com/vancluever/terraform-provider-acme) to generate certificates for `relay.quic.video`. |
| 145 | +It's **not recommended** because if I forget to run terraform every so often, then oops, my certificate will expire and users can no longer connect to my server. |
| 146 | +But it's easy and it works for now so I'm embracing my folly. |
| 147 | + |
| 148 | +I'd like to add [ACME support](https://crates.io/crates/instant-acme) within `moq-relay` itself to provision a TLS certificate on startup and automatically refresh it. |
| 149 | +This makes it a lot easier to use cloud providers as you don't need to fumble with background services (in Docker, ew) and reloading the server whenever a new certificate is generated. |
| 150 | +The downside is being unable to generate wildcard certificates as those require DNS challenges. |
| 151 | + |
| 152 | + |
| 153 | +# Private Networks |
| 154 | +Anyway, we've talked a lot about the internet, but what about the intranet? |
| 155 | + |
| 156 | +The problem with a service like LetsEncrypt is that most challenge types require our server to be publicly reachable. |
| 157 | +What if we're running our own private network or just developing an application locally? |
| 158 | +If LetsEncrypt can't connect to our private network, then HTTP-01 and TLS-ALPN-01 are off the table, leaving only DNS-01. |
| 159 | + |
| 160 | +Additionally, LetsEncrypt requires a domain name or a public IP. |
| 161 | +These aren't expensive but it's not something you can reasonably ask developers to purchase just to try running your code. |
| 162 | + |
| 163 | +Remember when I said the client was responsible for verifying a certificate however it sees fit? |
| 164 | +If we control the QUIC client, then we can change that behavior. |
| 165 | + |
| 166 | +**NOTE**: None of this currently applies to WebTransport as we don't have enough control of the browser client. |
| 167 | +See the next section! |
| 168 | + |
| 169 | + |
| 170 | +## Disable Verification |
| 171 | +The most obvious thing the client can do is skip verification altogether. |
| 172 | +There's usually a **DANGER** warning associated and **DANGER** indeed. |
| 173 | + |
| 174 | +If you skip certificate validation, your connection will still be encrypted but now it's vulnerable to a MITM attack. |
| 175 | +The server still has to present a certificate, but the client will blindly accept *any* certificate. |
| 176 | +Even if it was generated nanoseconds ago, is riddled with typos, and claims to be `porhnub.com`: doesn't matter. |
| 177 | + |
| 178 | +`moq-relay` supports the [--tls-disable-verify](https://github.com/kixelated/moq/blob/becf50263488ded6bb26c4cbc3d1ffd14ab11f5b/rs/moq-native/src/client.rs#L22) flag by [jumping through a few hoops with rustls](https://github.com/kixelated/moq/blob/becf50263488ded6bb26c4cbc3d1ffd14ab11f5b/rs/moq-native/src/client.rs#L202). |
| 179 | +There are similar flags in most CLI tools, like curl's `--insecure` flag. |
| 180 | + |
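Here's what "skip verification" looks like in Python's standard `ssl` module. Python's stdlib doesn't speak QUIC, so this is plain TLS over TCP, but the idea is the same as the flags above. `insecure_client_context` is a name I made up.

```python
import ssl


def insecure_client_context() -> ssl.SSLContext:
    """DANGER: still encrypts traffic, but accepts any certificate from anyone."""
    ctx = ssl.create_default_context()
    # Order matters: hostname checking must be off before CERT_NONE is allowed.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


ctx = insecure_client_context()
print(ctx.verify_mode == ssl.CERT_NONE, ctx.check_hostname)  # True False
```

Two lines is all it takes to become MITM-able, which is why it's usually hidden behind a scary flag.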
| 181 | + |
| 182 | +## Custom Root CAs |
| 183 | +Instead of using a "trusted" root CA that ships with the browser or operating system, we can use our own. |
| 184 | +It's trivial to generate a root CA which can then be used to sign certificates. |
| 185 | +No more depending on FAANG, we are our own auditor now. |
| 186 | + |
| 187 | +``` |
| 188 | +openssl req -new -x509 -days 365 -nodes -out ca-cert.pem -keyout ca-key.pem -subj "/C=US/ST=CA/L=San Francisco/O=CoolCidsClub/OU=DaveLaptop/CN=cool.bro" |
| 189 | +``` |
| 190 | + |
| 191 | +This is arguably safer than using public roots because our own client can be configured to ONLY accept our root CA. |
| 192 | +It doesn't matter if one of the many public CAs gets hacked as long as our root is kept under lock and key. |
| 193 | +But note that you're now responsible for stuff like revoking certificates if you truly care about the securities. |
| 194 | + |
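As a rough sketch of "ONLY accept our root CA", here's how that looks with Python's standard `ssl` module (TLS over TCP, since the stdlib has no QUIC). The function names are mine, and `ca_cert_path` is assumed to point at something like the `ca-cert.pem` generated above.

```python
import ssl


def empty_trust_context() -> ssl.SSLContext:
    """A client context that verifies certificates but trusts no CA yet."""
    # Unlike ssl.create_default_context(), this does not load the system roots.
    return ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


def private_ca_context(ca_cert_path: str) -> ssl.SSLContext:
    """Trust only the root CA stored at ca_cert_path; public CAs are ignored."""
    ctx = empty_trust_context()
    ctx.load_verify_locations(cafile=ca_cert_path)
    return ctx


ctx = empty_trust_context()
print(ctx.cert_store_stats()["x509_ca"])  # 0
```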
| 195 | +The catch is that we need to configure clients to trust our root CA. |
| 196 | +This normally requires admin/root access, and can be done at an operating system or browser level. |
| 197 | +But you only have to do it once per root CA and then you're golden. |
| 198 | + |
| 199 | +This is the secret behind a tool like [mkcert](https://github.com/FiloSottile/mkcert). |
| 200 | +It's a tool that allows you to use `https` in local development seemingly via black magic. |
| 201 | +The first time you run it, `mkcert` generates a root CA and adds it to the system and browser's trusted roots. |
| 202 | +Afterwards, you can freely generate new leaf certificates on demand (without root) that are automatically trusted. |
| 203 | + |
| 204 | +Custom roots are often used in enterprise and VPN software. |
| 205 | +When you install the VPN, or as part of a managed IT solution, it'll include some root CAs for you to trust. |
| 206 | + |
| 207 | +**Side note:** I highly recommend using private CAs and mTLS for service-to-service connections. |
| 208 | +It's about as `dank` as one can get when designing distributed systems. |
| 209 | +There are cloud offerings available ([AWS](https://aws.amazon.com/private-ca/)) and they actually give you the private key, so it works for QUIC. |
| 210 | + |
| 211 | + |
| 212 | +## Certificate Hashes |
| 213 | +WebRTC also uses TLS (technically DTLS) even when establishing a peer-to-peer connection. |
| 214 | +How does this work? |
| 215 | + |
| 216 | +Both peers generate an ECDSA certificate (or RSA I guess) and compute its SHA-256 hash. |
| 217 | +They then send the hash as part of the SDP exchange to some secure middle-man (usually an HTTPS server). |
| 218 | +Yes, you do need a server even when establishing a peer-to-peer connection unless you're a freak who exchanges TLS certificates via USB drive. |
| 219 | + |
| 220 | +The peers draw straws and one of them assumes the role of the ~bottom~ server for the TLS handshake. |
| 221 | +Both sides transmit a TLS certificate (mTLS) and verify that the hash matches the exchanged hash. |
| 222 | +Ta-da, connection established, ignoring all of the ICE shenanigans. |
| 223 | + |
| 224 | +The same approach can be used for QUIC both peer-to-peer and client-to-server. |
| 225 | +We're effectively just trusting individual certificates (by hash) instead of a root CA. |
| 226 | +Just like root CAs, you **need** to secure the transfer of trusted certificates otherwise you're vulnerable to MITM attacks. |
| 227 | + |
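The core of hash-based trust fits in a few lines. Here's a minimal Python sketch, assuming the hash is SHA-256 over the DER encoding of the certificate and that you already received the pinned hash over a secure channel. Both helper names are made up.

```python
import hashlib
import hmac
import ssl


def certificate_sha256(pem: str) -> bytes:
    """SHA-256 over the DER encoding of a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).digest()


def matches_pinned_hash(presented_der: bytes, pinned_hash: bytes) -> bool:
    """Accept the presented certificate only if its hash equals the pinned one."""
    presented = hashlib.sha256(presented_der).digest()
    # Constant-time comparison, out of habit more than necessity.
    return hmac.compare_digest(presented, pinned_hash)
```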
| 228 | +Here's some rustls configuration that [validates certificates based on hash](https://github.com/kixelated/web-transport-rs/blob/3e656ca4e89c60c6c3b45fda6e4c67db7c9b2ec2/web-transport-quinn/src/client.rs#L232). |
| 229 | +It's not the prettiest code but it works. |
| 230 | + |
| 231 | + |
| 232 | +# WebTransport |
| 233 | +Unfortunately, Chrome's implementation of WebTransport leaves a lot to be desired. |
| 234 | +Rant incoming, grab some popcorn. |
| 235 | + |
| 236 | +**NOTE**: Firefox is spared from this rant because I haven't tested it. |
| 237 | +Safari is spared because they haven't implemented WebTransport yet... |
| 238 | + |
| 239 | + |
| 240 | +## Disable Certificate Validation |
| 241 | +There's a Chrome flag that apparently lets you disable certificate validation for WebTransport: [chrome://flags/#webtransport-developer-mode](chrome://flags/#webtransport-developer-mode) |
| 242 | +>When enabled, removes the requirement that all certificates used for WebTransport over HTTP/3 are issued by a known certificate root. – Mac, Windows, Linux, ChromeOS, Android |
| 243 | + |
| 244 | +If the description is to be trusted, this would mean disabling certificate validation on every website (that uses WebTransport) which is just a horrific thought. |
| 245 | +This is the equivalent of silently disabling `https` on every website via a benign developer flag. |
| 246 | +I sincerely hope that the description is just wrong and this only applies to `localhost` or something; somebody should test it. |
| 247 | + |
| 248 | +If you're having trouble with the TLS handshake then absolutely turn it on and **don't forget to turn it off afterwards**. |
| 249 | +Not many sites use WebTransport right now but it would be super awkward when they do. |
| 250 | + |
| 251 | + |
| 252 | +## Custom Roots |
| 253 | +Chrome currently doesn't support custom root CAs for WebTransport. |
| 254 | +I've reported the issue multiple times to the WebTransport developer but it's apparently by design? |
| 255 | + |
| 256 | +It's quite baffling, because you can use custom roots for HTTP/3 but not WebTransport... which uses HTTP/3. |
| 257 | +There's literally no reason why it should use different certificate validation logic. |
| 258 | +Just call the same function! |
| 259 | + |
| 260 | +I classify this as a bug because it rules out tools like `mkcert`. |
| 261 | +Local development and private networks need to use another approach. |
| 262 | + |
| 263 | + |
| 264 | +## serverCertificateHashes |
| 265 | +There was this awkward "Certificate Hashes" section earlier talking about WebRTC. |
| 266 | +That's because WebTransport supports [providing a list of certificate hashes](https://developer.mozilla.org/en-US/docs/Web/API/WebTransport/WebTransport#servercertificatehashes) for a similar approach. |
| 267 | + |
| 268 | +Unfortunately, there are some strings attached. |
| 269 | +The certificates MUST be valid for less than 14 days and MUST use ECDSA. |
| 270 | +Apparently 2 weeks is the sweet spot between "secure" and "annoying as fuck". |
| 271 | + |
| 272 | +These are good restrictions so you can't be lazy and ship the hash of some long-lived certificate with your application. |
| 273 | +However it means we absolutely need to figure out how to rotate certificates because 14 days is not a lot of time. |
| 274 | +Additionally, we need a secure mechanism to transmit our certificate hashes, otherwise we're the mayor of MITM town. |
| 275 | + |
| 276 | +## Private Networks |
| 277 | +So what's the best approach if you want to use WebTransport on localhost or private networks? |
| 278 | +Unfortunately, I think `serverCertificateHashes` is the best (right now) as it doesn't require users to configure their browser and disable TLS... |
| 279 | + |
| 280 | +`moq-relay` listens on TCP and UDP (:443 by default). |
| 281 | +- The server [generates a TLS certificate](https://github.com/kixelated/moq/blob/becf50263488ded6bb26c4cbc3d1ffd14ab11f5b/rs/moq-native/src/server.rs#L241) on startup. |
| 283 | +- The client [fetches the certificate hash](https://github.com/kixelated/moq/blob/becf50263488ded6bb26c4cbc3d1ffd14ab11f5b/js/moq/src/lite/connection.ts#L225) via an HTTP [/certificate.sha256](https://github.com/kixelated/moq/blob/becf50263488ded6bb26c4cbc3d1ffd14ab11f5b/rs/moq-relay/src/web.rs#L44) endpoint. |
| 283 | +- The client then [connects to the WebTransport server](https://github.com/kixelated/moq/blob/becf50263488ded6bb26c4cbc3d1ffd14ab11f5b/js/moq/src/lite/connection.ts#L231) using the provided hash. |
| 284 | + |
| 285 | +When connecting to `localhost`, the certificate fetch can use good old HTTP. |
| 286 | +But if you want to use WebTransport to connect to any other private network, oof. |
| 287 | +The web server will need to use HTTPS to serve the certificate hash. |
| 288 | + |
| 289 | +What this means is that you're establishing a TLS connection just to establish another TLS connection. |
| 290 | +In fact, you could use an **identical** certificate for both the HTTPS and WebTransport connections. |
| 291 | +But now you have to deal with 14 day certificate rotations, all because Chrome doesn't support custom root CAs. |
| 292 | + |
| 293 | +It's not a major problem once you figure it out. |
| 294 | +It's just frustrating. |
| 295 | + |
| 296 | + |
| 297 | +# Finished |
| 298 | +TLS is not too bad in production once you realize it's all about proving that you own a domain. |
| 299 | +There's a lot of existing tooling and resources out there. |
| 300 | + |
| 301 | +But it's a pain in the butt for non-public servers, as the whole "chain of trust" thing doesn't work any longer. |
| 302 | +WebTransport makes life even more difficult. |
| 303 | +Please Mr Google, add support for custom root CAs already, it should be like a single line of code to reuse the same CAs as HTTP. |
| 304 | + |
| 305 | +Want to commiserate about TLS pain? |
| 306 | +Join the [MoQ Discord](https://discord.gg/FCYF3p99mr) or even the [rustls Discord](https://discord.gg/MCSB76RU96). |
| 307 | + |
| 308 | +Written by [@kixelated](https://github.com/kixelated). |
| 309 | + |