mTLS for AI Proxies: Building Zero-Trust at the Process Layer

A governance proxy that mediates every outbound AI request becomes two things at once: a policy enforcement point, and a concentration of plaintext prompts. The second property makes the first property dangerous. If the proxy's authentication is weak, the proxy is the easiest way to exfiltrate everything it was installed to protect.

This post covers the cryptographic design we use at Themisto Labs to prevent that. It is the part of the architecture we think the category has under-engineered, and it is the part that breaks first under realistic threat models.

Two adversaries

Any on-device AI governance proxy has to defeat two distinct adversaries at the same time.

The evasive user wants to send AI traffic without the governance layer seeing it. Sometimes this is malicious. More often it is a developer who finds the proxy slow, chatty, or in their way, and routes around it to ship a feature.

The impersonator wants to replace the real governance layer with a look-alike. The goal can be bypassing policy, forging audit entries, or reading prompts in plaintext. The attacker can be local malware, a compromised dependency, or a network adversary with TLS-interception capability.

A bearer-token-over-TLS design, the default for most API gateways, is weak against both. Tokens leak. Tokens get replayed. Proxies have no way to prove to clients that they are the real one. The fix is mutual authentication anchored in identity material that cannot be copied off the device.

Why mutual TLS

mTLS is decades old. What matters in this context is where identity is bound, not the protocol itself.

In normal TLS, the server presents a certificate and the client trusts it. In mTLS, the client also presents a certificate, and the server decides whether to trust that one. Both halves of the conversation are cryptographically authenticated.
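A minimal sketch of that difference, using Python's standard `ssl` module. The function names and the file-based key loading are assumptions for illustration only; with a device-bound key the private key would stay in the secure element and not in a file. The TLS 1.3 floor matches the handshake described later in this post.

```python
import ssl


def make_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Server side: present our certificate and require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # Server contexts default to CERT_NONE. Requiring a client certificate
    # is the one setting that turns ordinary TLS into mutual TLS.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # Only client certificates chaining to this CA are accepted.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx


def make_client_context(cert_file=None, key_file=None, server_ca_file=None):
    """Client side: verify the server and present our own certificate."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if server_ca_file:
        ctx.load_verify_locations(cafile=server_ca_file)
    return ctx
```

With `verify_mode` left at its server default, the same listener would accept any client, which is the bearer-token situation with extra steps.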

For a governance proxy, three properties matter:

Clients can prove they are authorized agents on enrolled devices, not "anything that knows the token."
The proxy can prove it is the real policy engine, with no silent man-in-the-middle by a local look-alike.
There is no shared secret in the pipeline that can be lifted and reused.

Certificates can be revoked, rotated on short cycles, and scoped tightly. Tokens, in practice, get none of these right.

Binding to the device, not the user

A subtle but important design choice: first-generation AI proxies tend to authenticate users, not devices. This is the wrong anchor. A phished user credential should not allow an attacker to send prompts through the governance proxy from an unenrolled machine.

We bind to the device through a platform attestation chain:

macOS. The Secure Enclave generates a device key at enrollment, signed by Apple's hardware attestation and countersigned by our enrollment CA.
Windows. The TPM produces an attestation identity key (AIK) bound to the physical chip, wrapped by our enrollment certificate.
Linux. TPM 2.0 where available, with a hardware-backed keystore fallback on systems without one.

The client certificate presented during the mTLS handshake is derived from these device-bound keys. A working certificate cannot be moved off the device because the private key cannot leave the secure element. This is what zero-trust means in practice: the authorization question is not "is this a user we recognize," it is "is this the agent we installed, on the device we enrolled, running under the process identity we expect." All three have to match or the handshake fails closed.
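The three-way match can be sketched as a single fail-closed check. The `Identity` type and its field names are hypothetical, chosen only to mirror the three questions above; a real process identity would carry more than one hash.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identity:
    agent_hash: str    # hypothetical: hash of the installed agent binary
    device_id: str     # hypothetical: ID assigned to the device at enrollment
    process_hash: str  # hypothetical: hash of the calling process binary


def authorize(presented: Optional[Identity], expected: Identity) -> bool:
    """Return True only if agent, device, and process all match."""
    if presented is None:
        return False  # missing identity fails closed
    checks = (
        hmac.compare_digest(presented.agent_hash, expected.agent_hash),
        hmac.compare_digest(presented.device_id, expected.device_id),
        hmac.compare_digest(presented.process_hash, expected.process_hash),
    )
    # Every check is evaluated; one mismatch rejects the request.
    return all(checks)
```

There is no partial-credit path: a recognized user on an unknown device produces the same result as an unknown user.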

The handshake

Here is what happens on every outbound AI request under this design:

1. The local agent intercepts the call before it reaches the upstream host. A loopback listener owned by the agent receives the request.
2. The agent resolves the socket back to a PID using OS-native APIs (the macOS Endpoint Security framework, Windows ETW, Linux audit and eBPF) and produces a process identity: binary hash, parent chain, executing user.
3. The agent selects a client certificate from the secure element. The subject encodes device ID, agent version, and tenant.
4. TLS 1.3 handshake to the policy engine. ECDHE for forward secrecy. No downgrade path.
5. The policy engine validates the client certificate against the tenant's root and checks revocation against a short-TTL CRL cache.
6. The policy engine presents its own certificate, pinned on the client side. If the fingerprint does not match what the agent was installed with, the connection drops before any payload is sent.
7. Inside the encrypted channel, the agent sends a signed metadata envelope wrapping the payload: process identity, SSO assertion, a nonce. Policy operates on this envelope.
8. The policy engine evaluates, classifies, sanitizes, forwards to the upstream model, and logs the full transaction with cryptographic lineage.
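The pin check and the metadata envelope from the steps above can be sketched with the standard library. This is an illustration under stated assumptions, not the production code: the function names, the envelope field names, and the choice to carry a payload hash in the envelope are all hypothetical, and `sign` stands in for a signing operation backed by the device-bound key.

```python
import hashlib
import hmac
import json
import secrets
from typing import Callable


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def pin_matches(der_cert: bytes, pinned_fingerprint: str) -> bool:
    """Compare the presented certificate against the installed pin."""
    # On a live connection the DER bytes would come from
    # ssl.SSLSocket.getpeercert(binary_form=True).
    return hmac.compare_digest(fingerprint(der_cert), pinned_fingerprint.lower())


def build_envelope(process_identity: dict, sso_assertion: str, payload: bytes,
                   sign: Callable[[bytes], bytes]) -> dict:
    """Wrap request metadata with a fresh nonce and a signature over it."""
    body = {
        "process": process_identity,
        "sso": sso_assertion,
        "nonce": secrets.token_hex(16),  # 16 random bytes, hex-encoded
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
    }
    # Canonical serialization so signer and verifier hash identical bytes.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return {"body": body, "signature": sign(canonical).hex()}
```

The caller is expected to run `pin_matches` first and drop the connection on `False`, so that `build_envelope` is never reached for an unpinned peer.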

Everything through step seven takes under 30 ms in steady state on modern hardware. The dominant cost in the round trip is policy evaluation, not crypto. This matters: developers abandon any tool that adds perceptible latency to the inner loop, and an abandoned governance layer is worse than no governance layer because it creates the illusion of one.

Rotation and revocation

The long-term failure mode of an mTLS system is operational, not cryptographic. When rotation is painful, operators push certificate lifetimes to a year. A year is long enough for any single mistake to compound into a real compromise window.

The design targets:

Agent certificates rotate every 24 hours, automatically, without user visibility. The enrollment key is long-lived and secure-element-bound; the certificates issued against it are short-lived.
Policy engine certificates rotate every 30 days, with a 48-hour overlap window so in-flight connections never break.
CRLs have a 5-minute TTL, so a revoked device loses access fast enough to matter during an active incident.
Enrollment can be revoked remotely via the admin plane. Revocation cascades to every outstanding certificate issued under it.

The net effect: a stolen laptop stops being usable within five minutes of the revocation being issued. No long-lived secret sits in the pipeline that a successful compromise would let an attacker keep using.
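The five-minute bound comes from how long a cached revocation list is allowed to live. A minimal sketch of such a cache follows; the `CrlCache` class, its `fetch` callback, and the use of certificate serials as strings are assumptions for illustration.

```python
import time
from typing import Callable, Optional, Set


class CrlCache:
    """Caches a set of revoked serials and refetches once the TTL expires."""

    def __init__(self, fetch: Callable[[], Set[str]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # hypothetical: downloads the current CRL
        self._ttl = ttl_seconds    # 300 s matches the 5-minute target
        self._clock = clock
        self._revoked: Set[str] = set()
        self._fetched_at: Optional[float] = None

    def is_revoked(self, serial: str) -> bool:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            # If fetch raises, the exception propagates. Callers should
            # treat that as a denial, not fall back to a stale list.
            self._revoked = set(self._fetch())
            self._fetched_at = now
        return serial in self._revoked
```

The worst case is a revocation issued just after a fetch: the device keeps access until the cached list ages out, which is why the TTL, not the CRL publication interval, sets the exposure window.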

What this design buys you

In concrete terms, after all of the above:

An attacker with a phished user credential cannot send prompts through the proxy from an unenrolled machine.
A local attacker cannot spin up a fake proxy and read traffic, because the client pin check happens before any payload is transmitted.
A malicious or compromised binary on an enrolled device can try to send traffic, but its process identity is captured honestly in the audit log, and policy can drop the request before it leaves.
Audit log entries are tamper-evident because each is signed by a key that never leaves the policy engine.
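To make "tamper-evident" concrete, here is a sketch that swaps in a plain SHA-256 hash chain, a common way to get the same property. It is not the per-entry signature scheme described above, which depends on the policy engine's private key and is not shown. All names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode()).hexdigest()


def append_entry(log: list, record: dict) -> None:
    """Append a record, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": _digest(prev_hash, record)})


def verify_log(log: list) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain alone only detects edits by someone who cannot recompute the hashes. Signing each entry, or the chain head, with a key held only by the policy engine is what closes that gap.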

None of this requires the user to know mTLS exists. They install an agent, SSO once, and their workflows continue. The cryptography runs underneath.

What this design does not solve

A fair criticism: the enrollment step is the weakest link. If the initial attestation chain is broken, for example by a vendor shipping compromised firmware, a device can appear legitimate when it is not. We mitigate by pinning to specific hardware manufacturers we have verified, but this is not zero-risk and we have not claimed otherwise.

The other caveat is agentless operation. Some security teams want governance with no device presence. The honest answer is that you cannot do this well. You can do it poorly, by terminating TLS at an egress gateway and trusting the network, but that is a different product with materially worse guarantees. Any vendor offering full-policy AI governance with zero device agents is either redefining the words or hand-waving the gap. The test case is a local Ollama process. Network-only designs miss it.

More posts in this series will walk through process attribution and the policy engine itself. If specific parts of the handshake are worth going deeper on, say which ones and we will prioritize that writeup.