Zoza AI Agent runs the Signal Protocol (X3DH + Double Ratchet) between the user's device and the inference endpoint. The database holds ciphertext only. Nothing is logged. Nothing is retained. Nothing is trainable.
That sentence is load-bearing. Healthcare triage, legal intake, therapy notes, financial advice, incident response — every prompt hops CDN → load balancer → API gateway → inference server, with logging, caching, and retention at each step. In March 2023 OpenAI disclosed a bug that exposed snippets of other users' chat titles and payment data. The problem isn't malice; it's that plaintext exists in too many hands.
Two modes. Vault Mode is a one-shot encrypted prompt (simpler, no session). Agent Mode establishes a full Double Ratchet session — forward secrecy on every turn, break-in recovery after partial compromise.
┌──────────────┐ ciphertext only ┌──────────────────┐
│ User device │ ──────────────────────────▶ │ Zoza AI Relay │
│ │ │ (zoza-products) │
│ X3DH + DR │ ◀────────────────────────── │ ▸ DB: ciphertext │
└──────────────┘ │ ▸ no retention │
└──────────────────┘
│
│ forwards blob
▼
┌──────────────────┐
│ Inference pod │
│ ▸ decrypts in │
│ RAM │
│ ▸ runs inference │
│ ▸ re-encrypts │
│ ▸ discards │
│ │
│ → TEE-sealed Q3 │
│ (Nitro/SEV-SNP)│
└──────────────────┘
Agent Mode (Double Ratchet) Vault Mode (one-shot)
─────────────────────────── ─────────────────────
• user.X3DH(model.id, model.spk) • user generates ephemeral key
• derive session root • ECDH(ephemeral, model.id) → K
• ratchet key per message • AES-GCM(K, prompt)
• forward-secrecy on turn N • model ephemeral key in reply
• break-in recovery on turn N+1 • no session state on relay
Today (v0.1): the model's Curve25519 identity keypair is generated in the inference pod's process memory. It never touches disk and is regenerated on pod restart, but a Zoza operator with shell access to the running process could dump memory.
Q3 target: keypair generated inside an AWS Nitro Enclave or AMD SEV-SNP, sealed to the TEE measurement. A different build can't unseal; a Zoza operator has no shell into the enclave. This closes the process-memory window.
The relay database holds only EncryptedMessage rows — ciphertext + ratchet headers + session IDs + timestamps. Full DB dump reveals no plaintext. Backups, snapshots, and log lines are the same.
The one remaining exposure is the live inference pod's process memory during the ~200 ms window while a response is being generated. TEE sealing (Q3) closes it.
Honest assessment — every row is based on the provider's published privacy policy as of 2026-04. "Encrypted in transit" is table stakes; the question is who holds the plaintext after decryption.
| Provider | Sees plaintext? | Retention | Training on prompts | E2E option |
|---|---|---|---|---|
| OpenAI (API) | Yes | 30 days | Opted out by default (Enterprise) | No |
| OpenAI (ChatGPT) | Yes | Indefinite (unless delete) | Yes (opt-out in settings) | No |
| Anthropic | Yes | 30 days API, 2 years safety-flagged | No by default | No |
| Google Gemini | Yes | 18 months default | Yes (opt-out limited) | No |
| Local LLM (Ollama) | User only | User-controlled | None | N/A (local) |
| OpaquePrompts (research) | Partial | N/A | N/A | PII-redaction only |
| Zoza AI Agent | Pod RAM today; TEE-only at Q3 | Zero plaintext stored | Impossible at protocol layer | Default |
Caveat: this table is about protocol design, not trust. Zoza's claim is only as strong as (a) TEE attestation is intact and (b) the sealed-key build is reproducible. See the "What's NOT yet built" section for the concrete gaps.
Each of these is a documented class, with a dollar figure or case number where public. Zoza AI Agent closes the protocol-level class; application-layer bugs (prompt injection, output filtering) are orthogonal.
OpenAI ChatGPT bug, March 2023. Redis client bug exposed snippets of other users' chat titles and payment metadata for ~9 hours. Ref: OpenAI "March 20 ChatGPT outage" postmortem. Zoza AI Agent mitigation: relay only stores ciphertext keyed to per-session ratchet state; a misrouted response still decrypts only under the correct session key.
Carlini et al., 2021 & 2023. Adversarial prompts caused GPT-2 / GPT-3 to emit verbatim training data including phone numbers, addresses, and code. Any prompt that enters the training pipeline can resurface to a later user. Mitigation: prompts that never exist as plaintext outside the TEE cannot enter a training set.
Apple iCloud vs. law-enforcement precedent. Providers get hundreds of thousands of requests per year. A provider that holds plaintext must respond. A provider that holds only ciphertext cannot respond meaningfully. Mitigation: Zoza warrant canary at /about/ai-canary.html (once published); government requests we cannot serve return "relay blob only — plaintext lives on user device / in TEE."
Meta & Facebook "tasks" incident (2023, reported by 404 Media). Internal tool gave ~600 engineers access to user messages without audit. Analogous read-everything access is the default at every major AI provider. Mitigation: in Zoza Agent Mode the provider's insider sees ciphertext on the wire; only the TEE's sealed key decrypts, and the TEE emits an attestation signature that the user verifies before sending. Rogue binaries fail attestation.
To be explicit: E2E encryption does not prevent the user (or an upstream system) from sending malicious prompts. Jailbreaks, indirect prompt injection from web content, and tool-call abuse live above Zoza. We solve confidentiality in transit and at rest; alignment and safety filters still run inside the TEE after decryption.
If malware steals the user's ratchet state, past messages encrypted with retired keys are safe (forward secrecy), and future messages auto-heal after the next DH ratchet step (break-in recovery). Compromise window is bounded to the messages between theft and the next ratchet — in Agent Mode, one turn.
SGX, Nitro, and SEV have known side-channel and rollback attacks (Foreshadow, CrossTalk, AEPIC). We do not claim TEEs are infallible. We claim (a) TEE raises the attacker cost from "read the log file" to "chain a hardware side-channel", and (b) when a TEE class is broken, we rotate to a newer attested image and publicly disclose via the canary.
E2E protects against technical adversaries, not against a person with physical access to the user's unlocked device. We document this limitation in the threat model. Panic-wipe / duress-PIN is on the roadmap — see "What's NOT yet built."
Every "novel protocol" works in tests. Here's how Zoza AI Agent behaves in the seven conditions that actually matter.
Ratchet messages are order-independent within a DH epoch. A dropped message just skips that chain key — no session teardown, no re-handshake. Tested against 30% packet loss in the messenger's mobile client for 18 months.
X3DH + Double Ratchet costs ~3 ms per message on ARMv8. AES-GCM is hardware-accelerated on every phone shipped since 2016. Total crypto overhead is negligible compared to the inference RTT.
The relay operator sees: ciphertext, ratchet headers, session IDs, timestamps. It cannot see plaintext, keys, or the user's identity key. It cannot forge a message without the model's DH key. It cannot downgrade the TEE, because the attestation verifier runs client-side.
Crypto cannot defeat a rubber-hose attacker. We document this in the threat model and ship duress-PIN + panic-wipe in the client (planned; see gap section). Message-level metadata (who talked to what model, when) is visible at the relay — traffic analysis is a remaining threat.
Model's identity key rotates on a schedule (weekly, configurable). Old-key ciphertext remains undecryptable by the new binding but readable by anyone with the leaked key — which is why we publish a compromised-key notice in the canary within 72h of detection.
Relay is a BAA-eligible pass-through processor; it never holds PHI or PII in plaintext. Right-to-delete is trivial: users revoke their device key and no historical ciphertext is decryptable. GDPR data portability is an export of the session history, decrypted on the user's device.
Signal Protocol assumes both endpoints are human phones. Zoza AI Agent's design target is to bind one endpoint to a reproducible TEE image: the ratchet partner becomes "the build whose SHA-384 measurement is in the attestation," not "some server that claims to be ModelX". That's the shift. This is the shipping-Q3 architecture — not what's running today. Today's relay binds to a Zoza-operated process; TEE sealing is the gap the next six months close.
The client verifies the TEE attestation before X3DH. The SPK is signed by the attestation quote, not by a Zoza-held key. A Zoza root key compromise does not let us MITM sessions — the attestation binding is to the hardware root-of-trust (AMD SEV-SNP IDBLOCK, Intel PCS, AWS Nitro).
The model binary is built reproducibly from a pinned Dockerfile + Nix flake; the build hash is the sealing measurement. Anyone can reproduce the build offline and confirm the TEE is running the advertised code. Planned: upload build-provenance attestations to Sigstore.
If we stopped here and charged enterprise prices, you'd be right to be skeptical. Shipping v1 means: the protocol is real, the Go backend is tested (21 tests passing), the apply-for-access flow works. The list below is what separates pilot-ready from production-ready; each item has a concrete timeline.
Read carefully: the protocol math is correct, tested, and matches Signal's published design. Tamarin's secrecy_SK lemma is verified by a real prover run (see products/zoza-ai/tamarin/PROOF_RUN_v0.2.txt). What's operationally shipped today is the protocol + zero-retention relay — meaningful vs. OpenAI / Anthropic's 30-day retention, but not the full "cryptographically impossible to read" story. The TEE-sealed inference that closes the process-memory window is the Q3 2026 target, not today's state. For pilots who want a strictly better posture than current LLM providers, Zoza is ready today. For pilots that require the TEE guarantee before onboarding, wait for Q3.
Shield has these four; AI Agent inherits the pattern. Only the ones actually implemented are linked; the rest are honestly listed as planned.
| Control | Status |
|---|---|
| Hash-chained, Ed25519-signed admin actions | planned Q3 |
| Monthly signed statement: no undisclosed requests | planned after 1st pilot |
| Ciphertext-only relay; no plaintext; 30d metadata | live |
| Immunefi scope coming; protocol bugs up to $25k | scope drafted |
Matching Signal's protocol is the floor, not the ceiling. This section is what separates "E2E-encrypted AI" (a category we define) from "E2E-encrypted messenger with an AI strapped on."
Any protocol-level or TEE-attestation bug rated Critical (CVSS ≥ 9.0) gets a public disclosure within 72 hours of the fix, regardless of customer contract. Binding, written, published at /about/ai-disclosure.html on first-customer day.
Customers can call /admin/export to receive the signed-prekey history + session-key archive for their registered models. If we shut down, you reconstruct the channel elsewhere. No vendor lock. Spec lives at zoza-ai/EXIT.md.
Measured on Intel i7-1165G7, Go 1.25, 2026-04-17: X3DH full handshake 0.52 ms; Double Ratchet encrypt 1.68 µs (152 MB/s); decrypt 1.08 µs (237 MB/s); full user↔model turn 0.50 ms. See products/zoza-ai/bench_test.go — run go test -bench=. to reproduce. Republished monthly with raw data + hardware.
Formal protocol document at products/zoza-ai/WHITEPAPER.md: handshake transcript, KDF labels, AD construction, threat model, security proof sketch. Reviewable diff via git; cryptographers welcome.
X3DH handshake Tamarin model + ProVerif of forward-secrecy properties, scheduled post-audit. Matches the bar set by Signal's academic papers (Cohn-Gordon et al. 2017).
Model sidecar built with Nix flake + pinned Docker base. Build hash == TEE measurement == Sigstore provenance. Anyone can reproduce offline and verify the SHA-384 quote matches the published build.
Requests received (lawful access + civil), complied, rejected, bounties paid, canary dates. Raw JSON published, signed by the Zoza transparency key. First report: Q3 2026, after first paying pilot.
OpenAI publishes none of these. Anthropic publishes partial (safety testing, RSP). No AI provider publishes all seven. Each is a concrete commitment — no "best effort," no "on roadmap," no "reach out to sales."
If you're building a healthcare, legal, therapy, or financial AI surface and you need "we cannot read your prompts" to be technically true, the developer apply form is below. If you think we got the threat model wrong, the bug-bounty scope doc is the place to tell us — we'll read every word.