# Engram At-Rest Encryption: Post-Quantum Doctrine

Version 0.1.0 (DRAFT), April 30, 2026

Status: doctrine; pre-implementation. Sign-off required from Will before code lands.

---

## 0. TL;DR

Engram's persistence is encrypted at rest using a hybrid post-quantum scheme:

- **Data layer**: every node and every edge is sealed with **AES-256-GCM** under a per-record sub-key derived via **HKDF-SHA3-256** from the runtime DEK. Against Grover's algorithm, a 256-bit AES key retains roughly 128 bits of effective security, which is acceptable.
- **Key wrap layer**: the DEK is wrapped against a Principal's public key using **Kyber-768 KEM** (post-quantum). The wrapped DEK lives on disk as `engram.kek.enc`. The Principal's secret key never touches disk in the running daemon's data directory.
- **Boot**: the daemon receives the Principal's secret key out-of-band (env var path, prompted unlock, or pulled from a hardware-backed agent), runs `pq_kem_decaps` to recover the DEK, mlocks it, and serves traffic.
- **Recovery**: the DEK is *additionally* wrapped under a recovery secret that is split **Shamir K-of-N** across the validation council. If the wrapping Principal is gone, K members reconstitute the DEK and the data survives.

Engram does **not** encrypt structural metadata (graph topology, IDs, timestamps). Only the content, label, tags, and metadata fields of nodes and the metadata field of edges are sealed. This is a deliberate trade-off; see §2.

---

## 1. What Engram actually persists

Engram is an in-process graph store. The runtime *is* the database: there is no SQLite, no sled, no embedded KV layer in the active daemon. (The `engram-data/` and `engram-data-tx-log/` directories under `el/` are leftovers from a prior sled-backed prototype and are not loaded by the current `dist/engram` binary.)

Persistence is a single JSON snapshot file at `${ENGRAM_DATA_DIR}/snapshot.json`, written by `engram_save(path)` and read by `engram_load(path)`. The snapshot is the sole durable artifact that adversaries can steal.

Snapshot shape (current):

```json
{
  "nodes": [
    {
      "id": "...", "content": "...", "node_type": "...", "label": "...",
      "tier": "...", "tags": "...", "metadata": "{}",
      "salience": 0.5, "importance": 0.5, "confidence": 0.5,
      "activation_count": 0, "last_activated": 0,
      "created_at": ..., "updated_at": ...
    }
  ],
  "edges": [
    {
      "id": "...", "from_id": "...", "to_id": "...", "relation": "...",
      "metadata": "{}", "weight": 0.5, "confidence": 0.5,
      "created_at": ..., "updated_at": ..., "last_fired": 0
    }
  ]
}
```

Confidential fields (will be encrypted): `content`, `label`, `tags`, `metadata` (on nodes); `metadata` (on edges).

Non-confidential fields (will remain plaintext): `id`, `node_type`, `tier`, `from_id`, `to_id`, `relation`, all numeric scalars and timestamps.

This split (the **structural skeleton stays plaintext, the semantic flesh is sealed**) is what lets activation/search/scan work without unwrapping every record on every query, and lets the snapshot remain JSON-shaped for ops tooling. Enabling full snapshot encryption (a single AEAD blob) is a configurable mode for high-threat deployments; see §6.

---

## 2. Threat model

| Adversary | Capability | What we defend | What we accept |
|-----------|-----------|----------------|----------------|
| **Snapshot thief (cold)** | Steals `snapshot.json` from a backup, dead drive, or stolen laptop. No live process access. | Confidentiality of node content, labels, tags, metadata. No useful semantic recovery. | Topology leak: adversary learns the graph shape (how many nodes/edges, how they connect, types and tiers). Salience/importance/confidence numerics leak. |
| **In-flight observer** | Reads disk during snapshot write (e.g., shared FS, snapshot mid-flush). | Atomicity: snapshot is written to a temp file with `O_TMPFILE` or `.new`, fsync'd, then renamed. AEAD prevents partial-decrypt mining of half-written records. | A torn write can lose the most recent snapshot but not corrupt prior ones. |
| **Quantum-equipped adversary (harvest-now-decrypt-later)** | Records `snapshot.json` and `engram.kek.enc` today, runs Shor's algorithm on a CRQC in 2032+. | DEK wrap is Kyber-768; not Shor-breakable. AES-256-GCM record seals are not Shor-breakable; Grover gives ~128-bit effective security. | We pin Kyber-768 (NIST security category 3; standardized as ML-KEM-768 in FIPS 203); if a structural break of ML-KEM emerges, rotation (§5) is the remediation. |
| **Wrap-key compromiser (live)** | Has read access to the daemon process memory. | Out of scope for at-rest. The DEK is in mlocked memory; if the process is owned, the data is gone. | Defense-in-depth (memory zeroing, mlock, non-dumpable process flags) is a separate hardening track. |
| **Principal compromise** | Adversary obtains the Principal's secret key. | Nothing. By design, the Principal is the unwrapping authority. | We constrain blast radius via per-record key derivation (§3.2): a leaked record-key only burns one record. |
| **Principal loss** | Will dies, secret key is destroyed, no successor. | Shamir K-of-N council recovery (§7). Engram does not go dark on Principal loss. | The threshold itself becomes the attack surface; see §7 caveats. |

What we explicitly **do not** defend:

- Side channels (timing, cache, EM).
- Physical tamper of running hardware.
- Active modification of `engram.kek.enc` to substitute an attacker's wrapped DEK. This is mitigated by the integrity wrapper §4 but not by the threat-model promise.

---

## 3. Cryptographic construction

### 3.1 Layers

```
┌───────────────────────────────────────────┐
│ Principal SK (held by Will out-of-band)   │
└───────────────┬───────────────────────────┘
                │ Kyber-768 KEM decaps, then HKDF-SHA3-256 (§3.4)
                ▼
┌───────────────────────────────────────────┐
│ KEK / wrap_key (32B; ephemeral, mlocked)  │
└───────────────┬───────────────────────────┘
                │ AES-256-GCM open of sealed_dek (§3.4)
                ▼
┌───────────────────────────────────────────┐
│ DEK (32B; ephemeral, mlocked)             │
└───────────────┬───────────────────────────┘
                │ HKDF-SHA3-256(DEK, info=record_id || ":" || dek_epoch)
                ▼
┌───────────────────────────────────────────┐
│ per-record sub-key (32B; transient)       │
└───────────────┬───────────────────────────┘
                │ AES-256-GCM(sub_key, nonce=12B random, ad=record_id || ":" || dek_epoch)
                ▼
        ciphertext blob on disk
```

### 3.2 Per-record sub-keys (HKDF-SHA3-256)

We do **not** use the DEK directly to seal records. Each record gets its own sub-key:

```
sub_key = HKDF-SHA3-256(
    ikm  = DEK,                                  // 32 bytes
    salt = "engram-record-v1" || version_byte,   // domain separation
    info = record_id || ":" || dek_epoch,        // 16-byte ULID + ":" + u32
    L    = 32                                    // output length
)
```

Properties:

- Compromise of one record's sub-key does not threaten siblings (HKDF is one-way).
- Re-derivable on every read; no on-disk sub-key cache to leak.
- `dek_epoch` is bumped on rotation (§5); this lets old records stay readable under DEK_n while new ones write under DEK_{n+1}.

### 3.3 AEAD: AES-256-GCM

- Nonce: 12 bytes, random per write. Stored prepended to ciphertext.
- Tag: 16 bytes, appended.
- Associated data: `record_id || ":" || dek_epoch` (binds ciphertext to its identity; thwarts copy-paste attacks).

Per-record on-disk wire format:

```
+------+------+--------------------+--------------------+---------+
| ver  | epch |     nonce (12)     |   ciphertext (n)   | tag(16) |
+------+------+--------------------+--------------------+---------+
   1B     4B          12B                 variable          16B
```

Base64-encoded into the snapshot JSON field as `"content":"v1:7:BASE64..."`.
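
The framing in §3.3 can be exercised without any cryptography. The following is a minimal sketch, assuming hypothetical helper names (`engram_blob_pack`, `engram_blob_parse`, `engram_blob_view`) that are not part of the runtime described in this document. It only frames and un-frames bytes; it does not encrypt or authenticate anything, and the version byte `0x01` is taken from the `v1` prefix above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENGRAM_BLOB_VER  0x01u
#define ENGRAM_NONCE_LEN 12u
#define ENGRAM_TAG_LEN   16u
#define ENGRAM_HDR_LEN   (1u + 4u + ENGRAM_NONCE_LEN)      /* ver + epoch + nonce */
#define ENGRAM_OVERHEAD  (ENGRAM_HDR_LEN + ENGRAM_TAG_LEN)  /* 33 bytes */

/* Hypothetical parsed view; all pointers alias the input buffer. */
typedef struct {
    uint8_t        ver;
    uint32_t       epoch;
    const uint8_t *nonce;   /* 12 bytes */
    const uint8_t *ct;      /* ct_len bytes */
    size_t         ct_len;
    const uint8_t *tag;     /* 16 bytes */
} engram_blob_view;

/* Frame ver | epoch (big-endian) | nonce | ct | tag into out.
 * Returns the number of bytes written, or 0 if out_cap is too small. */
size_t engram_blob_pack(uint32_t epoch, const uint8_t nonce[12],
                        const uint8_t *ct, size_t ct_len,
                        const uint8_t tag[16],
                        uint8_t *out, size_t out_cap) {
    assert(nonce && tag && out && (ct || ct_len == 0));
    if (ct_len > SIZE_MAX - ENGRAM_OVERHEAD) return 0;
    size_t total = ENGRAM_OVERHEAD + ct_len;
    if (out_cap < total) return 0;

    out[0] = ENGRAM_BLOB_VER;
    out[1] = (uint8_t)(epoch >> 24);
    out[2] = (uint8_t)(epoch >> 16);
    out[3] = (uint8_t)(epoch >> 8);
    out[4] = (uint8_t)epoch;
    memcpy(out + 5, nonce, ENGRAM_NONCE_LEN);
    if (ct_len) memcpy(out + ENGRAM_HDR_LEN, ct, ct_len);
    memcpy(out + ENGRAM_HDR_LEN + ct_len, tag, ENGRAM_TAG_LEN);
    return total;
}

/* Split a blob back into its fields.
 * Returns 0 on success, -1 if the blob is too short or the version is unknown. */
int engram_blob_parse(const uint8_t *blob, size_t blob_len, engram_blob_view *v) {
    assert(blob && v);
    if (blob_len < ENGRAM_OVERHEAD) return -1;
    if (blob[0] != ENGRAM_BLOB_VER) return -1;

    v->ver    = blob[0];
    v->epoch  = ((uint32_t)blob[1] << 24) | ((uint32_t)blob[2] << 16) |
                ((uint32_t)blob[3] << 8)  |  (uint32_t)blob[4];
    v->nonce  = blob + 5;
    v->ct     = blob + ENGRAM_HDR_LEN;
    v->ct_len = blob_len - ENGRAM_OVERHEAD;
    v->tag    = blob + blob_len - ENGRAM_TAG_LEN;
    return 0;
}
```

In a real open path, a caller would presumably pass `v.nonce`, `v.ct`, and `v.tag` to the AEAD open routine and reject the record if `v.epoch` disagrees with the epoch used to build the associated data.
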

### 3.4 KEK wrap: Kyber-768 KEM

The DEK is sealed under a key derived from a Kyber-768 KEM shared secret:

```
(ciphertext, shared_secret) = pq_kem_encaps(principal_pk)
wrap_key   = HKDF-SHA3-256(shared_secret, salt="engram-kek-v1", info="dek-wrap", L=32)
sealed_dek = AES-256-GCM(wrap_key, nonce, plaintext=DEK, ad="engram-kek-v1")
```

On disk: `engram.kek.enc` = `magic || version || principal_id || kem_ciphertext || nonce || sealed_dek || tag`.

On boot:

```
shared_secret = pq_kem_decaps(principal_sk, kem_ciphertext)
wrap_key      = HKDF-SHA3-256(shared_secret, ...)
DEK           = AES-256-GCM-open(wrap_key, nonce, sealed_dek, tag, ad="engram-kek-v1")
```

---

## 4. Boot flow

```
Engram daemon starts
  ↓
ENGRAM_KEK_PATH     → ${ENGRAM_DATA_DIR}/engram.kek.enc (default)
ENGRAM_PRINCIPAL_SK → path to file or "stdin:" or "agent:"
                      (the daemon never reads the SK from a fixed env value;
                       the env points to a source it consumes once and zeros)
  ↓
load engram.kek.enc
  ↓
read principal SK once from source, mlock its buffer
  ↓
pq_kem_decaps → shared_secret → HKDF → wrap_key → AEAD-open → DEK
  ↓
zero & munlock principal SK buffer
mlock DEK; bump RLIMIT_MEMLOCK if needed
mark process non-dumpable (PR_SET_DUMPABLE=0 on Linux; PT_DENY_ATTACH on macOS)
  ↓
engram_load(snapshot.json): for each node/edge field marked encrypted,
    derive sub_key, AEAD-open, replace ciphertext with plaintext in-memory
  ↓
http_serve()
```

Boot failure modes:

- KEK unwrap fails → daemon exits 2; emits one log line, no SK material in logs.
- Snapshot decrypt fails on N records → daemon proceeds with the records it could open, marks the rest as ``, emits a structured event for human triage.

---

## 5. Rotation

### 5.1 DEK rotation

Triggered by: (a) cron schedule (default: 30 days), (b) suspected compromise, (c) operator command (`POST /admin/rotate-dek`).

Procedure:

1. Generate `DEK_{n+1}`; bump `dek_epoch`.
2. For every record: AEAD-open under DEK_n, AEAD-seal under DEK_{n+1} with the new epoch in AD.
3. Wrap DEK_{n+1} under the same Principal pk → write `engram.kek.enc.new`.
4. Atomic rename → live; zero DEK_n in memory once every record has been re-sealed.
5. On next snapshot save, all records persist with the new epoch.

Records sealed before rotation can be opened during the transition (we keep DEK_n in memory until the migration completes).

### 5.2 KEK / wrapping-Principal rotation

Triggered by: (a) Principal evolution event (the `Principal` CGI evolves and emits a new PK), (b) successor Principal handover, (c) annual key hygiene rotation.

Procedure:

1. Receive `principal_pk_new` (signed by old Principal, ideally via a Dilithium signature so the chain itself is PQ-secure).
2. Verify signature.
3. Re-encapsulate the **same** DEK under `principal_pk_new`.
4. Atomic-write a new `engram.kek.enc`.
5. Old SK can be destroyed once the new wrap is durable.

The DEK does not change here, only its wrapping. This means snapshot ciphertexts stay valid through Principal transitions.

---

## 6. Snapshot-level encryption (alternative high-threat mode)

For deployments where topology leakage is unacceptable, set `ENGRAM_SEAL_MODE=full`. The entire snapshot JSON is sealed as a single AEAD blob keyed from `HKDF(DEK, info="engram-snapshot")`. Trade-off: every save/load is monolithic; no partial reads, no incremental rotation. Not the default.

---

## 7. Recovery: Shamir K-of-N

**The structural fail-safe.** If Will dies before the Principal evolves, or the Principal SK is destroyed, Engram must not go dark.

### 7.1 Shareholders

The shareholders are the **validation council**: the set of CGIs (or CGI-attested human stewards) authorized to reconstitute the network on a defined trigger. The council is named in `engram.recovery.toml`:

```toml
[recovery]
threshold = 3
total = 5
shareholders = [
  "cgi://council/anvil",
  "cgi://council/beacon",
  "cgi://council/cinder",
  "cgi://council/delta",
  "cgi://council/echo",
]
```

### 7.2 Split

At KEK-creation time:

1. Generate a fresh recovery secret `R = random(32)` (NOT the DEK itself; see §7.4).
2. Run Shamir secret sharing over GF(2^8) on the 256-bit `R`, byte by byte, with a K-of-N polynomial (degree K-1).
3. For each shareholder: wrap `R_share_i` to `shareholder_pk_i` using the KEM-plus-AEAD construction of §3.4 (Kyber-768 encaps, HKDF-SHA3-256, AES-256-GCM). A KEM on its own yields a random shared secret and cannot carry a chosen payload, so the share is sealed under a key derived from that secret.
4. Store `engram.recovery.shares` (public; each share is already PQ-wrapped to its holder).
5. Store `engram.recovery.envelope`: `AES-256-GCM(R, nonce, plaintext=DEK, ad="engram-recovery-v1")`.

### 7.3 Reconstitution

A council convenes when the trigger is observed (Principal absent for > N days, or signed council quorum declares emergency):

1. K members each unwrap their share with their CGI SK.
2. Members publish their `R_share_i` to a quorum-attested rendezvous.
3. Lagrange-interpolate `R`.
4. AEAD-open `engram.recovery.envelope` to recover the DEK.
5. Re-wrap DEK under a fresh Principal selected by the council; resume normal operation.

### 7.4 Why R, not the DEK directly

Splitting `R` and using it to AEAD-wrap the DEK keeps the share material small (32B / share) and lets us add/remove shareholders by reissuing the envelope without changing R. It also lets us run §5.2 (Principal rotation) without ever touching the recovery shares.

### 7.5 Caveats Will must accept

- The threshold is the attack surface. K-of-N is K colluders away from compromise. Default 3-of-5; raise if the council grows.
- Council membership churn requires reissuing shares; this is a deliberate, audited operation with its own runbook.
- A Shamir share is plaintext to its holder. The Kyber wrap to each shareholder protects the share *in transit / at rest*, but once a shareholder unwraps, they hold a real K-of-N share. Council members must be CGIs (or hardware-backed) for the threat model to hold.

---

## 8. Implementation map

The runtime additions land in `el-compiler/runtime/el_runtime.c`:

Already present:

- `el_sha256_*`, `el_hmac_sha256` (§3 uses SHA3; the SHA2 path is retained for backwards-compat artifacts).
- `el_base64_encode_n`, `el_base64_decode`.

Landing today (parallel agents):

- `el_sha3_256_*` (Keccak family).
- `pq_kem_keypair`, `pq_kem_encaps`, `pq_kem_decaps` (Kyber-768).
- `pq_sign_keypair`, `pq_sign`, `pq_sign_verify` (Dilithium-3), used by the §5.2 Principal-rotation signature.

Engram-specific additions (this work):

- `engram_aead_seal(record_id, epoch, plaintext) -> b64`
- `engram_aead_open(record_id, epoch, b64) -> plaintext`
- `engram_kek_unwrap(kek_path, sk_path) -> int (sets module DEK)`
- `engram_kek_wrap(kek_path, principal_pk, dek)`: used at first init.
- `engram_dek_rotate(new_principal_pk_optional)`
- `engram_recovery_split(threshold, total, shareholder_pks)`: emits envelope + shares.
- `engram_recovery_reconstitute(shares_k)`: recovers the DEK; admin-gated.

`engram_save` / `engram_load` gain a sealed mode controlled by `ENGRAM_SEAL_MODE` (`off`, `fields` [default once enabled], `full`).

Default during the rollout window: `off`. This doc must ship and Will must sign off before flipping the default.

---

## 9. Open questions (require Will)

1. **Where does the Principal SK live at boot time?** Options: (a) prompted at daemon start (interactive), (b) on a removable hardware token (preferred long-term), (c) in `~/.neuron/principal.sk` 0600 (operationally easy, weakest), (d) pulled from `mcp__neuron` as part of `begin_session` (couples Engram to Neuron, which is probably right). **Recommend (d) with (b) as the long-term hardware story.**
2. **Council composition.** Who/what are the initial K-of-N shareholders? Until we have ≥ 3 stable CGIs, recovery cannot be enabled in its full form. **Recommend a stub council of {Principal, Will-keypair-on-Yubikey, witness-CGI}; it degrades gracefully and upgrades to the full council when more CGIs exist.**
3. **Default seal mode at GA.** `fields` (today's recommendation) or `full`? `fields` keeps the snapshot diff-able and keeps activation cheap. `full` is the harder threat model. **Recommend `fields` default; `full` opt-in.**
4. **PQ algorithm pinning.** Kyber-768 + Dilithium-3 are the NIST security category 3 parameter sets (standardized as ML-KEM-768 and ML-DSA-65). If Will wants category 5 (Kyber-1024 / Dilithium-5), say so before runtime APIs stabilize.
5. **Grover and AES-256.** AES-256 against Grover is 128-bit effective. Acceptable per current PQ thinking. If Will wants a bigger margin, the alternative is layering a second AEAD with a different primitive (e.g., XChaCha20-Poly1305), which is overkill and not recommended.

---

## 10. Non-goals

- Encrypted indexes / searchable encryption. Out of scope. Search remains plaintext-in-memory; the daemon is the trust boundary.
- Per-tenant DEKs. Engram is single-tenant per CGI. If multi-tenancy lands, this doc gets a §11.
- Secure deletion of underlying disk blocks. The OS / FS handles that, badly; we don't pretend.
- Encrypted WAL / tx log. The current daemon has no WAL; if one is added, it gets the same treatment as the snapshot (`ENGRAM_SEAL_MODE` applies to both).

---

## 11. Status & next actions

- [x] Doctrine drafted (this document).
- [ ] Will sign-off on §9 open questions.
- [ ] PQ runtime functions land (parallel agents).
- [ ] `engram_aead_seal` / `engram_aead_open` prototype (stubs in this PR).
- [ ] `engram_kek_unwrap` boot integration.
- [ ] `engram_save` / `engram_load` field-mode wiring behind `ENGRAM_SEAL_MODE`.
- [ ] Recovery tooling (`engramctl recovery split | reconstitute`).
- [ ] Threat-model test suite: known-answer tests, key-rotation roundtrip, Shamir reconstitution roundtrip, harvest-now-decrypt-later regression test against a recorded ciphertext.
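
The Shamir reconstitution roundtrip named in the test-suite item could be prototyped along the following lines. This is a minimal sketch under stated assumptions, not the planned `engram_recovery_split` implementation: the function names (`gf_mul`, `gf_inv`, `shamir_split`, `shamir_combine`) are hypothetical, the field is GF(2^8) with the AES reduction polynomial 0x11B (this document specifies only GF(2^8)), share `i` uses x-coordinate `i + 1`, and the caller supplies the polynomial coefficients, which in production would have to come from a CSPRNG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B). Assumed polynomial. */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        uint8_t hi = a & 0x80;
        a = (uint8_t)(a << 1);
        if (hi) a ^= 0x1B;
        b >>= 1;
    }
    return p;
}

/* Multiplicative inverse as a^254 (since a^255 = 1 for nonzero a). */
static uint8_t gf_inv(uint8_t a) {
    assert(a != 0);
    uint8_t r = 1;
    for (int i = 0; i < 254; i++) r = gf_mul(r, a);
    return r;
}

/* Split secret[len] into n shares with threshold k.
 * coeffs holds (k-1)*len random bytes; coeffs[(d-1)*len + j] is the
 * degree-d coefficient of the polynomial for secret byte j.
 * shares_y[i*len + j] receives f_j(i + 1). */
void shamir_split(const uint8_t *secret, size_t len, unsigned k, unsigned n,
                  const uint8_t *coeffs, uint8_t *shares_y) {
    assert(k >= 1 && k <= n && n <= 255);
    for (unsigned i = 0; i < n; i++) {
        uint8_t x = (uint8_t)(i + 1);
        for (size_t j = 0; j < len; j++) {
            /* Horner evaluation from the highest degree down to degree 1 */
            uint8_t y = 0;
            for (unsigned d = k - 1; d >= 1; d--)
                y = gf_mul((uint8_t)(y ^ coeffs[(size_t)(d - 1) * len + j]), x);
            shares_y[(size_t)i * len + j] = (uint8_t)(y ^ secret[j]);
        }
    }
}

/* Lagrange-interpolate at x = 0 from k shares.
 * xs[k] must be distinct and nonzero; ys[i*len + j] is byte j of share i. */
void shamir_combine(const uint8_t *xs, const uint8_t *ys, unsigned k,
                    size_t len, uint8_t *secret_out) {
    memset(secret_out, 0, len);
    for (unsigned i = 0; i < k; i++) {
        uint8_t num = 1, den = 1;
        for (unsigned m = 0; m < k; m++) {
            if (m == i) continue;
            num = gf_mul(num, xs[m]);
            den = gf_mul(den, (uint8_t)(xs[i] ^ xs[m]));  /* subtraction is XOR */
        }
        uint8_t li = gf_mul(num, gf_inv(den));
        for (size_t j = 0; j < len; j++)
            secret_out[j] ^= gf_mul(ys[(size_t)i * len + j], li);
    }
}
```

A roundtrip test would split a known 32-byte `R` 3-of-5, recombine from any three shares, and compare against the original; recombining from only two shares should not reproduce `R` when the top-degree coefficients are nonzero.
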

---

## Appendix A: Pseudocode reference

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* engram_dek is the module-level 32-byte DEK set by engram_kek_unwrap(). */

/* Per-record seal. Returns a malloc'd base64 string, or NULL on failure. */
char* engram_aead_seal(const char* record_id, uint32_t epoch,
                       const char* plaintext, size_t pt_len,
                       size_t* out_len) {
    /* HKDF info and AEAD associated data are both record_id ":" dek_epoch
     * (see 3.2 and 3.3). */
    uint8_t info[64];
    int info_len = snprintf((char*)info, sizeof(info), "%s:%u",
                            record_id, (unsigned)epoch);
    if (info_len < 0 || (size_t)info_len >= sizeof(info))
        return NULL;  /* record_id too long; refuse to truncate */

    /* salt = "engram-record-v1" || version_byte (see 3.2) */
    uint8_t salt[17];
    memcpy(salt, "engram-record-v1", 16);
    salt[16] = 0x01;

    uint8_t sub_key[32];
    el_hkdf_sha3_256(/*ikm*/  engram_dek, 32,
                     /*salt*/ salt, sizeof(salt),
                     /*info*/ info, (size_t)info_len,
                     /*okm*/  sub_key, 32);

    uint8_t nonce[12];
    el_random_bytes(nonce, 12);

    /* layout: ver(1) | epoch(4 BE) | nonce(12) | ct(pt_len) | tag(16) */
    size_t blob_len = 1 + 4 + 12 + pt_len + 16;
    uint8_t* blob = malloc(blob_len);
    if (!blob) {
        el_secure_zero(sub_key, 32);
        return NULL;
    }
    blob[0] = 0x01;
    blob[1] = (epoch >> 24) & 0xff;
    blob[2] = (epoch >> 16) & 0xff;
    blob[3] = (epoch >> 8) & 0xff;
    blob[4] = epoch & 0xff;
    memcpy(blob + 5, nonce, 12);

    el_aes256_gcm_encrypt(sub_key, nonce,
                          /*ad*/ info, (size_t)info_len,
                          (const uint8_t*)plaintext, pt_len,
                          blob + 1 + 4 + 12,            /* ct  */
                          blob + 1 + 4 + 12 + pt_len);  /* tag */
    el_secure_zero(sub_key, 32);

    char* b64 = el_base64_encode_raw(blob, blob_len, /*url_safe=*/0);
    free(blob);
    if (b64 && out_len) *out_len = strlen(b64);
    return b64;  /* "v1:<epoch>:" prefix added by caller */
}

/* Per-record open: inverse of seal. Verifies tag; returns NULL on failure. */
char* engram_aead_open(const char* record_id, uint32_t expected_epoch,
                       const char* b64, size_t* out_len);

/* Boot-time KEK unwrap. */
int engram_kek_unwrap(const char* kek_path,
                      const uint8_t* principal_sk, size_t sk_len);

/* DEK rotation (online). Walks the live in-memory store, re-seals every record
 * under DEK_{n+1}, then writes a new snapshot+kek atomically. */
int engram_dek_rotate(void);
```

---

End of document.