
Engram At-Rest Encryption — Post-Quantum Doctrine

Version 0.1.0 (DRAFT), April 30, 2026

Status: doctrine; pre-implementation. Sign-off required from Will before code lands.


0. TL;DR

Engram's persistence is encrypted at rest using a hybrid post-quantum scheme:

  • Data layer: every node and every edge is sealed with AES-256-GCM under a per-record sub-key derived via HKDF-SHA3-256 from the runtime DEK. Grover's algorithm reduces a 256-bit key search to roughly 2^128 operations, so AES-256 retains about 128 bits of post-quantum security; acceptable.
  • Key wrap layer: the DEK is AEAD-sealed under a wrap key derived from a Kyber-768 KEM (post-quantum) encapsulation against the Principal's public key (§3.4). The wrapped DEK lives on disk as engram.kek.enc. The Principal's secret key never touches disk in the running daemon's data directory.
  • Boot: the daemon receives the Principal's secret key out-of-band (env var path, prompted unlock, or pulled from a hardware-backed agent), runs pq_kem_decaps to recover the DEK, mlocks it, and serves traffic.
  • Recovery: the DEK is additionally sealed under a recovery secret that is Shamir-split K-of-N across the validation council (§7). If the wrapping Principal is gone, K members reconstitute the secret, recover the DEK, and the data survives.

Engram does not encrypt structural metadata (graph topology, IDs, timestamps). Only the content, label, tags, and metadata fields of nodes and the metadata field of edges are sealed. This is a deliberate trade-off; see §1 and §2.


1. What Engram actually persists

Engram is an in-process graph store. The runtime is the database — there is no SQLite, no sled, no embedded KV layer in the active daemon. (The engram-data/ and engram-data-tx-log/ directories under el/ are leftovers from a prior sled-backed prototype and are not loaded by the current dist/engram binary.)

Persistence is a single JSON snapshot file at ${ENGRAM_DATA_DIR}/snapshot.json, written by engram_save(path) and read by engram_load(path). The snapshot is the sole durable artifact that adversaries can steal.

Snapshot shape (current):

{
  "nodes": [
    { "id": "...", "content": "...", "node_type": "...", "label": "...",
      "tier": "...", "tags": "...", "metadata": "{}",
      "salience": 0.5, "importance": 0.5, "confidence": 0.5,
      "activation_count": 0, "last_activated": 0, "created_at": ..., "updated_at": ... }
  ],
  "edges": [
    { "id": "...", "from_id": "...", "to_id": "...", "relation": "...",
      "metadata": "{}", "weight": 0.5, "confidence": 0.5,
      "created_at": ..., "updated_at": ..., "last_fired": 0 }
  ]
}

Confidential fields (will be encrypted): content, label, tags, metadata (on nodes); metadata (on edges). Non-confidential fields (will remain plaintext): id, node_type, tier, from_id, to_id, relation, all numeric scalars and timestamps.

This split — the structural skeleton stays plaintext, the semantic flesh is sealed — is what lets activation/search/scan work without unwrapping every record on every query, and lets the snapshot remain JSON-shaped for ops tooling. Enabling full snapshot encryption (a single AEAD blob) is a configurable mode for high-threat deployments; see §6.
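The field split can be expressed as a small lookup. The following is a minimal sketch, assuming a hypothetical helper named field_is_sealed and a hypothetical record_kind enum; neither is part of the runtime described here, and the field names are taken from the snapshot shape above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record discriminator; not an existing Engram type. */
typedef enum { REC_NODE, REC_EDGE } record_kind;

/* Returns 1 if `field` of a record of `kind` is sealed in fields mode, else 0. */
int field_is_sealed(record_kind kind, const char *field) {
    static const char *const node_sealed[] = { "content", "label", "tags", "metadata" };
    static const char *const edge_sealed[] = { "metadata" };
    const char *const *list = (kind == REC_NODE) ? node_sealed : edge_sealed;
    size_t n = (kind == REC_NODE) ? sizeof node_sealed / sizeof node_sealed[0]
                                  : sizeof edge_sealed / sizeof edge_sealed[0];
    assert(field != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(field, list[i]) == 0) return 1;
    return 0;   /* ids, types, tiers, scalars, timestamps stay plaintext */
}
```

A save path could call this once per field and route sealed fields through the AEAD while writing everything else verbatim.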


2. Threat model

| Adversary | Capability | What we defend | What we accept |
|---|---|---|---|
| Snapshot thief (cold) | Steals snapshot.json from a backup, dead drive, or stolen laptop. No live process access. | Confidentiality of node content, labels, tags, metadata. No useful semantic recovery. | Topology leak: adversary learns the graph shape (how many nodes/edges, how they connect, types and tiers). Salience/importance/confidence numerics leak. |
| In-flight observer | Reads disk during snapshot write (e.g., shared FS, snapshot mid-flush). | Atomicity: snapshot is written to a temp file with O_TMPFILE or <path>.new, fsync'd, then renamed. AEAD prevents partial-decrypt mining of half-written records. | A torn write can lose the most recent snapshot but not corrupt prior ones. |
| Quantum-equipped adversary (harvest-now-decrypt-later) | Records snapshot.json and engram.kek.enc today, runs Shor's algorithm on a CRQC in 2032+. | DEK wrap is Kyber-768; not Shor-breakable. AES-256-GCM record seals are not Shor-breakable; Grover gives ~128-bit effective security. | We pin Kyber-768 (NIST PQC security level 3; standardized as ML-KEM-768 in FIPS 203); if a structural break of ML-KEM emerges, rotation (§5) is the remediation. |
| Wrap-key compromiser (live) | Has read access to the daemon process memory. | Out of scope for at-rest. The DEK is in mlocked memory; if the process is owned, the data is gone. | Defense-in-depth (memory zeroing, mlock, non-dumpable process flags) is a separate hardening track. |
| Principal compromise | Adversary obtains the Principal's secret key. | Nothing; by design, the Principal is the unwrapping authority. | Per-record key derivation (§3.2) bounds the damage only when a single record key leaks: a leaked record key burns one record. With the Principal SK and engram.kek.enc, the adversary recovers the DEK and every record. |
| Principal loss | Will dies, secret key is destroyed, no successor. | Shamir K-of-N council recovery (§7). Engram does not go dark on Principal loss. | The threshold itself becomes the attack surface; see §7 caveats. |

What we explicitly do not defend:

  • Side channels (timing, cache, EM).
  • Physical tamper of running hardware.
  • Active modification of engram.kek.enc to substitute an attacker's wrapped DEK — this is mitigated by the integrity wrapper §4 but not by the threat-model promise.
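The atomic snapshot write from the in-flight observer row can be sketched with plain POSIX calls. This is a minimal illustration of the <path>.new variant, assuming a hypothetical function name atomic_write; it is not the engram_save implementation, and it leaves out the O_TMPFILE variant.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes to `<path>.new`, fsync, then rename over `path`.
 * Returns 0 on success, -1 on failure (the old file is left untouched). */
int atomic_write(const char *path, const void *buf, size_t len) {
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;    /* path too long */

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR) continue;               /* retry interrupted write */
            goto fail;
        }
        p += w;
        left -= (size_t)w;
    }
    if (fsync(fd) != 0) goto fail;                      /* data durable before rename */
    if (close(fd) != 0) { unlink(tmp); return -1; }
    if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
    return 0;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

A reader sees either the old snapshot or the new one, never a partial file. For the rename itself to survive a power loss, a production version would likely also fsync the parent directory.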

3. Cryptographic construction

3.1 Layers

                   ┌──────────────────────────────────────────┐
                   │  Principal SK (held by Will out-of-band) │
                   └───────────────┬──────────────────────────┘
                                   │ Kyber-768 KEM decaps(kem_ciphertext)
                                   ▼
            ┌──────────────────────────────────────┐
            │  shared_secret (32B; transient)      │
            └───────────────┬──────────────────────┘
                            │ HKDF-SHA3-256(salt="engram-kek-v1", info="dek-wrap")
                            ▼
            ┌──────────────────────────────────────┐
            │  wrap_key / KEK (32B; transient)     │
            └───────────────┬──────────────────────┘
                            │ AES-256-GCM open(sealed_dek, ad="engram-kek-v1")
                            ▼
            ┌──────────────────────────────────────┐
            │  DEK (32B; ephemeral, mlocked)       │
            └───────────────┬──────────────────────┘
                            │ HKDF-SHA3-256(DEK, info=record_id || ":" || dek_epoch)
                            ▼
            ┌──────────────────────────────────────┐
            │  per-record sub-key (32B; transient) │
            └───────────────┬──────────────────────┘
                            │ AES-256-GCM(sub_key, nonce=12B random, ad=record_id || ":" || dek_epoch)
                            ▼
                  ciphertext blob on disk

3.2 Per-record sub-keys (HKDF-SHA3-256)

We do not use the DEK directly to seal records. Each record gets its own sub-key:

sub_key = HKDF-SHA3-256(
    ikm    = DEK,                                  // 32 bytes
    salt   = "engram-record-v1" || version_byte,   // domain separation
    info   = record_id || ":" || dek_epoch,        // 16-byte ULID + ":" + u32
    L      = 32                                    // output length
)

Properties:

  • Compromise of one record's sub-key does not threaten siblings (HKDF is one-way).
  • Re-derivable on every read; no on-disk sub-key cache to leak.
  • dek_epoch is bumped on rotation (§5); this lets old records stay readable under DEK_n while new ones write under DEK_{n+1}.
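The epoch property in the last bullet implies the daemon can hold two DEKs at once during a rotation. One way to picture that is a two-slot keyring indexed by epoch. The sketch below is an assumption for illustration only: dek_ring and its functions are hypothetical names, and the real module may track epochs differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEK_LEN    32
#define RING_SLOTS 2   /* DEK_n and DEK_{n+1} coexist only during rotation */

typedef struct { int in_use; uint32_t epoch; uint8_t key[DEK_LEN]; } dek_slot;
typedef struct { dek_slot slots[RING_SLOTS]; uint32_t current_epoch; } dek_ring;

/* Zero through a volatile pointer so the stores are not optimized away. */
static void wipe(void *p, size_t n) {
    volatile uint8_t *v = p;
    while (n--) *v++ = 0;
}

/* Install `key` for `epoch` and make it the write epoch. 0 on success, -1 if full. */
int dek_ring_install(dek_ring *r, uint32_t epoch, const uint8_t key[DEK_LEN]) {
    assert(r != NULL && key != NULL);
    for (int i = 0; i < RING_SLOTS; i++) {
        if (!r->slots[i].in_use) {
            r->slots[i].in_use = 1;
            r->slots[i].epoch  = epoch;
            memcpy(r->slots[i].key, key, DEK_LEN);
            r->current_epoch = epoch;
            return 0;
        }
    }
    return -1;
}

/* Key for opening a record sealed at `epoch`, or NULL if that DEK is gone. */
const uint8_t *dek_ring_lookup(const dek_ring *r, uint32_t epoch) {
    for (int i = 0; i < RING_SLOTS; i++)
        if (r->slots[i].in_use && r->slots[i].epoch == epoch)
            return r->slots[i].key;
    return NULL;
}

/* Wipe the DEK for `epoch` once no record depends on it. */
void dek_ring_retire(dek_ring *r, uint32_t epoch) {
    for (int i = 0; i < RING_SLOTS; i++)
        if (r->slots[i].in_use && r->slots[i].epoch == epoch) {
            wipe(r->slots[i].key, DEK_LEN);
            r->slots[i].in_use = 0;
        }
}
```

Reads look up the key by the epoch stored in the record header; writes always use current_epoch. In the daemon the ring itself would live in mlocked memory.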

3.3 AEAD — AES-256-GCM

  • Nonce: 12 bytes, random per write. Stored prepended to ciphertext.
  • Tag: 16 bytes, appended.
  • Associated data: record_id || ":" || dek_epoch (binds ciphertext to its identity; thwarts copy-paste attacks).

Per-record on-disk wire format:

+------+------+--------------------+--------------------+--------+
| ver  | epch |     nonce (12)     |   ciphertext (n)   | tag(16)|
+------+------+--------------------+--------------------+--------+
   1B    4B          12B                  variable         16B

Base64-encoded into the snapshot JSON field as "content":"v1:7:BASE64...".
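On the read path, the decoded blob has to be split back into its fields before the AEAD can be opened. A minimal sketch of that parse follows, assuming the blob has already been Base64-decoded; record_view and record_parse are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    REC_HDR_LEN = 1 + 4 + 12,   /* ver | epoch (big-endian) | nonce */
    REC_TAG_LEN = 16
};

/* Non-owning view into a decoded record blob. */
typedef struct {
    uint8_t        version;
    uint32_t       epoch;
    const uint8_t *nonce;    /* 12 bytes */
    const uint8_t *ct;       /* ct_len bytes */
    size_t         ct_len;
    const uint8_t *tag;      /* 16 bytes */
} record_view;

/* Returns 0 on success, -1 if the blob is too short or the version is unknown. */
int record_parse(const uint8_t *blob, size_t len, record_view *out) {
    assert(blob != NULL && out != NULL);
    if (len < (size_t)REC_HDR_LEN + (size_t)REC_TAG_LEN) return -1;
    if (blob[0] != 0x01) return -1;
    out->version = blob[0];
    out->epoch   = ((uint32_t)blob[1] << 24) | ((uint32_t)blob[2] << 16) |
                   ((uint32_t)blob[3] <<  8) |  (uint32_t)blob[4];
    out->nonce   = blob + 5;
    out->ct      = blob + REC_HDR_LEN;
    out->ct_len  = len - REC_HDR_LEN - REC_TAG_LEN;
    out->tag     = blob + len - REC_TAG_LEN;
    return 0;
}
```

The parse only checks framing. Authenticity still comes from the GCM tag, and the epoch read here should match the epoch bound into the associated data.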

3.4 KEK wrap — Kyber-768 KEM

The DEK is sealed by Kyber-768 KEM:

(ciphertext, shared_secret) = pq_kem_encaps(principal_pk)
wrap_key = HKDF-SHA3-256(shared_secret, salt="engram-kek-v1", info="dek-wrap", L=32)
sealed_dek = AES-256-GCM(wrap_key, nonce, plaintext=DEK, ad="engram-kek-v1")

On disk: engram.kek.enc = magic || version || principal_id || kem_ciphertext || nonce || sealed_dek || tag.

On boot:

shared_secret = pq_kem_decaps(principal_sk, kem_ciphertext)
wrap_key      = HKDF-SHA3-256(shared_secret, ...)
DEK           = AES-256-GCM-open(wrap_key, nonce, sealed_dek, tag, ad="engram-kek-v1")

4. Boot flow

Engram daemon starts
  ↓
ENGRAM_KEK_PATH         → ${ENGRAM_DATA_DIR}/engram.kek.enc (default)
ENGRAM_PRINCIPAL_SK     → path to file or "stdin:" or "agent:<socket>"
                          (the daemon never reads the SK from a fixed env value;
                           the env points to a source it consumes once and zeros)
  ↓
load engram.kek.enc
  ↓
read principal SK once from source, mlock its buffer
  ↓
pq_kem_decaps  →  shared_secret  →  HKDF  →  wrap_key  →  AEAD-open  →  DEK
  ↓
zero & munlock principal SK buffer
mlock DEK; bump RLIMIT_MEMLOCK if needed
mark process non-dumpable (PR_SET_DUMPABLE=0 on Linux; PT_DENY_ATTACH on macOS)
  ↓
engram_load(snapshot.json) — for each node/edge field marked encrypted:
    derive sub_key, AEAD-open, replace ciphertext with plaintext in-memory
  ↓
http_serve()
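The memory-hardening steps in the flow map onto a handful of system calls. The sketch below shows them on Linux, assuming hypothetical helper names (secure_zero, key_lock, key_release, process_deny_dump); the macOS PT_DENY_ATTACH path is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

/* Zero through a volatile pointer so the compiler cannot elide the stores. */
void secure_zero(void *p, size_t n) {
    volatile uint8_t *v = p;
    while (n--) *v++ = 0;
}

/* Pin a key buffer in RAM so it is not swapped out.
 * Returns 0 on success, -1 if mlock fails (for example, RLIMIT_MEMLOCK too low). */
int key_lock(void *key, size_t n) {
    assert(key != NULL);
    return mlock(key, n) == 0 ? 0 : -1;
}

/* Wipe the key first, then unpin its pages. */
void key_release(void *key, size_t n) {
    assert(key != NULL);
    secure_zero(key, n);
    (void)munlock(key, n);
}

/* Mark the process non-dumpable. Returns 0 on success, -1 on failure or
 * on platforms this sketch does not cover. */
int process_deny_dump(void) {
#ifdef __linux__
    return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) == 0 ? 0 : -1;
#else
    return -1;
#endif
}
```

In the boot flow, the Principal SK buffer would go through key_lock, be used once for decapsulation, and then pass through key_release, while the DEK stays locked for the life of the process.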

Boot failure modes:

  • KEK unwrap fails → daemon exits 2; emits one log line, no SK material in logs.
  • Snapshot decrypt fails on N records → daemon proceeds with the records it could open, marks the rest as <corrupted>, emits a structured event for human triage.

5. Rotation

5.1 DEK rotation

Triggered by: (a) cron schedule (default: 30 days), (b) suspected compromise, (c) operator command (POST /admin/rotate-dek).

Procedure:

  1. Generate DEK_{n+1}; bump dek_epoch.
  2. For every record: AEAD-open under DEK_n, AEAD-seal under DEK_{n+1} with the new epoch in AD.
  3. Wrap DEK_{n+1} under the same Principal pk → write engram.kek.enc.new.
  4. Atomic rename → live; zero DEK_n in memory.
  5. On next snapshot save, all records persist with the new epoch.

Records sealed before rotation can be opened during the transition (we keep DEK_n in memory until the migration completes).

5.2 KEK / wrapping-Principal rotation

Triggered by: (a) Principal evolution event (the Principal CGI evolves and emits a new PK), (b) successor Principal handover, (c) annual key hygiene rotation.

Procedure:

  1. Receive principal_pk_new (signed by old Principal, ideally via a Dilithium signature so the chain itself is PQ-secure).
  2. Verify signature.
  3. Re-wrap the same DEK: run a fresh encapsulation against principal_pk_new and AEAD-seal the DEK under the derived wrap key, as in §3.4.
  4. Atomic-write a new engram.kek.enc.
  5. Old SK can be destroyed once the new wrap is durable.

The DEK does not change here — only its wrapping. This means snapshot ciphertexts stay valid through Principal transitions.


6. Snapshot-level encryption (alternative high-threat mode)

For deployments where topology leakage is unacceptable, set ENGRAM_SEAL_MODE=full. The entire snapshot JSON is sealed as a single AEAD blob keyed from HKDF(DEK, info="engram-snapshot"). Trade-off: every save/load is monolithic; no partial reads, no incremental rotation. Not the default.


7. Recovery — Shamir K-of-N

The structural fail-safe. If Will dies before the Principal evolves, or the Principal SK is destroyed, Engram must not go dark.

7.1 Shareholders

The shareholders are the validation council — the set of CGIs (or CGI-attested human stewards) authorized to reconstitute the network on a defined trigger. The council is named in engram.recovery.toml:

[recovery]
threshold = 3
total = 5
shareholders = [
  "cgi://council/anvil",
  "cgi://council/beacon",
  "cgi://council/cinder",
  "cgi://council/delta",
  "cgi://council/echo",
]
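A loader for this file would want to reject inconsistent parameters before any split runs. The check below is a sketch under stated assumptions: recovery_cfg and recovery_cfg_check are hypothetical, the minimum threshold of 2 is a policy guess rather than something this document specifies, and the cap of 255 follows from Shamir over GF(2^8) having only 255 nonzero x-coordinates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parsed view of the [recovery] table; TOML parsing itself is out of scope. */
typedef struct {
    unsigned threshold;
    unsigned total;
    const char *const *shareholders;
    size_t n_shareholders;
} recovery_cfg;

/* Returns 0 if the config is usable for a K-of-N split, else a negative code. */
int recovery_cfg_check(const recovery_cfg *c) {
    assert(c != NULL);
    if (c->threshold < 2) return -1;             /* assumed policy: 1-of-N is no threshold */
    if (c->threshold > c->total) return -2;      /* K cannot exceed N */
    if (c->total > 255) return -3;               /* GF(2^8) share index limit */
    if (c->n_shareholders != c->total) return -4;
    for (size_t i = 0; i < c->n_shareholders; i++)
        for (size_t j = i + 1; j < c->n_shareholders; j++)
            if (strcmp(c->shareholders[i], c->shareholders[j]) == 0)
                return -5;                       /* same holder listed twice */
    return 0;
}
```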

7.2 Split

At KEK-creation time:

  1. Generate a fresh recovery secret R = random(32) (NOT the DEK itself — see §7.4).
  2. Run Shamir-256 over GF(2^8) on R with K-of-N polynomial.
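  3. For each shareholder: wrap R_share_i to shareholder_pk_i with the §3.4 construction (Kyber-768 encapsulation, HKDF-derived wrap key, AES-256-GCM seal of R_share_i). A KEM encapsulation alone yields only a random shared secret and cannot carry the share.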
  4. Store engram.recovery.shares — public; each share is already PQ-wrapped to its holder.
  5. Store engram.recovery.envelope = AES-256-GCM(R, nonce, plaintext=DEK, ad="engram-recovery-v1").

7.3 Reconstitution

A council convenes when the trigger is observed (Principal absent for longer than a configured number of days, or signed council quorum declares emergency):

  1. K members each decapsulate their share with their CGI SK.
  2. Members publish their R_share_i to a quorum-attested rendezvous.
  3. Lagrange-interpolate R.
  4. AEAD-open engram.recovery.envelope to recover the DEK.
  5. Re-wrap DEK under a fresh Principal selected by the council; resume normal operation.
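The split in §7.2 step 2 and the Lagrange interpolation in step 3 above can be sketched byte-wise over GF(2^8). This is an illustration, not the engram_recovery_split implementation: the reduction polynomial 0x11b is an assumed choice (the document does not name one), share i is evaluated at x = i + 1, and the polynomial coefficients are passed in by the caller so the example stays deterministic. Real use must draw them from a CSPRNG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11b, assumed). */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
        b >>= 1;
    }
    return p;
}

/* Inverse via a^254, valid for a != 0 because a^255 == 1. */
static uint8_t gf_inv(uint8_t a) {
    uint8_t r = 1;
    for (int i = 0; i < 254; i++) r = gf_mul(r, a);
    return r;
}

/* Split `secret` (len bytes) into n shares with threshold k.
 * coeffs holds (k-1)*len random bytes: coeffs[j*len + b] multiplies x^(j+1).
 * out receives n*len bytes; share i is the polynomial evaluated at x = i+1. */
void shamir_split(const uint8_t *secret, size_t len, int k, int n,
                  const uint8_t *coeffs, uint8_t *out) {
    assert(k >= 2 && k <= n && n <= 255);
    for (int i = 0; i < n; i++) {
        uint8_t x = (uint8_t)(i + 1);
        for (size_t b = 0; b < len; b++) {
            uint8_t y = 0;
            for (int j = k - 2; j >= 0; j--)       /* Horner, highest degree first */
                y = gf_mul((uint8_t)(y ^ coeffs[(size_t)j * len + b]), x);
            out[(size_t)i * len + b] = (uint8_t)(y ^ secret[b]);
        }
    }
}

/* Recover the secret from k shares by Lagrange interpolation at x = 0.
 * xs[i] is the x-coordinate of shares[i*len .. i*len+len-1]. */
void shamir_combine(const uint8_t *xs, const uint8_t *shares, int k,
                    size_t len, uint8_t *secret_out) {
    for (size_t b = 0; b < len; b++) secret_out[b] = 0;
    for (int i = 0; i < k; i++) {
        uint8_t num = 1, den = 1;
        for (int j = 0; j < k; j++) {
            if (j == i) continue;
            num = gf_mul(num, xs[j]);                       /* (0 - x_j) == x_j */
            den = gf_mul(den, (uint8_t)(xs[j] ^ xs[i]));    /* (x_i - x_j)      */
        }
        uint8_t w = gf_mul(num, gf_inv(den));
        for (size_t b = 0; b < len; b++)
            secret_out[b] ^= gf_mul(w, shares[(size_t)i * len + b]);
    }
}
```

Any K distinct shares reconstruct R; fewer than K reveal nothing about it. The x-coordinates must be distinct and nonzero, which is why N is capped at 255 in this field.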

7.4 Why R, not the DEK directly

Splitting R and using it to AEAD-wrap the DEK keeps the share material small (32B / share) and lets us add/remove shareholders by reissuing the envelope without changing R. It also lets us run §5.2 (Principal rotation) without ever touching the recovery shares.

7.5 Caveats Will must accept

  • The threshold is the attack surface. K-of-N is K colluders away from compromise. Default 3-of-5; raise if the council grows.
  • Council membership churn requires reissuing shares; this is a deliberate, audited operation with its own runbook.
  • A Shamir share is plaintext to its holder. The Kyber wrap to each shareholder protects the share in transit / at rest, but once a shareholder unwraps, they hold a real K-of-N share. Council members must be CGIs (or hardware-backed) for the threat model to hold.

8. Implementation map

The runtime additions land in el-compiler/runtime/el_runtime.c:

Already present:

  • el_sha256_*, el_hmac_sha256 (§3 uses SHA3 — SHA2 path retained for backwards-compat artifacts).
  • el_base64_encode_n, el_base64_decode.

Landing today (parallel agents):

  • el_sha3_256_* (Keccak family).
  • pq_kem_keypair, pq_kem_encaps, pq_kem_decaps (Kyber-768).
  • pq_sign_keypair, pq_sign, pq_sign_verify (Dilithium-3) — used by §5.2 Principal-rotation signature.

Engram-specific additions (this work):

  • engram_aead_seal(record_id, epoch, plaintext) -> b64
  • engram_aead_open(record_id, epoch, b64) -> plaintext
  • engram_kek_unwrap(kek_path, sk_path) -> int (sets module DEK)
  • engram_kek_wrap(kek_path, principal_pk, dek) — used at first init.
  • engram_dek_rotate(new_principal_pk_optional)
  • engram_recovery_split(threshold, total, shareholder_pks) — emits envelope + shares.
  • engram_recovery_reconstitute(shares_k) — recover DEK; admin-gated.

engram_save / engram_load gain a sealed mode controlled by ENGRAM_SEAL_MODE (off, fields [default once enabled], full). Default during the rollout window: off. This doc must ship and Will must sign off before flipping the default.
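Reading the mode from the environment could look like the sketch below. The names seal_mode and seal_mode_parse are hypothetical; treating an unset or empty variable as off follows the rollout default stated above, and rejecting unknown values instead of falling back is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { SEAL_OFF, SEAL_FIELDS, SEAL_FULL, SEAL_INVALID } seal_mode;

/* Map an ENGRAM_SEAL_MODE value to a mode. NULL or "" means unset. */
seal_mode seal_mode_parse(const char *v) {
    if (v == NULL || *v == '\0') return SEAL_OFF;    /* rollout-window default */
    if (strcmp(v, "off") == 0)    return SEAL_OFF;
    if (strcmp(v, "fields") == 0) return SEAL_FIELDS;
    if (strcmp(v, "full") == 0)   return SEAL_FULL;
    return SEAL_INVALID;                             /* refuse to guess */
}

/* Convenience wrapper that reads the process environment. */
seal_mode seal_mode_from_env(void) {
    return seal_mode_parse(getenv("ENGRAM_SEAL_MODE"));
}
```

Failing closed on SEAL_INVALID keeps a typo such as "feilds" from silently writing a plaintext snapshot.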


9. Open questions (require Will)

  1. Where does the Principal SK live at boot time? Options: (a) prompted at daemon start (interactive), (b) on a removable hardware token (preferred long-term), (c) in ~/.neuron/principal.sk 0600 (operationally easy, weakest), (d) pulled from mcp__neuron as part of begin_session (couples Engram to Neuron — probably right). Recommend (d) with (b) as the long-term hardware story.
  2. Council composition. Who/what are the initial K-of-N shareholders? Until we have ≥ 3 stable CGIs, recovery cannot be enabled in its full form. Recommend a stub council of {Principal, Will-keypair-on-Yubikey, witness-CGI} — degrades gracefully, upgrades to full council when more CGIs exist.
  3. Default seal mode at GA. fields (today's recommendation) or full? fields keeps the snapshot diff-able and keeps activation cheap. full is the harder threat model. Recommend fields default; full opt-in.
  4. PQ algorithm pinning. Kyber-768 + Dilithium-3 are the NIST security level 3 PQC defaults (standardized as ML-KEM-768 in FIPS 203 and ML-DSA-65 in FIPS 204). If Will wants L5 (Kyber-1024 / Dilithium-5), say so before runtime APIs stabilize.
  5. Grover and AES-256. AES-256 against Grover is 128-bit effective. Acceptable per current PQ thinking. If Will wants a bigger margin, the alternative is layering a second AEAD with a different primitive (e.g., XChaCha20-Poly1305) — overkill, not recommended.
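The 128-bit figure in item 5 comes from the standard query count for Grover's search over a key space of size N = 2^k. A short worked version, ignoring the per-query cost of running AES on a quantum computer:

```latex
T_{\mathrm{Grover}} \;\approx\; \frac{\pi}{4}\sqrt{N} \;=\; \frac{\pi}{4}\cdot 2^{k/2},
\qquad
k = 256 \;\Rightarrow\; T \;\approx\; 0.785 \cdot 2^{128} \;\approx\; 2^{127.7}
```

The square-root speedup halves the effective key length, which is why a 256-bit key is the baseline for roughly 128-bit post-quantum security.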

10. Non-goals

  • Encrypted indexes / searchable encryption. Out of scope. Search remains plaintext-in-memory; the daemon is the trust boundary.
  • Per-tenant DEKs. Engram is single-tenant per CGI. If multi-tenancy lands, this doc gets a §11.
  • Secure deletion of underlying disk blocks. The OS / FS handles that, badly; we don't pretend.
  • Encrypted WAL / tx log. The current daemon has no WAL; if one is added, it gets the same treatment as the snapshot (ENGRAM_SEAL_MODE applies to both).

11. Status & next actions

  • Doctrine drafted (this document).
  • Will sign-off on §9 open questions.
  • PQ runtime functions land (parallel agents).
  • engram_aead_seal / engram_aead_open prototype (stubs in this PR).
  • engram_kek_unwrap boot integration.
  • engram_save / engram_load field-mode wiring behind ENGRAM_SEAL_MODE.
  • Recovery tooling (engramctl recovery split | reconstitute).
  • Threat-model test suite: known-answer tests, key-rotation roundtrip, Shamir reconstitution roundtrip, harvest-now-decrypt-later regression test against a recorded ciphertext.

Appendix A — Pseudocode reference

/* Per-record seal. Returns a malloc'd Base64 string, or NULL on failure. */
char* engram_aead_seal(const char* record_id, uint32_t epoch,
                       const char* plaintext, size_t pt_len, size_t* out_len)
{
    /* salt = "engram-record-v1" || version_byte (§3.2) */
    uint8_t salt[17];
    memcpy(salt, "engram-record-v1", 16);
    salt[16] = 0x01;

    /* info = record_id || ":" || dek_epoch; reused as the AEAD associated data (§3.3) */
    uint8_t info[64];
    int info_len = snprintf((char*)info, sizeof(info), "%s:%u", record_id, (unsigned)epoch);
    if (info_len < 0 || (size_t)info_len >= sizeof(info)) return NULL;  /* id too long */

    uint8_t sub_key[32];
    el_hkdf_sha3_256(/*ikm*/ engram_dek, 32,
                     /*salt*/ salt, sizeof(salt),
                     /*info*/ info, (size_t)info_len,
                     /*okm*/  sub_key, 32);

    uint8_t nonce[12];
    el_random_bytes(nonce, 12);

    /* layout: ver(1) | epoch(4 BE) | nonce(12) | ct(pt_len) | tag(16) */
    size_t blob_len = 1 + 4 + 12 + pt_len + 16;
    uint8_t* blob   = malloc(blob_len);
    if (!blob) { el_secure_zero(sub_key, 32); return NULL; }
    blob[0] = 0x01;
    blob[1] = (epoch >> 24) & 0xff; blob[2] = (epoch >> 16) & 0xff;
    blob[3] = (epoch >>  8) & 0xff; blob[4] =  epoch        & 0xff;
    memcpy(blob + 5, nonce, 12);

    el_aes256_gcm_encrypt(sub_key, nonce,
                          info, (size_t)info_len,        /* ad: record_id:epoch */
                          (const uint8_t*)plaintext, pt_len,
                          blob + 1 + 4 + 12,             /* ct */
                          blob + 1 + 4 + 12 + pt_len);   /* tag */

    el_secure_zero(sub_key, 32);  /* sub-key is transient; never cached */

    char* b64 = el_base64_encode_raw(blob, blob_len, /*url_safe=*/0);
    free(blob);
    if (b64 && out_len) *out_len = strlen(b64);
    return b64;  /* "v1:<epoch>:" prefix added by caller */
}

/* Per-record open: inverse of seal. Verifies tag; returns NULL on failure. */
char* engram_aead_open(const char* record_id, uint32_t expected_epoch,
                       const char* b64, size_t* out_len);

/* Boot-time KEK unwrap. */
int engram_kek_unwrap(const char* kek_path, const uint8_t* principal_sk,
                      size_t sk_len);

/* DEK rotation (online). Walks the live in-memory store, re-seals every record
 * under DEK_{n+1}, then writes a new snapshot+kek atomically. */
int engram_dek_rotate(void);

End of document.