Atomic engram_save + anti-clobber floor (validated, darwin build recipe) #57
Preserves the validated engram-clobber fix and makes it reviewable. Base is the
live-darwin-runtime snapshot (the un-versioned source the live :7770 soul is built
from), so the diff is ONLY the engram_save change + a darwin build script.
The fix (engram_save)
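1. Atomic write: serialize to <path>.tmp, fflush + fsync, then rename() over the target (atomic
on POSIX). No reader ever sees a truncated/0-byte snapshot; that empty-window race
was the root of the genesis -> nodes=1 -> 63-node-clobber loop.
2. Sparse-write floor: refuse to overwrite a >200KB snapshot with one < 1/16 its size.
A partial load can never clobber a healthy graph, whatever the upstream cause.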
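The atomic-write pattern (temp file, flush, rename) can be sketched in C as follows, assuming a POSIX system. atomic_write_file is a hypothetical helper name, not the PR's actual code: the real engram_save serializes the graph itself rather than taking a buffer, and the 4096-byte path buffer is an arbitrary choice for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (illustration only, not the PR's code): write `len`
 * bytes to `path` so that a concurrent reader sees either the old file or
 * the complete new one, never a truncated or 0-byte file. */
static int atomic_write_file(const char *path, const void *data, size_t len)
{
    char tmp[4096];                         /* assumption: paths fit here */
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                          /* path too long for the buffer */

    FILE *f = fopen(tmp, "wb");             /* truncates only the temp file */
    if (!f)
        return -1;

    int ok = fwrite(data, 1, len, f) == len
          && fflush(f) == 0                 /* stdio buffer -> kernel */
          && fsync(fileno(f)) == 0;         /* kernel -> stable storage */
    if (fclose(f) != 0)
        ok = 0;

    if (ok && rename(tmp, path) == 0)       /* atomic replace on POSIX */
        return 0;

    remove(tmp);                            /* leave no .tmp behind on failure */
    return -1;
}
```

This sketch does not fsync the containing directory after rename(); a stricter variant might do so if surviving a power loss right after the save matters.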
Validation (in isolation, nothing live touched)
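1. Standalone clang harness: 11/11 (including no .tmp leftover, sub-200KB engrams unaffected).
2. Rebuilt the darwin soul (scripts/build-soul-darwin.sh) and booted it on an
isolated port against a golden copy: loaded 5113 nodes, round-tripped the full 47MB
snapshot, no .tmp leftover, live ~/.neuron untouched.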
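The "sub-200KB engrams unaffected" case follows from how the sparse-write floor is defined: refuse to overwrite a >200KB snapshot with one < 1/16 its size. A minimal sketch of that predicate in C is below. The function and macro names are hypothetical, and reading "200KB" as 200 * 1024 bytes is an assumption; the PR does not say which unit the real code uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: "200KB" means 200 KiB; the actual constant may differ. */
#define FLOOR_MIN_EXISTING ((size_t)200 * 1024)
#define FLOOR_RATIO        16

/* Hypothetical predicate: true when a save should be refused because the
 * new snapshot is suspiciously small compared to the one already on disk. */
static bool save_would_clobber(size_t existing_bytes, size_t new_bytes)
{
    if (existing_bytes <= FLOOR_MIN_EXISTING)
        return false;                       /* small engrams are never blocked */
    return new_bytes < existing_bytes / FLOOR_RATIO;
}
```

In a real save path the existing size would typically come from stat() on the target before the write starts; a 47MB snapshot would then reject any replacement smaller than roughly 2.9MB.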
scripts/build-soul-darwin.sh
Replicates elb on macOS/arm64 with clang (elb ships Linux-only via CI). Lets us build
and test the darwin soul locally. Key details: pass -Wno-implicit-function-declaration
(the dist modules rely on C89-style implicit cross-module declarations, which Apple clang
rejects), and link *.o once.
NOT this PR
Landing this on main is a separate reconciliation (the live runtime diverges from main —
it lacks main's engram_wm_*/engram_load_merge/http_serve_async). Do not blind-merge.
Opened by Neuron on Tim's machine; validated, not yet deployed to live :7770.
Kills the engram-clobber loop at its source. engram_save did a bare fopen("wb") that truncates snapshot.json to 0 bytes before the 47MB write; a booting soul's engram_load could read that empty window -> genesis -> nodes=1 -> a 63-node save overwrote the populated file.

Two guards:
1. Atomic write: serialize to <path>.tmp, fflush+fsync, rename() over target (atomic on POSIX), so no reader ever sees a truncated/0-byte snapshot.
2. Sparse-write floor: refuse to overwrite a >200KB snapshot with one < 1/16 its size, so a partial load can never clobber a healthy graph, whatever the cause.

Validated in isolation: standalone clang harness 11/11; rebuilt the darwin soul (scripts/build-soul-darwin.sh) and booted it on an isolated port against a golden copy. It loaded 5113 nodes and round-tripped the full 47MB snapshot, no .tmp leftover, live ~/.neuron untouched.

Adds scripts/build-soul-darwin.sh (local elb replacement).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Pull request closed