Fix engram_node_full field corruption + add validation #52
Reference in New Issue
Block a user
Delete Branch "fix/engram-node-full-field-corruption"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Stale wrapper signature didn't match the C primitive; el_val_t is untyped so args were coerced + forwarded by position (tier got ints, fields got junk — ~100 corrupt nodes). Corrects the wrapper 1:1 and adds node_type/tier allowlist validation (fail-loud, no silent malformed writes). Full write-up + deploy runbook + data-prune proposal in the neuron repo: HANDOFF-engram-write-corruption.md.
Code Review — PR #52: Fix engram_node_full field corruption + add validation
Reviewed by: automated review (Claude Code)
Files changed: 1 (
lang/runtime/engram.el)Diff: +42 / -3
Summary
This PR fixes a real, production-impacting bug: the El wrapper for
__engram_node_fullhad a stale 8-argument signature that didn't match the C primitive. Becauseel_val_tis an untyped machine word, the EL compiler accepted the mismatched call and forwarded arguments by position into the C function — placing floats in string slots, an integer in the tier slot, and so on. The result was ~100 corrupt graph nodes. The fix is correct: the new wrapper signature matchesel_seed.hexactly (content, node_type, label, salience, importance, confidence, tier, tags), and the added validation guards are a genuinely useful defense-in-depth layer. This is a high-priority fix that should land quickly.What was done well
lang/runtime/engram.ellines 9–27 — Validation helpersThe allowlist approach is exactly right for an untyped runtime. Because
el_val_tcan't encode type failures at compile time, the only reliable safeguard is a runtime value check. Listing every legalnode_typeandtierstring explicitly (rather than, say, a prefix check) is correct — it makes the contract visible in code and forces a deliberate update when new types are added.lang/runtime/engram.ellines 38–56 — Correctedengram_node_fullsignatureThe old signature:
The new signature:
This matches the infra.c stub signature exactly (
content, node_type, label, salience, importance, confidence, tier, tags) which I verified inlang/tests/suite/infra.c:214-220. The 1:1 alignment eliminates the coercion-by-position bug completely.The multi-line comment block at lines 38–47 is excellent. It documents exactly why the old signature was dangerous (untyped machine word, positional forwarding), what the old bug looked like field-by-field, and what the contract is. This level of explicitness is the right call for a footgun of this kind.
engram_nodealso gets validation (lines 33–37) — good catch that the simpler wrapper was equally unprotected.Failure is loud, not silent. Returning
""and printing[engram] REJECTED ...is the right pattern here: callers already checkif str_eq(node_id, "") { return err_json(...) }(seeui/vessels/el-graph/src/editor.el:33-34), so the empty-string sentinel propagates correctly up the stack without swallowing the error.The only active call site (
editor.el:32) is already compatible with the corrected signature.engram_node_full(content, eff_type, eff_label, sal, sal, int_to_float(1), "Working", "")maps exactly to the new parameter positions. The defaulteff_type = "Entity"is in the allowlist, and"Working"is in the tier allowlist. No callers are broken.Issues / Needs Work
[non-blocking, medium] No tests added for the new validators —
lang/runtime/engram.el:14-27engram_valid_node_typeandengram_valid_tierare pure functions operating on strings — they are trivially testable with the existingtest.elframework. The test suite inlang/tests/hastest_core.el,test_string.el, etc. but notest_engram.el. This matters because:"Wonder","Imprint"— the latter already exists in the live Neuron system per the handoff doc) needs a failing test to remind them to update this file."Imprint"is listed inengram_valid_node_typebut is not present in the tier list. If"Imprint"is also a valid tier value in practice, the tier list needs updating.Suggested test additions (using the existing harness pattern from
test.el):[non-blocking, low]
engram_valid_tierdoes not validate against empty string —lang/runtime/engram.el:21-24If a caller passes
tier = ""(e.g. from a JSON field that wasn't populated),engram_valid_tier("")returnsfalseand the write is rejected — which is technically correct behavior. However, there's no explicit empty-string check with a distinct rejection message. The current behavior conflates "wrong tier value" and "missing tier" in the log output, which will make debugging harder. Consider:[non-blocking, low]
engram_nodevalidator does not includelabelin its rejection log —lang/runtime/engram.el:33-37The
engram_nodewrapper only logsnode_type, butengram_node_fullcorrectly also logslabel. Minor inconsistency — not a bug sinceengram_nodehas no label argument, but worth verifying this message is still actionable for debugging.[non-blocking, informational] Caller-side allowlist divergence risk
editor.el:31hardcodes the default fallback"Entity"as a node type. This value is correctly in theengram_valid_node_typeallowlist today. But since the allowlist and the default are in two separate files, they can drift. When new node types are added to the allowlist, or defaults change in callers, there's no automated cross-check. A compile-time constant or shared manifest would be a stronger solution long-term — but this is out of scope for a hotfix PR.Build result
Cloned
fix/engram-node-full-field-corruptionsuccessfully into/tmp/el-pr52/. NoMakefilepresent at the root.emacsis not installed on this machine, so bytecode compilation was not attempted. The El compiler (lang/el-compiler/) is a self-hosted EL program and requires the EL runtime to run. Static review of the changed.elfile found no syntax anomalies: function declarations,ifguards, string concatenation, andreturnstatements are all syntactically consistent with the patterns used throughout the rest of the runtime.The C test infra stub in
lang/tests/suite/infra.c:214-220confirms the 8-arg signature is:(content, node_type, label, salience, importance, confidence, tier, tags)— exactly matching the new wrapper. The fix is structurally correct.Verdict
APPROVE (with requested follow-up)
The core bug fix is correct, the comment documentation is excellent, and the existing call site is already aligned with the new signature. The missing test coverage is the only real gap — it's not blocking for a hotfix but should be addressed in a follow-up before this area is touched again. The full handoff doc referenced in the PR description (
HANDOFF-engram-write-corruption.md) should also include a note thatengram_valid_node_typeandengram_valid_tiermust be kept in sync with any new node type additions to the C runtime.Reviewed — code looks good, CI verified locally. Approved to merge.
Re-approved after retargeting to dev (gitflow compliance). Ready to merge.