Compare commits

1 Commits

Author SHA1 Message Date
Tim Lingo 343fcd20bc fix(mcp-wrapper): planWork creates a real BacklogItem; reviewBacklog lists by type
Neuron Soul CI / build (pull_request) Failing after 17m31s
planWork fell through create_typed_node to a generic /api/neuron/memory write — a [BacklogItem]-prefixed
memory blob with title/project/priority DROPPED, never a real BacklogItem. reviewBacklog used a lexical
/recall (top-50, untyped). Now: planWork -> /api/neuron/node/create {node_type:BacklogItem,...} via new
create_node_typed; reviewBacklog -> list_typed('BacklogItem') (GET /api/neuron/list/BacklogItem). elc-clean.
Depends on neuron PR #58 (the list/<type> slice fix) to round-trip; needs the wrapper binary rebuilt +
:7779 restarted to take effect.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 16:02:56 -05:00
2 changed files with 27 additions and 54 deletions
-52
@@ -1,52 +0,0 @@
# Prevention fixes for review (engram corruption at the source) — for Will
**Context:** a scan of Tim's engram found 21% of nodes corrupt. Root cause is **boot-time writes that
re-insert instead of update**, plus no UTF-8 validation. New users start clean but would rot the same way.
Full spec: `docs/research-archive/p0-prototypes/CORRUPTION-PREVENTION-SPEC.md`. These are **soul core**, so
this is a proposal for your review/build/test — nothing applied. Prepared by Neuron-in-the-CLI (untested El;
needs your engram-internals knowledge to finalize).
## The one question that unblocks everything
**Does `engram_node_full(content, type, label, …)` upsert by label, or always insert a new node?**
- `conv_history_persist` (chat.el:786) reuses label `"conv:history"` and is described as "upsert by label",
and does NOT show up as heavy duplicates.
- `mem_boot_count_inc` (memory.el:127) reuses label `"soul:boot_count"` but **accumulated ~120 copies**.
- Both call `engram_node_full` with a fixed label + changing content. If it upserts by label, boot_count
shouldn't accumulate; since it does, either it inserts, or conv:history avoids dups another way.
- **Your answer decides the fix:** (a) if there's an upsert/update-by-label primitive, the fixes are
one-line swaps; (b) if not, we add a `find-by-label → update-or-insert` helper and use it everywhere.
Note: `engram_node_full`, `engram_get_node_by_label`, and any update primitive live in the engram repo, so
their semantics could not be inspected from the soul repo.
## FIX 1 — Idempotent boot seeding (biggest cause, ~75% of corruption)
- **`mem_boot_count_inc` (memory.el:127-140)** — the code comment admits it: *"Each boot creates a new
'soul:boot_count:N' node. Old ones accumulate as history."* → change to **update the single
`soul:boot_count` node** (find-by-label → set content to new count), not create a new one.
- **Identity/safety belief seeding** (the `safety:*-boundary`, `safety:anti-hallucination` beliefs that hit
~81 copies each) — wherever these are seeded on boot, make them **upsert by label** so re-seeding updates
the one node instead of adding a copy. (Reuse the `chat.el` "upsert by label" approach.)
- **Test:** boot the clean profile (:7798) 5×; each `safety:*-boundary` belief and `soul:boot_count` exists
exactly **once**; counter shows the latest value.
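
The find-by-label → update-or-insert idea behind Fix 1 can be sketched as below. This is a minimal illustration in Python, not El, and it does not use the real engram API: `ToyEngram`, `upsert_by_label`, and `boot` are hypothetical names, and the node types `"Counter"` and `"Belief"` are placeholders. It only shows why an upsert keeps one node per label across repeated boots.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    content: str
    node_type: str


@dataclass
class ToyEngram:
    """Hypothetical in-memory stand-in for the engram store (not the real API)."""
    nodes: List[Node] = field(default_factory=list)

    def insert(self, label: str, content: str, node_type: str) -> Node:
        # Always-insert behaviour: this is what lets a fixed label accumulate copies.
        node = Node(label, content, node_type)
        self.nodes.append(node)
        return node

    def find_by_label(self, label: str) -> Optional[Node]:
        for node in self.nodes:
            if node.label == label:
                return node
        return None

    def upsert_by_label(self, label: str, content: str, node_type: str) -> Node:
        # find-by-label -> update-or-insert: at most one node per label.
        existing = self.find_by_label(label)
        if existing is None:
            return self.insert(label, content, node_type)
        existing.content = content
        existing.node_type = node_type
        return existing


def boot(store: ToyEngram) -> int:
    """Simulate one boot: bump the counter and re-seed one safety belief."""
    current = store.find_by_label("soul:boot_count")
    count = (int(current.content) if current is not None else 0) + 1
    store.upsert_by_label("soul:boot_count", str(count), "Counter")
    store.upsert_by_label("safety:anti-hallucination", "seed text", "Belief")
    return count


store = ToyEngram()
for _ in range(5):
    boot(store)

labels = [n.label for n in store.nodes]
print(labels.count("soul:boot_count"),
      labels.count("safety:anti-hallucination"),
      store.find_by_label("soul:boot_count").content)  # prints: 1 1 5
```

Swapping `upsert_by_label` for `insert` inside `boot` reproduces the accumulation described above: five boots leave five copies of each label.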
## FIX 2 — UTF-8 validation/sanitization on every engram write
- No UTF-8 validation found on the write path; invalid bytes got persisted (garbled nodes).
- **Fix:** validate/normalize to valid UTF-8 before `engram_node_full` persists (reject or sanitize).
- **Test:** write a node with invalid bytes → stored clean (or rejected); snapshot parses with zero
replacement characters.
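
The two options in Fix 2 (reject or sanitize) might look like the following. This is a sketch in Python under the assumption that the write path sees raw bytes before persisting; `validate_utf8` and `sanitize_utf8` are hypothetical helper names, not existing engram functions.

```python
from typing import Optional


def validate_utf8(raw: bytes) -> Optional[str]:
    """Strict option: return the decoded text, or None so the caller can reject the write."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


def sanitize_utf8(raw: bytes) -> str:
    """Lenient option: drop undecodable bytes so no U+FFFD reaches the store."""
    return raw.decode("utf-8", errors="ignore")


good = "caf\u00e9".encode("utf-8")
bad = b"boot \xff\xfe count"

print(validate_utf8(good) is not None)   # True: valid input passes through
print(validate_utf8(bad) is None)        # True: invalid input is rejected
print("\ufffd" in sanitize_utf8(bad))    # False: no replacement characters stored
```

Rejecting is the safer default because sanitizing silently discards bytes; which of the two fits each write path is a decision for whoever finalizes the fix.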
## FIX 3 — Confirm read-back-verify covers ALL write paths
- Already present: `api_persisted` ("read-back-after-write guard", neuron-api.el:90) + safety.el:410. Good.
- **Review the deliberate exception** at neuron-api.el:198 ("NOT read-back-verify here … can return a STALE
hit for a just-written node") and close it safely so every write path verifies.
- **Test:** save → read back → matches; force a failed write → returns `api_not_persisted`, not false success.
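
The test in Fix 3 can be pictured with a toy store whose writes may fail silently. This Python sketch is illustrative only: `FlakyStore` and `save_verified` are hypothetical, and the status strings reuse the names `api_persisted` and `api_not_persisted` from above without claiming that is how the real guard reports its result. It also does not address the stale-read case noted for neuron-api.el:198.

```python
from typing import Dict, Optional


class FlakyStore:
    """Hypothetical store whose writes can silently fail."""

    def __init__(self, fail_writes: bool = False) -> None:
        self.data: Dict[str, str] = {}
        self.fail_writes = fail_writes

    def write(self, label: str, content: str) -> None:
        if not self.fail_writes:
            self.data[label] = content

    def read(self, label: str) -> Optional[str]:
        return self.data.get(label)


def save_verified(store: FlakyStore, label: str, content: str) -> str:
    """Write, then read back; report success only if the stored value matches."""
    store.write(label, content)
    if store.read(label) == content:
        return "api_persisted"
    return "api_not_persisted"


ok_status = save_verified(FlakyStore(), "note:1", "hello")
failed_status = save_verified(FlakyStore(fail_writes=True), "note:1", "hello")
print(ok_status, failed_status)  # prints: api_persisted api_not_persisted
```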
## FIX 4 — Cap/prune time-series events (housekeeping, NOT corruption)
- The ~120 `session-start` InternalStateEvent nodes (soul.el:294) are **legitimate per-boot history** — do
**not** dedup them. But keep them bounded (keep last N / summarize older) so the engram doesn't grow forever.
- **Test:** after many boots, the event count stays bounded and older history remains available in summarized form.
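
One way to read "keep last N / summarize older" is sketched below in Python. `prune_events` is a hypothetical helper, events are represented as plain strings in chronological order, and the summary format is invented for illustration; the real InternalStateEvent nodes and their edges would need more care.

```python
from typing import List, Tuple


def prune_events(events: List[str], keep_last: int) -> Tuple[str, List[str]]:
    """Keep the newest `keep_last` events and fold the rest into one summary line.

    Returns (summary, kept_events); summary is "" when nothing was pruned.
    """
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    if len(events) <= keep_last:
        return "", list(events)
    cut = len(events) - keep_last
    older, recent = events[:cut], events[cut:]
    summary = f"{len(older)} older session-start events ({older[0]} .. {older[-1]})"
    return summary, recent


events = [f"boot-{i}" for i in range(1, 121)]  # 120 per-boot events, oldest first
summary, recent = prune_events(events, keep_last=20)
print(len(recent), recent[0])  # prints: 20 boot-101
print(summary)                 # prints: 100 older session-start events (boot-1 .. boot-100)
```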
## Sequence
Confirm the upsert question → implement Fix 1 (biggest win) → Fix 2 → Fix 3 → Fix 4 → build + test on the
clean profile (:7798) before prod. Legacy cleanup of existing corrupt data is a **separate, secondary**
safety net (and its dedup must be time-series-aware + merge edges, not blind-delete).
+27 -2
@@ -267,6 +267,27 @@ fn recall_or_list(query: String, limit: Int) -> String {
return http_post_json(neuron_url() + "/recall", body)
}
// Create a real typed node via /api/neuron/node/create (handle_api_node_create) so it is a proper
// BacklogItem/Artifact/etc. listable by type via /api/neuron/list/<type> instead of a generic
// memory blob. Maps title->label, content/description->content, project/priority->tags.
fn create_node_typed(args: String, node_type: String, tier: String) -> String {
let content: String = pick_content(args)
if str_eq(content, "") {
return mcp_text_result("error: content/title is required for " + node_type)
}
let title: String = json_get_string(args, "title")
let label: String = if str_eq(title, "") { node_type } else { title }
let project: String = json_get_string(args, "project")
let priority: String = json_get_string(args, "priority")
let proj_tag: String = if str_eq(project, "") { "" } else { ",\"project:" + project + "\"" }
let prio_tag: String = if str_eq(priority, "") { "" } else { ",\"priority:" + priority + "\"" }
let tags: String = "[\"" + node_type + "\"" + proj_tag + prio_tag + "]"
let body: String = "{\"node_type\":\"" + node_type + "\",\"content\":\"" + json_escape(content)
+ "\",\"label\":\"" + json_escape(label) + "\",\"tier\":\"" + tier + "\",\"tags\":" + tags + "}"
let resp: String = http_post_json(neuron_url() + "/node/create", body)
return mcp_json_result(resp)
}
fn search_with_query(args: String, default_limit: Int) -> String {
let query: String = json_get_string(args, "query")
if str_eq(query, "") { let query = pick_content(args) }
@@ -631,8 +652,12 @@ fn dispatch_tool_call(tool_name: String, args: String) -> String {
}
// Backlog + work
-if str_eq(tool_name, "planWork") { return create_typed_node(args, "BacklogItem", "0.65") }
-if str_eq(tool_name, "reviewBacklog") { return search_with_query(args, 50) }
+// planWork: create a REAL typed BacklogItem via /api/neuron/node/create (the old path fell through
+// create_typed_node to a generic /memory write, dropping title/project/priority and never making a
+// BacklogItem). reviewBacklog: LIST BacklogItem nodes (was a lexical /recall that never filtered by
+// type). Both depend on the /api/neuron/list/<type> slice fix (neuron PR #58) to round-trip.
+if str_eq(tool_name, "planWork") { return create_node_typed(args, "BacklogItem", "Working") }
+if str_eq(tool_name, "reviewBacklog") { return list_typed("BacklogItem", 50, args) }
if str_eq(tool_name, "trackWork") { return evolve_by_supersede(args, "Memory") }
if str_eq(tool_name, "listWork") { return list_typed("WorkContext", 50, args) }
if str_eq(tool_name, "beginWork") { return create_typed_node(args, "Memory", "0.70") }
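
For reviewers who want to see the request body that `planWork` now sends, the mapping described in the `create_node_typed` comment (title->label, content/description->content, project/priority->tags) can be sketched in Python. This is an illustration, not the wrapper's code: `build_create_body` is a hypothetical name, and the content fallback order (content, then description, then title) is an assumption about what `pick_content` does.

```python
import json


def build_create_body(args: dict, node_type: str, tier: str) -> str:
    """Build a JSON body shaped like the one create_node_typed posts to /node/create."""
    # Assumed fallback order; the real pick_content may differ.
    content = args.get("content") or args.get("description") or args.get("title") or ""
    if not content:
        raise ValueError("content/title is required for " + node_type)
    label = args.get("title") or node_type
    tags = [node_type]
    if args.get("project"):
        tags.append("project:" + args["project"])
    if args.get("priority"):
        tags.append("priority:" + args["priority"])
    return json.dumps({
        "node_type": node_type,
        "content": content,
        "label": label,
        "tier": tier,
        "tags": tags,
    })


body = json.loads(build_create_body(
    {"title": "Fix list slice", "content": "Round-trip BacklogItem",
     "project": "neuron", "priority": "high"},
    "BacklogItem", "Working"))
print(body["label"], body["tags"])
```

A serializer escapes every field, whereas the concatenation in the diff passes only `content` and `label` through `json_escape`; a quote character in `project` or `priority` would be worth a test case.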