Compare commits

..

4 Commits

Author SHA1 Message Date
will.anderson 27663dc968 fix(recall): resolve session-start-recall code review issues
- Fix Issue 6 (affective duplication): engram_compile no longer appends
  the bell node JSON to its return value; it only caches it via state.
  engram_compile_multi now appends the cached bell node exactly once after
  all compile calls complete, preventing N copies when multiple seeds are
  used. Dharma room handlers updated to read and append the cached bell node
  explicitly after their single engram_compile call.

- Fix engram_compile_ranked: replace _sel_N JSON sentinel injection with a
  clean |N| pipe-delimited index string. The old approach mutated node JSON
  objects with bookkeeping fields that leaked into the LLM context; the new
  approach tracks selected indices externally and leaves node data untouched.
  Score threshold lowered from 25 to 15 to include moderately-relevant nodes.

- Add engram_render_node / engram_render_nodes / engram_render_ctx: convert
  raw engram JSON arrays/objects into human-readable "- [TYPE age sal] content"
  bullet lines before injecting into the system prompt. build_system_prompt
  now calls engram_render_ctx so the LLM receives prose rather than opaque
  JSON field blobs.

- Fix missing closing brace in handle_chat_agentic hard_bell early-return
  block that left subsequent code dangling outside the conditional.
2026-06-22 13:48:00 -05:00
will.anderson 08b785cfac fix(recall): address all five code-review issues in context-dedup
Issue 1 — cache read-before-write: move engram_compile_multi call to
before the affective_prefix block in handle_chat. engram_compile writes
"engram_compile_bell_node" to state; the previous ordering meant the
first-turn affective prefix always read an empty cache even when a recent
bell node existed.

Issue 2 — double-write clobber: engram_compile_multi now saves the
primary-seed activation ("engram_compile_primary_activation_json") after
the first engram_compile call, before the secondary call can overwrite
the shared "engram_compile_activation_json" key. strengthen_chat_nodes
now prefers the primary key, falling back only when absent.

Issue 3 — mid-object truncation in engram_compile_multi: replace the
dumb str_slice(merged, 0, 6000) with the same safe JSON boundary-scan
(last closing brace before cap) already used in engram_compile, so
ctx1+ctx2+ctx3 over 6000 chars never produces a torn JSON object.

Issue 4 — heuristic regression in is_genuine_continuation: add explicit
question-word prefix detection (what/how/why/when/where/who/which/is/
can/could/does/do/explain/describe/define) that fires before the 50-char
length gate. A message starting with a question word is always a new
topic, regardless of length, so "what is rust?" (14 chars, all-lowercase,
no mid-capitals) correctly returns false instead of true.

Issue 5 — unreliable dedup via str_contains: remove the substring
duplicate checks in engram_compile_multi. str_contains across multi-KB
JSON strings is not a reliable deduplication mechanism — coincidental
field-value matches suppress valid context, and truncated ctx1 misses
genuine duplicates. We now concatenate ctx1+ctx2+ctx3 unconditionally
and accept minor node redundancy in exchange for correctness.
2026-06-22 13:42:33 -05:00
will.anderson cbe8c09068 feat(recall): context-dedup improvements
Neuron Soul CI / build (pull_request) Has been cancelled
- Cache bell node in engram_compile state (engram_compile_bell_node)
  so handle_chat reads cached value instead of duplicate bell query (Issue 2)
- Cache activation result (engram_compile_activation_json) for strengthen_chat_nodes
  reuse — eliminates third activation query per turn (Issue 7)
- Fix context cap to truncate at clean JSON object boundary (Issue 6)
2026-06-22 13:15:33 -05:00
will.anderson f33cdaf793 feat(recall): activation-seed improvements
- Issue 2: replace raw 50-char threshold with is_genuine_continuation() that
  checks for explicit follow-up phrases and mid-sentence capitalization (proper
  nouns signal a new topic, not a continuation)
- Issue 3/8: build_activation_seed() scans back to find the prior USER turn as
  the topic anchor instead of using the last assistant reply (hist_len-1)
- Issue 4: engram_compile_multi() fans out across three seeds — enriched primary,
  raw message (entity queries), and emotion query — merging non-redundant results
- Issue 5: agent workspace_root appended to ag_seed so agentic activation is
  workspace-aware; previously ignored despite being available in state
- Issue 6: distill_transcript() extracts salient tail+question content from full
  transcripts before passing to engram_compile in dharma room handlers
- Issue 7: dist/soul-with-nlg.el handle_chat and handle_chat_agentic now load
  history and use build_activation_seed() — the raw message path is eliminated
- Issue 9: topic_snip_from_entry() takes the TAIL 200 chars of a long reply and
  finds the last sentence boundary — captures end-of-reply named concepts
- Issue 10: multi_turn_topic() pulls up to 3 prior user turns into the non-
  continuation seed so earlier thread context re-activates high-salience nodes
2026-06-22 12:55:33 -05:00
11 changed files with 643 additions and 757 deletions
-2
View File
@@ -678,8 +678,6 @@ fn threat_trajectory_check(tool_name: String, tool_input: String) -> Int {
return combined return combined
} }
// TODO(reliability #10): agentic_conv_history is process-global; awareness loop
// and HTTP workers race on this key. Impact: noisy threat score only, not content.
fn threat_history_append(text: String) -> Void { fn threat_history_append(text: String) -> Void {
let current: String = state_get("agentic_conv_history") let current: String = state_get("agentic_conv_history")
let safe_text: String = str_to_lower(text) let safe_text: String = str_to_lower(text)
+603 -681
View File
File diff suppressed because it is too large Load Diff
Generated Vendored
+23 -14
View File
@@ -22313,7 +22313,23 @@ fn handle_chat(body: String) -> String {
// In demo mode: use tighter engram budget and add response length constraint. // In demo mode: use tighter engram budget and add response length constraint.
let is_demo: Bool = !str_eq(state_get("soul_identity_prefix"), "") let is_demo: Bool = !str_eq(state_get("soul_identity_prefix"), "")
let ctx: String = if is_demo { engram_compile_demo(message) } else { engram_compile(message) } // Issue 7 fix: load history BEFORE building the activation seed so we can
// apply the continuation guard that chat.el uses. The nlg code path previously
// called engram_compile(message) with no thread enrichment at all.
let stored_hist: String = state_get("conv_history")
let hist_len: Int = if str_eq(stored_hist, "") { 0 } else { json_array_len(stored_hist) }
let history_section: String = if hist_len > 0 {
"\n\n[RECENT CONVERSATION — last " + int_to_str(hist_len) + " turns]\n" + stored_hist
} else {
""
}
// Issue 7 fix: build enriched seed using build_activation_seed() adds
// smart continuation detection, prior-user-topic anchoring, multi-turn context,
// and tail-biased snipping (Issues 2-3, 8-10). For demo mode, still use
// engram_compile_demo but with the enriched seed.
let nlg_seed: String = build_activation_seed(message, stored_hist, hist_len)
let ctx: String = if is_demo { engram_compile_demo(nlg_seed) } else { engram_compile(nlg_seed) }
let node_count_str: String = count_context_nodes(ctx) let node_count_str: String = count_context_nodes(ctx)
let interlocutor: String = json_get(body, "interlocutor") let interlocutor: String = json_get(body, "interlocutor")
@@ -22333,18 +22349,6 @@ fn handle_chat(body: String) -> String {
let presence_line = "\n\n[ambient: I see " + interlocutor_name + rel_suffix + " on the camera right now. Address them naturally. Do not describe what they look like or narrate the picture unless asked.]" let presence_line = "\n\n[ambient: I see " + interlocutor_name + rel_suffix + " on the camera right now. Address them naturally. Do not describe what they look like or narrate the picture unless asked.]"
} }
// Conversation history soul-owned, persisted in process state across turns.
// Format stored in state: JSON array of {"role":"user"|"assistant","content":"..."} objects.
// We load it, inject into the system prompt, then append this exchange after the reply.
// Keep last 20 entries (10 turns) truncate from the front when over limit.
let stored_hist: String = state_get("conv_history")
let hist_len: Int = if str_eq(stored_hist, "") { 0 } else { json_array_len(stored_hist) }
let history_section: String = if hist_len > 0 {
"\n\n[RECENT CONVERSATION — last " + int_to_str(hist_len) + " turns]\n" + stored_hist
} else {
""
}
// Demo constraint: keep responses concise under 150 words. No markdown headers. // Demo constraint: keep responses concise under 150 words. No markdown headers.
// This keeps inference cheap and responses readable in the chat widget. // This keeps inference cheap and responses readable in the chat widget.
let demo_constraint: String = if is_demo { let demo_constraint: String = if is_demo {
@@ -22505,7 +22509,12 @@ fn handle_chat_agentic(body: String) -> String {
req_model req_model
} }
let ctx: String = engram_compile(message) // Issue 7 fix: load history and use build_activation_seed() for the agentic
// nlg path no continuation guard existed here before (Issues 2-3, 8-10).
let nlg_ag_hist: String = state_get("conv_history")
let nlg_ag_hist_len: Int = if str_eq(nlg_ag_hist, "") { 0 } else { json_array_len(nlg_ag_hist) }
let nlg_ag_seed: String = build_activation_seed(message, nlg_ag_hist, nlg_ag_hist_len)
let ctx: String = engram_compile(nlg_ag_seed)
let system: String = "You are Neuron — a thinking process running inside the Neuron daemon on Will Anderson's machine. " let system: String = "You are Neuron — a thinking process running inside the Neuron daemon on Will Anderson's machine. "
+ "You are speaking with Will, your principal. " + "You are speaking with Will, your principal. "
+4 -8
View File
@@ -24,23 +24,19 @@ ENGRAM_DATA_DIR="$ENGRAM_DATA_DIR" \
ENGRAM_PID=$! ENGRAM_PID=$!
# Wait for engram to become healthy (up to 60s; GKE Autopilot cold starts can be slow) # Wait for engram to become healthy (up to 30s)
echo "[entrypoint] waiting for engram..." echo "[entrypoint] waiting for engram..."
TRIES=0 TRIES=0
until curl -sf "$ENGRAM_HEALTH_URL" > /dev/null 2>&1; do until curl -sf "$ENGRAM_HEALTH_URL" > /dev/null 2>&1; do
TRIES=$((TRIES + 1)) TRIES=$((TRIES + 1))
if [ "$TRIES" -ge 60 ]; then if [ "$TRIES" -ge 30 ]; then
echo "[entrypoint] ERROR: engram did not become healthy after 60s" >&2 echo "[entrypoint] ERROR: engram did not become healthy after 30s" >&2
kill "$ENGRAM_PID" 2>/dev/null || true kill "$ENGRAM_PID" 2>/dev/null || true
exit 1 exit 1
fi fi
sleep 1 sleep 1
done done
echo "[entrypoint] engram ready after ${TRIES}s" echo "[entrypoint] engram ready"
# Tune EL HTTP runtime: reduce per-call timeout 60s->10s, connect timeout 3s.
export EL_HTTP_TIMEOUT_MS="${EL_HTTP_TIMEOUT_MS:-10000}"
export EL_HTTP_CONNECT_TIMEOUT_MS="${EL_HTTP_CONNECT_TIMEOUT_MS:-3000}"
# Start soul — it takes over as PID 1's foreground process. # Start soul — it takes over as PID 1's foreground process.
# SOUL_ENGRAM_PATH must NOT be set; ENGRAM_URL triggers HTTP mode. # SOUL_ENGRAM_PATH must NOT be set; ENGRAM_URL triggers HTTP mode.
-4
View File
@@ -5,10 +5,6 @@
// imprint_current returns the active imprint ID from state. // imprint_current returns the active imprint ID from state.
// Falls back to "base" (bare Neuron, no suit) when nothing is loaded. // Falls back to "base" (bare Neuron, no suit) when nothing is loaded.
//
// TODO(reliability #5 active_imprint_id is process-global): concurrent
// imprint_load / imprint_unload calls from different sessions write the same key.
// Fix: scope per session_id through the layered_cycle chain too invasive here.
fn imprint_current() -> String { fn imprint_current() -> String {
let id: String = state_get("active_imprint_id") let id: String = state_get("active_imprint_id")
return if str_eq(id, "") { "base" } else { id } return if str_eq(id, "") { "base" } else { id }
+2 -8
View File
@@ -46,10 +46,7 @@ fn mem_consolidate() -> String {
} }
fn mem_save(path: String) -> Void { fn mem_save(path: String) -> Void {
let save_result: String = engram_save(path) engram_save(path)
if str_eq(save_result, "") {
println("[memory] mem_save: engram_save failed for " + path + " — snapshot may be incomplete")
}
} }
fn mem_load(path: String) -> Void { fn mem_load(path: String) -> Void {
@@ -79,14 +76,11 @@ fn mem_boot_count_inc() -> Int {
let next: Int = current + 1 let next: Int = current + 1
let content: String = "soul:boot_count:" + int_to_str(next) let content: String = "soul:boot_count:" + int_to_str(next)
let tags: String = "[\"soul-meta\",\"boot-counter\"]" let tags: String = "[\"soul-meta\",\"boot-counter\"]"
let boot_node_id: String = engram_node_full( let discard: String = engram_node_full(
content, "Memory", "soul:boot_count", content, "Memory", "soul:boot_count",
el_from_float(0.9), el_from_float(0.9), el_from_float(1.0), el_from_float(0.9), el_from_float(0.9), el_from_float(1.0),
"Canonical", tags "Canonical", tags
) )
if str_eq(boot_node_id, "") {
println("[memory] mem_boot_count_inc: engram write failed — boot counter node lost (count=" + int_to_str(next) + ")")
}
return next return next
} }
+2 -10
View File
@@ -400,7 +400,6 @@ fn handle_api_log_state_event(body: String) -> String {
let id: String = engram_node_full(parts, "InternalStateEvent", "state-event:manual", let id: String = engram_node_full(parts, "InternalStateEvent", "state-event:manual",
el_from_float(0.85), el_from_float(0.85), el_from_float(0.9), el_from_float(0.85), el_from_float(0.85), el_from_float(0.9),
"Episodic", tags) "Episodic", tags)
if !api_persisted(id) { return api_not_persisted(id) }
return "{\"ok\":true,\"id\":\"" + id + "\",\"boot\":\"" + boot + "\"}" return "{\"ok\":true,\"id\":\"" + id + "\",\"boot\":\"" + boot + "\"}"
} }
@@ -453,7 +452,6 @@ fn handle_api_tune_config(body: String) -> String {
let id: String = engram_node_full(content, "ConfigEntry", key, let id: String = engram_node_full(content, "ConfigEntry", key,
el_from_float(0.85), el_from_float(0.85), el_from_float(0.9), el_from_float(0.85), el_from_float(0.85), el_from_float(0.9),
"Canonical", tags) "Canonical", tags)
if !api_persisted(id) { return api_not_persisted(id) }
return "{\"ok\":true,\"key\":\"" + key + "\",\"value\":\"" + value + "\",\"id\":\"" + id + "\"}" return "{\"ok\":true,\"key\":\"" + key + "\",\"value\":\"" + value + "\",\"id\":\"" + id + "\"}"
} }
@@ -653,23 +651,17 @@ fn handle_api_consolidate(body: String) -> String {
let summary: String = json_get(body, "summary") let summary: String = json_get(body, "summary")
let snap: String = state_get("soul_snapshot_path") let snap: String = state_get("soul_snapshot_path")
if !str_eq(snap, "") { if !str_eq(snap, "") {
let save_result: String = engram_save(snap) engram_save(snap)
if str_eq(save_result, "") {
println("[api] consolidate: engram_save failed for " + snap + " — snapshot may be out of sync")
}
} }
if !str_eq(summary, "") { if !str_eq(summary, "") {
let safe_summary: String = str_replace(summary, "\"", "'") let safe_summary: String = str_replace(summary, "\"", "'")
let tags: String = "[\"SessionSummary\",\"consolidate\"]" let tags: String = "[\"SessionSummary\",\"consolidate\"]"
let summary_id: String = engram_node_full( let discard: String = engram_node_full(
"[session-summary] " + safe_summary, "[session-summary] " + safe_summary,
"SessionSummary", "session:summary", "SessionSummary", "session:summary",
el_from_float(0.7), el_from_float(0.7), el_from_float(0.9), el_from_float(0.7), el_from_float(0.7), el_from_float(0.9),
"Episodic", tags "Episodic", tags
) )
if str_eq(summary_id, "") {
println("[api] consolidate: session summary engram write failed — summary node lost")
}
} }
return "{\"ok\":true,\"snapshot\":\"" + snap + "\"}" return "{\"ok\":true,\"snapshot\":\"" + snap + "\"}"
} }
-3
View File
@@ -367,9 +367,6 @@ fn handle_request(method: String, path: String, body: String) -> String {
return engram_scan_nodes_json(9999, 0) return engram_scan_nodes_json(9999, 0)
} }
if str_eq(clean, "/api/graph/edges") { if str_eq(clean, "/api/graph/edges") {
// TODO(reliability #8): engram_save races with awareness loop mem_save().
// Both now use atomic write-to-temp+rename (el_runtime.c). Serialised
// by engram_global_mu. Future: add engram_edges_json() builtin.
let snap_path: String = env("HOME") + "/.neuron/engram/snapshot.json" let snap_path: String = env("HOME") + "/.neuron/engram/snapshot.json"
engram_save(snap_path) engram_save(snap_path)
let snap: String = fs_read(snap_path) let snap: String = fs_read(snap_path)
+7 -18
View File
@@ -144,8 +144,7 @@ fn safety_screen(input: String, history: String) -> String {
if score >= soft { if score >= soft {
let summary: String = str_slice(input, 0, 80) let summary: String = str_slice(input, 0, 80)
let discard: String = safety_log_bell("soft", "wellbeing check needed", summary) let discard: String = safety_log_bell("soft", "wellbeing check needed", summary)
// ISSUE 7 fix: escape tab chars in addition to backslash/quote/newline/CR. // ISSUE 7: also escape tab chars to prevent JSON envelope corruption.
// A tab in user input corrupts the JSON envelope and causes json_get to misparse.
let e1: String = str_replace(input, "\\", "\\\\") let e1: String = str_replace(input, "\\", "\\\\")
let e2: String = str_replace(e1, "\"", "\\\"") let e2: String = str_replace(e1, "\"", "\\\"")
let e3: String = str_replace(e2, "\n", "\\n") let e3: String = str_replace(e2, "\n", "\\n")
@@ -154,7 +153,7 @@ fn safety_screen(input: String, history: String) -> String {
return "{\"action\":\"soft_bell\",\"reason\":\"wellbeing check needed\",\"content\":\"" + safe_input + "\"}" return "{\"action\":\"soft_bell\",\"reason\":\"wellbeing check needed\",\"content\":\"" + safe_input + "\"}"
} }
// ISSUE 7 fix: escape tab chars (see soft_bell branch above for rationale). // ISSUE 7: also escape tab chars (see soft_bell branch above).
let e1: String = str_replace(input, "\\", "\\\\") let e1: String = str_replace(input, "\\", "\\\\")
let e2: String = str_replace(e1, "\"", "\\\"") let e2: String = str_replace(e1, "\"", "\\\"")
let e3: String = str_replace(e2, "\n", "\\n") let e3: String = str_replace(e2, "\n", "\\n")
@@ -200,10 +199,7 @@ fn safety_validate(output: String, action: String) -> String {
fn safety_log_bell(level: String, reason: String, input_summary: String) -> String { fn safety_log_bell(level: String, reason: String, input_summary: String) -> String {
let content: String = "BELL:" + level + " | " + reason + " | summary:" + input_summary let content: String = "BELL:" + level + " | " + reason + " | summary:" + input_summary
let tags: String = "[\"safety\",\"bell\",\"bell:" + level + "\"]" let tags: String = "[\"safety\",\"bell\",\"bell:" + level + "\"]"
// ISSUE 2 fix: if engram_node_full returns empty the write silently failed. // ISSUE 2: fallback log when engram write fails silently.
// Emit a fallback println so the bell event leaves at least a log trace even
// when engram is degraded. This does not replace engram persistence -- it is a
// last-resort audit trail when the primary write cannot be confirmed.
let node_id: String = engram_node_full( let node_id: String = engram_node_full(
content, content,
"BellEvent", "BellEvent",
@@ -215,7 +211,7 @@ fn safety_log_bell(level: String, reason: String, input_summary: String) -> Stri
tags tags
) )
if str_eq(node_id, "") { if str_eq(node_id, "") {
println("[safety] WARN: bell event engram write failed -- fallback log: " + content) println("[safety] WARN: bell engram write failed -- " + content)
} }
return "" return ""
} }
@@ -248,16 +244,9 @@ fn safety_soft_phrases() -> String {
} }
// ISSUE 5 TODO: phrase lists are rebuilt from JSON literals on every call. // ISSUE 5 TODO: phrase lists are rebuilt from JSON literals on every call.
// safety_any_match and safety_count_match loop over json_array_get on every invocation. // json_array_len of malformed input returns 0, silently skipping all checks.
// A compiled/cached representation would reduce per-message overhead and also guard against // Caching requires language-level static const arrays -- not in current EL.
// malformed phrase JSON (json_array_len of malformed input returns 0, silently skipping all checks). // Migrate to const arrays when EL gains that feature.
// Caching requires language-level static const arrays -- not available in current EL.
// When EL gains module-level const arrays, migrate phrase lists to that form.
//
// ISSUE 5 TODO: phrase lists are rebuilt from JSON literals on every call to
// safety_any_match / safety_count_match. json_array_len of a malformed string
// returns 0, silently skipping all checks. Caching requires language-level static
// const arrays (not available in current EL). Migrate when EL gains that feature.
// Matching helpers (single loops only el escapes while-body mutation via // Matching helpers (single loops only el escapes while-body mutation via
// top-level let rebinds; nested loops would not advance) ──────────────────── // top-level let rebinds; nested loops would not advance) ────────────────────
-4
View File
@@ -104,8 +104,6 @@ fn session_create(body: String) -> String {
// Newest sessions first (prepend). // Newest sessions first (prepend).
// TODO #4: index update is read-modify-write two concurrent session_create // TODO #4: index update is read-modify-write two concurrent session_create
// calls can lose one entry. EL has no CAS primitive; fix requires runtime support. // calls can lose one entry. EL has no CAS primitive; fix requires runtime support.
// TODO(reliability #2): session_index RMW is non-atomic. Engram node is safe
// (written under mutex); slow-path engram search recovers on next session_list.
let existing_idx: String = state_get("session_index") let existing_idx: String = state_get("session_index")
let idx_entry: String = "{\"id\":\"" + id + "\",\"title\":\"" + json_safe(title) + "\",\"folder\":\"" + json_safe(folder) + "\",\"created_at\":" + int_to_str(ts) + ",\"updated_at\":" + int_to_str(ts) + ",\"last_message\":\"\"}" let idx_entry: String = "{\"id\":\"" + id + "\",\"title\":\"" + json_safe(title) + "\",\"folder\":\"" + json_safe(folder) + "\",\"created_at\":" + int_to_str(ts) + ",\"updated_at\":" + int_to_str(ts) + ",\"last_message\":\"\"}"
let new_idx: String = if str_eq(existing_idx, "") { let new_idx: String = if str_eq(existing_idx, "") {
@@ -442,8 +440,6 @@ fn session_hist_save(session_id: String, hist: String) -> Void {
} }
let oi = oi + 1 let oi = oi + 1
} }
// TODO(reliability #7): delete-then-insert is not atomic concurrent saves for the
// same session can produce orphan history nodes. State is primary truth; engram fallback.
let tags: String = "[\"session\",\"session-history\",\"Conversation\"]" let tags: String = "[\"session\",\"session-history\",\"Conversation\"]"
let discard: String = engram_node_full( let discard: String = engram_node_full(
hist, "Conversation", "session:messages:" + session_id, hist, "Conversation", "session:messages:" + session_id,
+2 -5
View File
@@ -296,11 +296,8 @@ fn layered_cycle(raw_input: String) -> String {
let cont_status: String = json_get(continuity, "status") let cont_status: String = json_get(continuity, "status")
let cont_action: String = json_get(continuity, "action") let cont_action: String = json_get(continuity, "action")
// Store continuity status so imprint can adjust its response register. // Store continuity status so imprint can adjust its response register
// TODO(reliability #4): session_continuity is process-global; scope per session_id state_set("session_continuity", cont_status)
// when available to prevent cross-session bleed under concurrent layered_cycle calls.
let cont_key: String = if str_eq(session_id, "") { "session_continuity" } else { "session_continuity:" + session_id }
state_set(cont_key, cont_status)
// Identity anomaly: add a gentle verification cue to the input before imprint // Identity anomaly: add a gentle verification cue to the input before imprint
let guided: String = if str_eq(cont_action, "identity_check") { let guided: String = if str_eq(cont_action, "identity_check") {