Knowledge nodes dominated the WM-autobiographical auto_term slot:
'Numeric tier strings...' (a Knowledge node) always scored highest
in WM and its first word 'Numeric' became the curiosity seed every
scan — activating more Numeric nodes, keeping that node in WM,
repeating indefinitely.
Fix: only derive auto_term from Memory, BacklogItem, or Entity nodes.
Knowledge nodes are reference material, not live context. Dynamic/
personal nodes carry the salience worth radiating from.
Also patches proactive_curiosity directly in dist/neuron.c (ELC
cannot compile soul.el within timeout — fallback build pattern).
The Dockerfile's --mount=type=secret path was corrupting the SA key JSON
due to control character handling differences. Pre-download soul + El SDK
in the CI workflow (using already-authenticated gcloud) and COPY them from
the build context. No credentials needed inside the Docker build.
Previous builds leave cached layers and images on the runner. Add a
docker system prune at start of deploy to avoid container-creation
failures from disk exhaustion.
--secret requires BuildKit; DOCKER_BUILDKIT=1 enables it on the legacy
Docker client. Also add GITHUB_SHA fallback and git rev-parse last-resort
so the image tag is never empty.
elb overwrites dist/soul.c with a fresh (non-inlined) compilation before
its link step fails, discarding the patched self-contained version.
Save the repo copy before elb and restore it after so the compiler always
gets the complete translation unit with all patches applied.
Modern gcloud CLI (>= 400) requires this env var so kubectl uses the
installed gke-gcloud-auth-plugin binary instead of the deprecated
application-default credentials path. Without it, kubectl commands
silently fail even after get-credentials succeeds.
Incorporates PRs #22/#23/#24:
- agentic_tools_all dedup fix (no duplicate web_search tool)
- workspace scope functions (agent_workspace_root, path_within_root, resolve_in_root)
- updated dispatch_tool with workspace confinement
- canonical-self bridge (ensure_self_canonical_bridge)
Also incorporates CI link fix from PR #26 (soul.c is self-contained, no
other dist/*.c needed). Fixes the CI build step which was compiling the
old June-16 soul.c that predated all these changes.
elb generates a dist/soul.c with all El modules inlined. Linking
dist/soul.c alone is sufficient and is exactly what the local mac
build does. Including other dist/*.c files causes two failures:
1. dist/chat.c has a capability-violation #error that fires when the
file is compiled as a utility module (outside the cgi entrypoint).
2. --allow-multiple-definition masked other issues silently.
Drop OTHER_C, drop --allow-multiple-definition, drop the now-unused
elp-c-decls.h generation step. The cc command now matches the proven
local build exactly.
The graph API resolves name=self/neuron to kn-efeb4a5b (neuron-api.el:471),
which carries only 8 incidental 'tagged' edges. The curated identity lives on
self node 015644f5 (1461 edges: identity, embodies, remembers, values). So
public self-traversal reaches tags, not the real self.
Add ensure_self_canonical_bridge(): an idempotent boot-time repair that links
kn-efeb4a5b <-> 015644f5 with a 'canonical-self' edge, only if missing. Runs in
the genesis safe-to-seed path regardless of the <100-edge gate, so the live
populated graph gets repaired and persisted. Connect-only-if-missing prevents
the duplicate-edge stacking that gates init_soul_edges().
Compile-checked with elc (darwin arm64); not link/run-gated locally. Needs a
soul build + smoke test before merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Confine the agentic file tools (read_file, write_file, list_files, grep)
to a configured workspace subtree via a lexical path check, and run
run_command with its cwd set to that root. Root comes from state key
"agent_workspace_root" or env NEURON_AGENT_ROOT. When no root is set,
behavior is unchanged (unscoped) for backward compatibility.
Defense-in-depth, NOT a hard boundary: the lexical guard does not resolve
symlinks and cannot stop an arbitrary shell command from cd-ing out of the
root. Real confinement needs runtime support (cwd-locked exec / sandbox-exec
/ chroot) in el_runtime.c.
Compile-checked with elc (darwin arm64); not link/run-gated locally
(darwin elb unavailable). Needs a soul build + smoke test before merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
agentic_tools_literal() already contains a custom web_search tool.
agentic_tools_with_web() adds the Anthropic server-side web_search_20250305
tool (also named web_search). Combining them caused Anthropic to reject
every agentic request with 'Tool names must be unique.'
agentic_tools_all() now calls agentic_tools_literal() directly. Connector
tools splice in as before. The web_search-only variant (agentic_tools_with_web)
is unchanged for callers that specifically want native search without connectors.
fix(soul): add HTTP-engram guard to safe_to_seed — when ENGRAM_URL is set
the HTTP Engram owns persistence; genesis must never save to local snapshot
regardless of node counts (was: guard_disk forced to empty string, making
the ratio check vacuously true and allowing init_soul_edges+engram_save).
fix(soul): use multiplication form for ratio guard — node_count * 16000 <
disk_len avoids floor-division truncation that underestimated boundary files
(250KB / 16000 = 15.6, floors to 15; a 15-node graph wrongly passed old guard).
fix(chat): add safety_augment_system to handle_chat_as_soul,
handle_dharma_room_turn, and handle_dharma_room_turn_agentic — all three
called the LLM without Hard Bell evaluation, leaving users in dharma rooms
without crisis resource routing.
fix(neuron-api): add api_persisted read-back to handle_api_define_process —
was the only write handler that returned ok:true without verifying the node
was actually written to engram.
fix(routes): unique temp file path in connectd_post — replaces fixed
/tmp/neuron-connectors-req.json with a timestamped path to prevent
collision if concurrency is added or two soul instances share a machine.
test: add tests/test_bell_safety.el — covers safety_detect_bell_level
(none/soft/hard), safety_classify_hard_bell (abuse/self_harm routing),
safety_normalize (smart-quote), safety_augment_system, and
handle_safety_contact_post (validation + read-back).
test: add tests/test_soul_guard.el — pure-function logic tests for the
safe_to_seed predicate: 200KB boundary, 47MB/63-node clobber scenario,
HTTP-engram mode, multiplication vs division truncation at 250KB.
test: add tests/test_api_define_process.el — verifies the define_process
write is read-back verified after the fix.
Genesis boot previously seeded a fresh identity and saved it over snapshot.json
whenever the in-memory graph looked empty. Replace the fixed node-count threshold
with a ratio guard: refuse to seed when the on-disk snapshot is large
(>200KB) but the loaded graph is sparse (< disk/16000 nodes).
KNOWN LIMITATION: this gates only the seed/pre-serve-save path. The deeper cause
is a non-atomic engram_save (fopen wb truncates to 0 before writing 47MB), which
creates a window where a concurrent load reads an empty file -> genesis -> and if
guard_disk is read in that same window the guard passes. The real fix is an
atomic engram_save (temp + fsync + rename) in el_runtime.c, tracked separately.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
BLOCKER 1: use untyped reassignment (let x = ...) for the fallback bindings
in agentic_resume instead of re-declaring typed let bindings (let x: Type = ...)
for the same variable in the same scope. The typed form risks shadowing semantics
that differ from the established pattern used everywhere else in the loop
(e.g. agentic_loop line 720).
BLOCKER 2: add empty-string guards in both bridge_save and agentic_resume.
bridge_save now returns false without writing state if messages or tools_json
is empty — preventing syntactically invalid JSON blobs. agentic_resume now
returns an error envelope after the fallback resolution if either field is
still empty, rather than passing empty strings into agentic_loop which would
silently start a fresh turn with no context.
Also add tests:
- test_bridge_serialization.el: covers bridge_save empty-guard, golden-path
raw-JSON round-trip, agentic_resume unknown/corrupt/missing-fields paths,
and legacy string-escaped fallback path
- test_sessions_routes.el: covers DELETE and PATCH /api/sessions/:id routes
(valid args, unknown id, empty body) and GET /api/sessions regression after
removal of the duplicate route_sessions() handler
When agentic_loop suspends for an MCP bridge tool it returns a
{"tool_pending":true,...} envelope with no "reply" key. Without an
explicit check, json_get(loop_result, "reply") returns "" and the
function emitted {"response":"","cgi_id":"..."} — a silent empty
response indistinguishable from a successful LLM turn with no content.
Two guards added after the existing error check:
1. tool_pending passthrough: if the loop suspended, return the pending
envelope directly so callers (dharma room orchestrators) can
distinguish suspension from failure and route to the approve flow.
2. Empty-reply guard: if final_text is empty after the pending check,
return an explicit {"error":"no response",...} envelope instead of
silently succeeding with an empty response field.
Also adds tests/test_agentic_tools.el:
- agentic_tools_all() includes all literal tool names and web_search
- connector_tools_json() returns valid JSON when bridge is down (graceful degradation)
- tool_pending envelope detection patterns (the is_pending logic)
- json_get(pending_envelope, "reply") returns "" confirming the empty-reply
guard is load-bearing (pure string/JSON, no LLM or network required)
session_delete cleared the per-session state (session_hist_ and
session_node_) but not the shared session_index cache. The next call
to session_list() hit the fast path (state_get("session_index")) and
returned the deleted session until the daemon restarted.
session_update_patch already called state_set("session_index","") to
force a re-fetch from Engram; session_delete now does the same.
Add tests/test_sessions.el covering:
- session_title_from_message (pure function, all edge cases)
- session_make_content (JSON structure and required session:meta marker)
- DELETE cache invalidation: session_index cleared, fast path disabled
- PATCH cache invalidation: stale title/folder not returned via fast path
- GET /api/sessions: session_list() fast path returns session_index
(confirms removal of the stale route_sessions() engram stub)
BLOCKER 1 (sessions.el, modern path): Add guard that rejects allow
action when tool_name is missing from the body. Previously, omitting
tool_name caused dispatch_tool("", ...) to return "unknown tool: " and
silently inject a corrupted tool_result into the conversation.
BLOCKER 2 (sessions.el, modern path): Stop re-executing client-side
tools server-side. When the client provides body["content"], use it
directly as the tool result (matching the handle_tool_result contract).
Only fall back to dispatch_tool for builtin tools when no content is
present. Non-builtin tools with no client content now return a clear
error instead of a broken dispatch attempt.
WARNING 1 (chat.el, agentic_loop): Wire always_allow_<session_id> state
into the bridge-suspension decision. When a tool is in the session's
always-allow list, treat it as locally dispatchable (like a builtin)
and skip the bridge pause, so the approval UI is never shown again for
that tool in that session.
WARNING 2 (sessions.el, legacy path): Read a "tools_variant" field from
the legacy pending blob when present, and call the corresponding
agentic_tools_*() variant on resume. Falls back to agentic_tools_literal()
for blobs written before this field existed.
tests/test_sessions_approve.el: Add 10-case test suite covering:
- empty session_id / missing call_id / missing action guards
- no pending tool returns correct error
- missing tool_name on allow returns error (BLOCKER 1)
- deny action does not require tool_name
- legacy call_id mismatch returns mismatch error
- always action records tool_name in always_allow state
- allow with client content skips re-execution (BLOCKER 2)
handle_chat_agentic was calling agentic_tools_with_web(), which omits
MCP connector tools, so mcp__* calls were never available in agentic
mode even when neuron-connectd is running.
Switch both agentic entry points to agentic_tools_all(). For
handle_dharma_room_turn_agentic, also replace the inline 8-iteration
loop with a call to agentic_loop() so bridge suspension and the full
connector tool set work consistently. Session IDs are prefixed with
'dharma:' + room_id so suspensions stay room-scoped.
bridge_save was wrapping messages and tools_json with json_safe() before
storing them as string fields. Since both are already well-formed JSON arrays
containing double quotes, json_safe added a second escape layer. agentic_resume
then called json_get() which stripped only one layer, leaving the messages array
corrupted before it was passed back into agentic_loop.
Fix: store messages as messages_raw and tools_json as tools_raw as inline raw
JSON values (unquoted), and read them back with json_get_raw. Backward
compatibility: fall back to the old string-escaped fields if the raw fields are
absent, so sessions saved before this fix can still be resumed.
Also fixes write_file returning a pre-escaped literal instead of calling
json_safe consistently with every other tool result.
The first registration called route_sessions() which searched for a
'session-start' label that no longer exists, returning an empty array
on every list request and making the sidebar appear empty after restart.
The second registration (dead code) called the correct session_list().
Removes route_sessions() entirely and the stale first route block.
Also wires up session_delete() and session_update_patch() — both existed
in sessions.el but had no HTTP routes — via new DELETE and PATCH blocks.
The approve endpoint was permanently broken for all sessions going through
the modern agentic_loop path. agentic_loop suspends via bridge_save() into
mcp_bridge:<session_id>, but handle_session_approve was reading from
pending_tool_<session_id> — a different key — so it always returned
"no pending tool for session".
Replace the body of handle_session_approve with a two-path design:
Modern path: check mcp_bridge:<session_id> first. If the blob is there,
dispatch_tool() on allow (or build the denial string), then delegate to
agentic_resume() which re-enters agentic_loop from the exact suspension
point. This is the path all live sessions take.
Legacy path: if only pending_tool_<session_id> exists (in-flight session
from before this deploy), synthesise a bridge blob from the stored
messages_so_far and route through agentic_resume() as well. The stale
inline agentic loop (90 lines, agentic_tools_literal only, no MCP
connector support, no bridge suspension) is removed entirely.
routes.el already calls handle_session_approve correctly — no change needed.
- sessions.el: new sessions module with session management and approval gate
- routes.el: wire /api/sessions routes (list, get, create, approve, tool_result)
- chat.el: thread-aware activation — short messages anchor to last reply
before engram compilation so follow-ups stay on-topic
- chat.el: agentic path tracks per-session history (session_hist_{id})
instead of shared conv_history, seeding each turn with prior context
- chat.el: add call_neuron_mcp, dispatch_tool, is_builtin_tool, next_bridge_id
agentic_loop, bridge_save, agentic_resume, handle_tool_result
- dist/soul: rebuild with all of the above
Short/ambiguous messages (< 50 chars) now use the last reply as the
engram activation seed instead of the bare message. Prevents strong
off-topic memory nodes from hijacking replies when the user is clearly
continuing an existing thread.
Also gives handle_chat_agentic session continuity: reads/writes history
keyed by session_id (falling back to global conv_history), seeds the
LLM messages array with prior turns, and saves replies back so the
next turn has context.
Applies connector-specific additions from feat/connectors-soul:
- chat.el: connector_tools_json(), agentic_tools_all(), call_mcp_bridge(),
tool_auto_approved() and mcp__ dispatch in dispatch_tool()
- routes.el: connectd_get/post, handle_connectors(), /api/connectors routing
in GET and POST sections
- MEMORY_RECALL_BUG.md: investigation notes on memory retrieval failure
The agentic loop rewrite in the source branch was not applied — it conflicts
with the tool-bridge pattern from PR #5 which is the chosen design for
client-side MCP tool execution. The connectors themselves are now fully
wired: connector tools surface as mcp__<server>__<tool> in the tools array
and dispatch to neuron-connectd via call_mcp_bridge().
Resolves conflicts by keeping main's full safety/stewardship/imprint implementations.
PR #9 uniquely contributes: layered_cycle() in soul.el, route wiring in routes.el,
soul.elh export, and the layer composition test suite.
dist/*.c and *.elh are elc transpiler output. CI's header-gen step still
greps dist/*.c, so they stay tracked, but a single soul change regenerates
~57k lines of dist/neuron.c + dist/soul.c that bury the real source diff and
poison both human and agent PR review. Mark them -diff + linguist-generated so
PRs show only the real changes. Build pipeline unchanged.
Linux elb generates individual .c files; soul.c does not contain merged
imports (unlike macOS elb which produces a unified file). Re-link all
dist/*.c manually with soul.c listed first so its real main() wins, and
--allow-multiple-definition to silence GNU ld's duplicate symbol errors.
All duplicates are identical (same El source, different compile units).