1. parse_salience_100: handle 3+ decimal digit salience strings correctly.
The two-branch 'else { stripped }' case treated any N-digit decimal value
as hundredths, so "0.125" (stripped=125) clamped to 100 instead of 12.
Now divides by 10^(N-2) for N>2, mapping "0.125"->12, "0.375"->37, etc.
2. mem_consolidate Canonical scan: replaced single engram_scan_nodes_json(50,0)
call with a paginated loop (page_size=50, advancing offset) so Canonical nodes
beyond index 50 are no longer silently excluded from the periodic boost.
3. mem_consolidate Canonical strengthening: add salience ceiling guard so nodes
already at the runtime maximum (serialised as "1" by %g) are skipped. Prevents
monotonic unbounded salience growth across successive consolidation passes.
4. soul.el affective cutoff: replaced json_get(aff_node, "ts") with
json_get(aff_node, "created_at") / "updated_at" fallback, consistent with
handle_chat. The old "ts" field is not a standard engram node field; missing
it caused the fallback to ts_now (always passes cutoff), over-including stale
nodes. New behaviour defaults to 0 on missing timestamps (conservative exclude).
5. History byte-cap: implemented the existing TODO 32KB byte-cap. Added
hist_trim_to_byte_cap() and applied it after count-based trim in both
handle_chat and handle_chat_agentic. Prevents 100KB+ state entries at 40 turns
during long technical sessions with large assistant responses.
Fix critical float parsing bug in engram_score_node: str_replace('.','')
then str_to_int silently miscored single-decimal salience strings (0.9->9,
0.7->7, 1.0->1). Introduce parse_salience_100() which detects decimal
position and scales correctly (no decimal: *100; one decimal: *10;
two decimals: as-is).
Replace flat 30-day linear decay with tier-aware decay curves: Canonical
nodes use a 365-day window (foundational identity resists aging), Episodic
nodes use 90 days, Working/untiered keep the existing 30-day slope. Floor
stays at 10 for all tiers.
Use max(created_at, updated_at) as the recency reference so revised nodes
are not penalised for their original creation date.
Extend affective context windows from 72h/7d to 14 days across all three
paths (engram_compile, handle_chat, soul.el load_identity_context) so a
Friday crisis carries into Monday sessions and all paths present consistent
context. The 72h/7d split caused conflicting affective context between
soul.el (which loaded a 5-day-old crisis node) and chat.el (which excluded
it on subsequent turns).
Add salience evolution to mem_consolidate: strengthen top working-memory
nodes (recently recalled across sessions) and Canonical-tier nodes
(foundational identity must not decay to the floor). Previously consolidate
returned structural counts only with no salience changes.
Expand conversation window from 20 to 40 turns in both handle_chat and the
agentic history trim. Long technical sessions were losing early problem
framing at 10 user + 10 assistant pairs.
Merge propose/agent-workspace-root-read (Tim's PR #28):
- path_within_root now appends '/' to root before prefix check (closes proj_evil bypass)
- edit_file in dispatch_tool now checks agent_workspace_root() and resolves path
- handle_chat_agentic reads agent_workspace_root from request body, only persists if non-empty
- Safety screen preserved after workspace root read (conflict resolved)
Merge improve/safety-crisis-detection (PR #31): reads layered_cycle_safety_system_addendum
from state and appends to system prompt on each turn (cleared after use to prevent bleed).
Safety ts extraction falls back to updated_at. Affective prefix now wires into system build.
Conflict with PR #33 resolved: capability_rules and session_preload both preserved.
- engram_compile: BellEvent nodes do not carry created_at in the engram
node JSON; extract the unix timestamp from the embedded ' | ts:NNNNN'
pattern in the content string instead. Fall back to created_at/updated_at
if the marker is absent. Guard str_to_int against empty string so the 72h
recency check never silently treats every node as epoch-0 stale.
- auto_persist: append the current unix timestamp to the BellEvent label
('bell:soft:1749876543') to make it unique per turn. The previous label
('bell:soft') was the same for every soft bell, causing engram to treat
all subsequent writes as updates to the same node.
- Remove dead soft_bell block in layered_cycle that wrote soul_safety_system_augment
to state but was never read; safety augmentation now goes through the correct
layered_cycle_safety_system_addendum state key read by build_system_prompt
- build_system_prompt now reads layered_cycle_safety_system_addendum and appends
it to the system prompt, clearing the key after consumption
- Timestamp extraction for distress nodes falls back to updated_at when created_at
is empty, preventing the 72h recency check from always treating nodes as stale
- Issue #3: err_404/err_405 now emit HTTP 404/405 via __status__ envelope instead of HTTP 200
- Issue #4: add auth_check() function to handle_request; enforces NEURON_TOKEN on all routes except /health and /lineage
- Issue #5: missing required params now return HTTP 400 (__status__ envelope) in /api/chat (GET+POST), /imprint/contextual, /imprint/user, and handle_chat
- Issue #6: LLM unavailable in handle_chat now returns HTTP 503 instead of HTTP 200
- Issue #7: add 32 KB message size guard on POST /api/chat before engram_compile and LLM
- Issue #8: add TODO comment to route_health documenting the live-engram-query problem and the /health/deep split plan
- Issue #9: add comment to hist_trim documenting fragile str_index_of parser and silent data corruption risk
- Issue #10: add TODO comment in handle_request documenting missing per-IP rate limiting
- Issue #11: fix connectd_post temp file collision — add monotonic sequence counter so concurrent requests get unique paths
- Issue #12: fix call_mcp_bridge fixed temp file race — add monotonic sequence counter for unique paths under concurrent load
- Issues #1/#2: add TODO comment in handle_request documenting EL no-exception limitation and SIGSEGV handler gap
Issue #5: detect empty string from llm_extract_text() as an error in handle_chat,
handle_chat_as_soul, and handle_dharma_room_turn. The C runtime silently returns ""
when the LLM response content array is missing or all blocks fail to parse; without
this guard the empty string passes through to callers as a silent empty reply.
Issue #9: make agentic_loop max_tokens configurable via NEURON_LLM_MAX_TOKENS env
var (default 4096). The hardcoded value is marginal for long tool chains (8 iterations
x 4096 tokens); operators can now set 8192+ for complex multi-step tasks without
rebuilding. Non-agentic path (llm_call_system) still uses the C runtime hardcode —
that fix lives in el_runtime.c (see TODO block added in this commit).
Issue #10: increase connector_tools_json and tool_auto_approved curl --max-time from
2s to 5s to reduce false-empty tool lists when neuron-connectd is under transient
load. Graceful degradation to [] on bridge down is unchanged.
Issues #1/#2/#3/#4/#6/#8: documented as TODO comments in chat.el. These require
targeted C runtime changes in el_runtime.c (llm_provider_request retry loop,
EL_LLM_TIMEOUT_MS separation, HTTP 429 backoff, 5xx retry, EL_HTTP_MAX_RESPONSE_BYTES
cap). Architectural decisions recorded so they are traceable to root causes.
- sessions.el: add session_exists() for chat-path session validation (ISSUE #6/#7)
- sessions.el: add session_create_cleanup() for ghost-session rollback (ISSUE #1)
- sessions.el: set session_pending_first_msg flag in session_create; clear it in
session_hist_save so the first successful chat marks the session active (ISSUE #1)
- sessions.el: session_delete now clears mcp_bridge:<id> and always_allow_<id>
state keys so abandoned pending-tool sessions do not accumulate (ISSUE #5)
- sessions.el: add TODO comments for ISSUE #2 (no TTL/expiry), ISSUE #3
(non-atomic delete-then-create), ISSUE #4 (no concurrent-create guard),
and ISSUE #8 (reconnect/duplicate resume race) where fixes are too invasive
to land without new runtime primitives
- chat.el: validate session_id exists via session_exists() before entering
agentic_loop; unknown session_ids now return a 404-style error instead of
silently starting a fresh empty session (ISSUE #6/#7)
Every engram_node_full call that dropped its return value now binds it
and emits a println on empty string. engram_save calls in consolidate,
heartbeat, and dharma-room-turn are checked for failure. The two API
handlers (log_state_event, tune_config) that skipped api_persisted()
now match the read-back-after-write contract used everywhere else in
neuron-api.el.
Files changed:
- chat.el: conv_history_persist, handle_dharma_room_turn, auto_persist
- soul.el: emit_session_start_event, seed_persona_from_env HTTP check
- memory.el: mem_save, mem_boot_count_inc
- neuron-api.el: handle_api_log_state_event, handle_api_tune_config,
handle_api_consolidate (engram_save + session summary write)
- awareness.el: ise_post local-engram fallback path
TODO comments added for non-atomic patterns (issues #12, #13) and
the missing circuit breaker (#14) — these require new primitives.
- Fix state key mismatch: soul.el layered_cycle now reads conv_history
(not conversation_history), unblocking the safety_score_distress_history
history-amplification path in safety_threat_score
- Add safety_augment_system call on the main handle_chat path so the
phrase-list bell detector fires on all chat turns, not just dharma rooms
- Add cross-session affective engram query in load_identity_context() at
boot; stores distress/crisis signals from prior sessions under
soul_affective_context with a 7-day soft recency filter
Issues addressed:
- #1 ASYMMETRIC PERSIST/LOAD: conv_history_load() now tries engram_get_node_by_label()
first (symmetric with the label-based write), falling back to vector search only when
label lookup returns nothing. Immune to cold/corrupt vector index.
- #2 SILENT LOAD FAILURE: all failure paths in conv_history_load() and conv_history_persist()
now emit a println log line rather than silently returning "" or dropping writes.
- #3 NO RECOVERY PATH: documented as TODO with explanation of why a full recovery path
(retry, ID fallback, orphan cleanup) is too invasive for a targeted fix here.
- #4 OVERWRITE WITHOUT DELETE: documented with TODO to replace engram_node_full with
explicit delete-then-create once engram exposes a label-scoped delete API.
- #5/#10 BROKEN TRIM / OFF-BY-ONE: hist_trim() rewritten to use json_array_len /
json_array_get (structural JSON ops) instead of raw str_index_of scanning for
'{"role":' markers. Immune to marker strings appearing inside message content.
Minimum retained count guard added: never trims below 2 entries.
- #6 PARTIAL-WRITE GUARD: conv_history_persist() refuses to write a blob that doesn't
contain both '[' and ']'. conv_history_load() requires both before accepting content.
- #7 DUAL STORAGE: documented with a comment at the persist call site.
- #8 NO MAX SIZE GUARD: documented as TODO with rationale for why a byte-length cap
requires a more invasive change (entry truncation or summarisation).
- #9 AGENTIC HISTORY NOT PERSISTED: handle_chat_agentic() now calls conv_history_persist()
for the default global session (hist_key == "conv_history") after updating state,
matching the non-agentic path's durability. Named sessions remain in-process only.
Add strip -s after gcc compilation to remove symbol table and relocation info.
Reduces binary size and prevents symbol-level reverse engineering of EL runtime internals.
- auto_persist: detect bell level (soft/hard) on every user message using
safety_detect_bell_level; write a dedicated BellEvent engram node with
calibrated salience alongside the Conversation node when a bell fires.
Tag the Conversation node with bell:soft/bell:hard and 'affective' for
direct discovery without scanning all chat nodes.
- auto_persist: track per-session bell count, dominant level, and last
signal in state (session_bell_count/level/signal keys) so downstream
functions can act on the emotional history without re-scanning engram.
- engram_compile: include the top-1 most recent BellEvent node within 72h
in every context build. Distress context from earlier turns (same or
recent session) automatically travels into all subsequent LLM calls.
- hist_trim_with_bell_guard: replace hist_trim at the handle_chat call site.
Before evicting the oldest turn from the 20-turn window, inspect the user
message for bell signals. If a bell was present, write a preservation
BellEvent to engram before dropping the turn so the full message survives
the rolling window.
- session_hist_save: after writing the history node, check session bell
counters. On the first save where bell_count > 0, write a
session:emotional-summary BellEvent node with distress signal, count,
and dominant level. A state flag prevents duplicate writes on subsequent
saves in the same session.
- engram_compile: rank search results by recency x relevance before including
in context. Pulls 20 candidates, scores each (salience * importance * recency
decay), keeps top 8. Eliminates stale/low-signal nodes that diluted context.
- handle_chat: on hist_len==0 (session start), proactively load user profile
and active-work context from engram and inject as brief bullets in the system
prompt. Gives the soul grounding before any conversation history exists.
- build_system_prompt: add [CAPABILITY GAPS] directive instructing the soul to
offer partial help and reasoning instead of flat "I don't have access to that"
refusals when a tool is missing.
- handle_chat_agentic: run safety_screen at entry, mirroring layered_cycle.
Hard bell exits immediately with the crisis response without entering the loop.
- agentic_loop: surface the 8-iteration cap explicitly in the error envelope
("agentic loop hit the 8-iteration cap...") rather than the opaque "no response".
Add iterations count to both the error and success envelopes for observability.
- Add per-IP in-memory rate limiter (60 req/min default, configurable via
soul_rate_limit state key; /health exempt; loopback callers skipped)
- Extend /health with uptime_secs (from soul_boot_ts) and live LLM probe
- Add missing_param 400 guard on POST /api/chat before passing to LLM
- Standardise error envelopes: add "code" field to err_404/err_405 and all
missing-param returns; route_synthesize now errors clearly instead of
returning the misleading {"mechanism":"did not engage"} on bad input
- Document streaming gap in /api/chat (SSE not implemented, note added)
- handle_request gains ip param; rate_limit_check wired at entry point
- soul.el: fix state key bug in layered_cycle (conversation_history -> conv_history)
- safety.el: add indirect crisis location patterns to soft_bell phrase list
- soul.el: wire safety_augment_system into layered_cycle for soft_bell turns
- chat.el: load cross-session affective context at session start when distress signals found within 72h
REPRODUCED: in the non-agentic path (Tools off / 'Just chat'), asking for
tool-work makes the model role-play tool use — it emits a fake ```json {...}```
'tool call' and says 'let me search/query/pull your sessions' while NOTHING
runs. Reads as a broken/lying app. (The agentic path is fine: verified it
calls search_memory and reports honestly.)
Root cause: build_system_prompt (handle_chat, the tool-less path) never told
the model it has no tools this turn, so it fabricated.
Fix: add a NO-TOOLS directive to the non-agentic system prompt — never emit
tool calls / JSON tool blocks / 'let me pull...' narration; answer from context
only; if a tool is truly needed, say so in one sentence and tell the user to
turn Tools on. Applied to chat.el (source) AND dist/soul.c (the curated TU the
CI compiles), so the CI-built binary carries it.
Verified the FABRICATION repro on the live local soul; could not verify the
patched binary locally (no matching el-runtime version on this machine — a
hand-link against origin/main runtime 404s on all routes). Builds correctly via
CI, which links soul.c against the pinned runtime.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the UI<->soul contract for #23 (scope file/command tools to an agent
workspace root). #23 made the tools read state_get("agent_workspace_root"), but
nothing set that key from the desktop UI, so the agent panel's Workspace Folder
was cosmetic and tools ran unscoped (default-allow). This reads the root the UI
now sends on each agentic request and state_sets it before tool dispatch, so
agent_workspace_root() picks it up for the turn.
Minimal + pattern-matching (same json_get/state_set shape used throughout chat.el).
Empty body field => unscoped (backward-compatible) and preserves the env fallback.
FOR WILL'S REVIEW — do not merge without sign-off:
- Ownership model: set state from body each turn (so clearing the folder un-scopes)
vs. only-when-nonempty. Flagged inline.
- Pairs with neuron-ui PR #32 (ChatRequest.agentWorkspaceRoot).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Knowledge nodes dominated the WM-autobiographical auto_term slot:
'Numeric tier strings...' (a Knowledge node) always scored highest
in WM and its first word 'Numeric' became the curiosity seed every
scan — activating more Numeric nodes, keeping that node in WM,
repeating indefinitely.
Fix: only derive auto_term from Memory, BacklogItem, or Entity nodes.
Knowledge nodes are reference material, not live context. Dynamic/
personal nodes carry the salience worth radiating from.
Also patches proactive_curiosity directly in dist/neuron.c (ELC
cannot compile soul.el within timeout — fallback build pattern).
The Dockerfile's --mount=type=secret path was corrupting the SA key JSON
due to control character handling differences. Pre-download soul + El SDK
in the CI workflow (using already-authenticated gcloud) and COPY them from
the build context. No credentials needed inside the Docker build.
Previous builds leave cached layers and images on the runner. Add a
docker system prune at start of deploy to avoid container-creation
failures from disk exhaustion.
--secret requires BuildKit; DOCKER_BUILDKIT=1 enables it on the legacy
Docker client. Also add GITHUB_SHA fallback and git rev-parse last-resort
so the image tag is never empty.
elb overwrites dist/soul.c with a fresh (non-inlined) compilation before
its link step fails, discarding the patched self-contained version.
Save the repo copy before elb and restore it after so the compiler always
gets the complete translation unit with all patches applied.