Three improvements from today's self-review:
1. ENGRAM_BREAKTHROUGH_WEIGHT 0.25→0.10
Live data showed 524/525 WM nodes at breakthrough floor (0.25). Knowledge
nodes promoted at 0.21 decayed to 0.147 in one call, fell below the old
0.25 floor, and were immediately evicted for fresh breakthrough candidates.
Natural promotion was invisible. Invariant maintained: 0.10 < all
per-type thresholds (min=0.15 Canonical).
2. ENGRAM_WM_CAP=24 with Pass 4 (per-call) + Pass 5 (global) enforcement
Without a cap, broad queries like 'knowledge' promote 525+ nodes
simultaneously. WM is now bounded to 24 nodes. Algorithm: qsort on
promoted weights, keep top-24 by cutoff, evict the rest. Global pass
enforces cap across nodes that were promoted in prior calls and persist
via working_memory_weight. Validated: WM promoted goes 525→24.
Cognitive basis: Cowan (2001) WM ~4 chunks; 24 gives richer multi-topic
context while preventing flooding.
3. ISE exclusion from WM + /api/neuron/state-events route
InternalStateEvent nodes were reaching WM via breakthrough (5 suppression
cycles) because their content (curiosity seed JSON with 'knowledge',
'memory', etc.) triggered lexical seeding. ISEs are observability-only
and must never surface in context. Fix: a guard in Pass 2 clears
suppression_count, sets wm_weights[i]=0.0, and skips the node.
Also added POST /api/neuron/state-events route to server.el (auth-exempt,
internal endpoint). The main soul daemon posts ISEs here but the route
was missing — all ise_post() calls were silently returning 'not found'.
Research: SYNAPSE (arXiv 2601.02744) validates a spreading factor of 0.8
(ours is 0.7), the top-M WM cap design, and cosine similarity seeding. Next
priority: implement cosine similarity initial seeding from the other branch.
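
The cap enforcement from item 2 can be sketched roughly as follows. This is a minimal illustration, not the shipped Pass 4/Pass 5 code: the function name, the `float` weight type, and the tie handling at the cutoff are assumptions; only the qsort, top-24 cutoff, and evict-the-rest steps come from the description above.

```c
#include <stdlib.h>

#define SKETCH_WM_CAP 24  /* mirrors ENGRAM_WM_CAP=24 */

/* Sort helper: descending order of weight. */
static int sketch_cmp_desc(const void *a, const void *b) {
    float x = *(const float *)a, y = *(const float *)b;
    return (x < y) - (x > y);
}

/* Hypothetical helper: keep at most `cap` promoted nodes (weight > 0),
 * evicting the rest by zeroing their weight.
 * Returns the number of nodes left promoted, or -1 on bad input / OOM. */
int sketch_enforce_wm_cap(float *wm_weights, int n, int cap) {
    if (n <= 0 || cap <= 0) return -1;
    float *sorted = malloc((size_t)n * sizeof *sorted);
    if (sorted == NULL) return -1;

    int promoted = 0;
    for (int i = 0; i < n; i++)
        if (wm_weights[i] > 0.0f) sorted[promoted++] = wm_weights[i];

    if (promoted > cap) {
        qsort(sorted, (size_t)promoted, sizeof *sorted, sketch_cmp_desc);
        float cutoff = sorted[cap - 1];  /* weight of the last node kept */

        /* Nodes strictly above the cutoff always stay; nodes tied at the
         * cutoff stay only until the cap is filled (assumed tie policy). */
        int above = 0;
        while (above < cap && sorted[above] > cutoff) above++;
        int ties_left = cap - above;

        for (int i = 0; i < n; i++) {
            if (wm_weights[i] <= 0.0f || wm_weights[i] > cutoff) continue;
            if (wm_weights[i] == cutoff && ties_left > 0) { ties_left--; continue; }
            wm_weights[i] = 0.0f;  /* evict */
        }
        promoted = cap;
    }
    free(sorted);
    return promoted;
}
```

Run once over the nodes promoted by the current call and once over every node with a persisted weight, this would correspond to the per-call and global passes.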
Implements the accumulation layer from the Layered Consciousness architecture
(provisional 64/064,262) and answers the deferred design question. Per the spec
and Will's design: new user-facing nodes (memories, knowledge, conversations) are
created in an accumulation layer at the TOP of the consciousness stack — the engram
the user sees — while the layers below (safety, core-identity, domain, imprint,
suit) shape behavior but are hidden from the user.
- Adds ENGRAM_LAYER_ACCUMULATION (5) + the layer record in engram_init_layers
  (activation_priority 50, suppressible, not injectable, transparent=0).
- engram_node and engram_node_full now assign new nodes to ENGRAM_LAYER_ACCUMULATION.
- ENGRAM_LAYER_DEFAULT stays CORE_IDENTITY ON PURPOSE: it is the fallback for LEGACY
  nodes loaded from snapshots without a layer_id, so existing data (the originator
  corpus) is NEVER migrated. New-nodes-only (the immutable-originator rule).
This is the foundation for fixing the identity-bleed / customer-isolation issue
(user data was landing in Neuron's core-identity layer). The retrieval-side
provenance filter (introspection should compile from accumulation, not the
originator corpus — Persona 64/036,574) is a follow-on, pending the batch-2
Layered Consciousness + Engram spec docs for exact semantics. Compiles clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ports fixes FORWARD onto main that until now lived only in the un-versioned el-sdk
source the live macOS soul was hand-built from (captured in the [DO NOT MERGE]
live-darwin-runtime snapshot). The port is faithful and minimal: it does not drag in the
snapshot's deletions of main's newer engram_wm_/engram_load_merge/http_serve_async.
1. Use-after-free (UAF, the hallucinated/lost-saves root cause): engram_new_id + engram_node_full now use
el_strdup_persist, NOT el_strdup. el_strdup tracks into the per-request arena that
el_request_end() frees when the creating HTTP request completes — leaving stored
nodes with dangling pointers (corrupted ids, 'saved but never listed'). Transplanted
verbatim from the live runtime; el_strdup_persist sites 19->27, matching live.
2. Atomic engram_save: write <path>.tmp, fflush+fsync, rename() over target (atomic on
POSIX) so a booting soul's engram_load never reads a truncated/0-byte snapshot — the
genesis -> nodes=1 -> 63-node-clobber loop. Plus a sparse-write floor: refuse to
overwrite a >200KB snapshot with one < 1/16 its size. (Validated in isolation:
harness 11/11; rebuilt+booted the darwin soul, round-tripped 5113 nodes, no clobber.)
The response-truncation fix is already on main (_tl_fs_read_len binary-safe length).
Compiles clean. For Will to build through CI/elb and deploy.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
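
The atomic save in item 2 above follows the usual write-temp-then-rename pattern. A minimal sketch, assuming hypothetical names (`sketch_atomic_save`, `sketch_sparse_write_ok`) and reading ">200KB" as 200 * 1024 bytes; it is not the actual engram_save code:

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>

/* Sparse-write floor: refuse to replace a snapshot larger than 200 KB
 * with one smaller than 1/16 of its size. Returns 1 if the write is OK. */
int sketch_sparse_write_ok(long old_size, long new_size) {
    if (old_size > 200L * 1024L && new_size < old_size / 16) return 0;
    return 1;
}

/* Write `len` bytes to <path>.tmp, flush and fsync, then rename() over
 * `path`. Returns 0 on success, -1 on any failure or refused write. */
int sketch_atomic_save(const char *path, const void *data, size_t len) {
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;

    struct stat st;
    if (stat(path, &st) == 0 &&
        !sketch_sparse_write_ok((long)st.st_size, (long)len))
        return -1;

    FILE *f = fopen(tmp, "wb");
    if (f == NULL) return -1;

    int ok = fwrite(data, 1, len, f) == len   /* full payload written   */
          && fflush(f) == 0                   /* stdio buffer -> kernel */
          && fsync(fileno(f)) == 0;           /* kernel -> disk         */
    if (fclose(f) != 0) ok = 0;

    /* rename() replaces the target atomically on POSIX, so a reader sees
     * either the old snapshot or the complete new one. */
    if (!ok || rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

A stricter variant would also fsync the containing directory after the rename; the commit does not mention that step, so it is left out here.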
http_handler_fn / http_handler4_fn were defined only inside el_runtime.c, so soul
modules (routes/chat/...) that reference them via cross-module forward declarations
couldn't see the types — which broke the Windows link of every module. Moving the
public function-pointer types to the shared header is the correct home and unblocks
the build on all platforms (identical typedef, C11-safe redefinition in el_runtime.c).
With this, the soul links into a native Windows neuron.exe (mingw, static) that boots
and serves HTTP on :7770 — verified /health → 200 {"status":"alive",...} in a Win11 VM.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
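
Why the repeated typedef is harmless: C11 allows a typedef to be redeclared in the same scope as long as it names the same type. A small illustration with a hypothetical handler signature (the real http_handler_fn signature is not shown in this log):

```c
#include <stddef.h>

/* Shared header: the public definition every module can see. */
typedef int (*sketch_handler_fn)(const char *path);

/* Runtime .c file: the identical typedef repeated. Legal since C11
 * because both declarations name the same type. */
typedef int (*sketch_handler_fn)(const char *path);

/* A module that only includes the header can now declare handlers
 * and pass them around by type. */
int sketch_health(const char *path) {
    return path != NULL ? 200 : 500;
}

int sketch_dispatch(sketch_handler_fn handler, const char *path) {
    return handler(path);
}
```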
handle_api_consolidate writes a "SessionSummary" node, but engram_valid_node_type
omitted it — so once this validation ships, every consolidate() would be silently
REJECTED at the engram boundary. Add SessionSummary to the allowlist.
Found in Will's PR review of neuron #1 / el #52.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
llm_call_system / llm_call accepted a model argument and discarded it:
they called llm_chain_call(system, user) with no model, and the legacy
ANTHROPIC_API_KEY fallback passed NULL to llm_provider_request, so every
non-agentic chat was pinned to LLM_DEFAULT_MODEL (claude-sonnet-4-5)
regardless of the caller's selection.
Thread model_pref through llm_chain_call: provider-chain entries still
honor their own NEURON_LLM_N_MODEL override and fall back to the
requested model otherwise; the legacy Anthropic path now uses the
requested model. NULL/empty preserves prior default behavior.
Effect: the soul's model selection (state soul_model / SOUL_LLM_MODEL,
e.g. claude-opus-4-8) now reaches api.anthropic.com. Previously the
chat response echoed the selected model in its label while the request
billed Sonnet 4.5.
Not built locally (no elc/cc toolchain on this checkout); needs stage CI.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
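
The precedence described above (per-entry override, then the requested model, then the default) could be expressed along these lines. `sketch_resolve_model` is a hypothetical helper for illustration; the real llm_chain_call plumbing is not reproduced here.

```c
#include <stddef.h>

#define SKETCH_DEFAULT_MODEL "claude-sonnet-4-5"  /* LLM_DEFAULT_MODEL per the text above */

/* A NULL or empty string counts as "not set". */
static int sketch_is_set(const char *s) {
    return s != NULL && s[0] != '\0';
}

/* entry_override: the provider-chain entry's own NEURON_LLM_N_MODEL value, if any.
 * model_pref:     the model requested by the caller, if any. */
const char *sketch_resolve_model(const char *entry_override, const char *model_pref) {
    if (sketch_is_set(entry_override)) return entry_override;  /* entry wins */
    if (sketch_is_set(model_pref))     return model_pref;      /* caller's choice */
    return SKETCH_DEFAULT_MODEL;                               /* prior default behavior */
}
```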
Validate UTF-8 continuation bytes in jb_emit_escaped; pass valid
sequences through and escape orphaned/invalid start bytes as \u00xx.
Pre-existing change found uncommitted in the working tree; committed
here so it is reviewable rather than lost.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
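
For reviewers, the behavior can be pictured with the sketch below. It is an assumption-laden illustration, not the jb_emit_escaped source: the names, the buffer-based signature, and the handling of quotes and control characters are made up here, and the check covers continuation bytes only (it does not reject overlong forms or surrogates).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Length (2..4) of a multi-byte UTF-8 sequence at s whose continuation
 * bytes are all of the form 10xxxxxx, or 0 if the start byte is invalid,
 * the input is truncated, or a continuation byte is wrong. */
static size_t sketch_utf8_seq_len(const unsigned char *s, size_t remaining) {
    size_t need;
    if (s[0] >= 0xC2 && s[0] <= 0xDF) need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF) need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) need = 4;
    else return 0;  /* orphaned continuation byte or invalid start byte */
    if (remaining < need) return 0;
    for (size_t i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    return need;
}

/* Copy `in` to `out`, passing valid UTF-8 through and escaping everything
 * else that is unsafe in a JSON string as \u00xx. Returns bytes written
 * (excluding the terminating NUL); stops early if `out` is full. */
size_t sketch_emit_escaped(const char *in, char *out, size_t cap) {
    const unsigned char *s = (const unsigned char *)in;
    size_t n = strlen(in), o = 0;
    if (cap == 0) return 0;
    for (size_t i = 0; i < n; ) {
        size_t len = (s[i] < 0x80) ? 1 : sketch_utf8_seq_len(s + i, n - i);
        if (o + 7 > cap) break;  /* longest emission is 6 bytes, plus NUL */
        if (len >= 2) {
            memcpy(out + o, s + i, len);  /* valid sequence: pass through */
            o += len;
            i += len;
        } else if (len == 1 && s[i] >= 0x20 && s[i] != '"' && s[i] != '\\') {
            out[o++] = (char)s[i++];      /* plain ASCII */
        } else {
            o += (size_t)snprintf(out + o, cap - o, "\\u%04x", (unsigned)s[i]);
            i++;                          /* invalid or unsafe byte */
        }
    }
    out[o] = '\0';
    return o;
}
```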
The wrapper signature was stale and didn't match the C primitive
__engram_node_full(content, node_type, label, salience, importance, confidence, tier, tags).
Because el_val_t is an untyped machine word, the compiler coerced caller args to the
wrong declared param types and forwarded them BY POSITION — so tier received an int,
importance/confidence received strings, label received a float, etc. (~100 corrupt nodes).
- Correct the wrapper to match the C contract 1:1 (no coercion, no reorder).
- Add engram_valid_node_type / engram_valid_tier allowlists; engram_node and
engram_node_full now reject invalid values with __println + return "" (fail loud,
no silent malformed write).
See neuron repo: HANDOFF-engram-write-corruption.md for the full write-up + deploy runbook.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
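
A rough picture of the fail-loud allowlist check, for readers without the neuron repo at hand. The type and tier lists below are illustrative placeholders (only SessionSummary and InternalStateEvent are named elsewhere in this log), and the function names are hypothetical stand-ins for engram_valid_node_type / engram_valid_tier.

```c
#include <stdio.h>
#include <string.h>

/* Placeholder allowlists: NOT the real engram lists. */
static const char *const SKETCH_NODE_TYPES[] = {
    "Knowledge", "Memory", "Conversation", "SessionSummary", "InternalStateEvent"
};
static const char *const SKETCH_TIERS[] = { "tier-a", "tier-b" };

static int sketch_in_list(const char *const *list, size_t count, const char *value) {
    if (value == NULL) return 0;
    for (size_t i = 0; i < count; i++)
        if (strcmp(list[i], value) == 0) return 1;
    return 0;
}

int sketch_valid_node_type(const char *t) {
    return sketch_in_list(SKETCH_NODE_TYPES,
                          sizeof SKETCH_NODE_TYPES / sizeof SKETCH_NODE_TYPES[0], t);
}

int sketch_valid_tier(const char *t) {
    return sketch_in_list(SKETCH_TIERS,
                          sizeof SKETCH_TIERS / sizeof SKETCH_TIERS[0], t);
}

/* Reject invalid values loudly and return "" instead of writing a
 * malformed node. The returned id is a placeholder. */
const char *sketch_node_full(const char *content, const char *node_type, const char *tier) {
    if (!sketch_valid_node_type(node_type)) {
        fprintf(stderr, "engram: rejected node, invalid node_type '%s'\n",
                node_type ? node_type : "(null)");
        return "";
    }
    if (!sketch_valid_tier(tier)) {
        fprintf(stderr, "engram: rejected node, invalid tier '%s'\n",
                tier ? tier : "(null)");
        return "";
    }
    (void)content;  /* a real implementation would store the node here */
    return "sketch-node-id";
}
```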
elb.el:
- Auto-detect Homebrew OpenSSL (-L$(brew --prefix openssl)/lib) so -lssl
  resolves on macOS without manual flags; no-op on Linux
- Add -include elp-c-decls.h when present in out_dir: resolves undeclared
  cross-module calls in packages like ELP that lack explicit imports
ELP source:
- Add import "morphology.el" to all 29 language morphology modules
- Add language module imports to morphology.el (all langs it dispatches to)
These imports were missing because ELP was originally built as a monolithic unit.
el_runtime.c was being compiled from source for each of the 8 native
test modules. A single precompile step produces el_runtime.o which all
8 link steps reuse — eliminates 7 redundant gcc runtime compilations.
The BASE build arg was hardcoded to ci-base:dev even when the pull fell
back to :latest. Docker then tried to resolve ci-base:dev from the
registry during the build and failed.
Capture which tag was actually pulled and use that as BASE.
parse_html_children consumed the closing `}` of the outer El function as
HTML text content when a tag was left open across a function boundary
(e.g. `page_open()` opens `<body>` without a closing `</body>`). Fix:
stop the children loop when the current token is RBrace — that token
belongs to the El function, not the HTML tree.
Add html_raw() and html_escape() builtins to el_runtime so templates
can interpolate trusted raw HTML and safely escape user-supplied content.
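
What html_escape() is expected to do for user-supplied content, as a minimal sketch. The signature and the exact set of escaped characters are assumptions (the five characters below are the conventional set), not the el_runtime implementation.

```c
#include <stddef.h>
#include <string.h>

/* Copy `in` to `out`, replacing HTML-significant characters with
 * entities. Returns bytes written (excluding the terminating NUL);
 * stops early if `out` is full. */
size_t sketch_html_escape(const char *in, char *out, size_t cap) {
    size_t o = 0;
    if (cap == 0) return 0;
    for (; *in != '\0'; in++) {
        const char *rep = NULL;
        switch (*in) {
            case '&':  rep = "&amp;";  break;
            case '<':  rep = "&lt;";   break;
            case '>':  rep = "&gt;";   break;
            case '"':  rep = "&quot;"; break;
            case '\'': rep = "&#39;";  break;
            default: break;
        }
        size_t len = rep ? strlen(rep) : 1;
        if (o + len + 1 > cap) break;  /* keep room for the NUL */
        if (rep) memcpy(out + o, rep, len);
        else out[o] = *in;
        o += len;
    }
    out[o] = '\0';
    return o;
}
```

html_raw() is the opposite case: trusted markup is passed through unchanged, so it should only ever receive content the template author controls.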
Rename elc-new.c → elc.c as the canonical compiler source; rebuild
elc binary from it.
Add `c_source "path"` in manifest.el build block — lets packages link
extra C files (platform stubs, native glue) without touching elb source.
On macOS, Homebrew OpenSSL isn't on the default linker path. Detect it
via `brew --prefix` and inject -L/-I flags; no-op on Linux.
Rebuild elb binary; remove elc-new binary (elc is now canonical).
Rebuilt from fix/elc-oom-checkout: scan_fn_sigs_el() --emit-header path
+ el_mem_check() guard. Verified on checkout.el: all 3 sigs in .elh,
clean exit under normal load, exit(1) on memory limit exceeded.
Add el_mem_check() to el_runtime.c: reads ELC_MAX_MEM_MB (default 512),
checks RSS via getrusage (macOS bytes / Linux KB normalised to MB), prints
a clear diagnostic to stderr, and calls exit(1) if the limit is exceeded.
Wire it into two places:
- compiler.el: upfront check at --emit-header entry point
- codegen.el: per-function check in the streaming loop after each
  el_arena_pop, so runaway growth is caught at the earliest function
  boundary rather than after the machine is already dying.
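
A sketch of the guard described above, for orientation only. The function names and the fallback on unparsable values are assumptions; the ELC_MAX_MEM_MB variable, the 512 default, and the macOS/Linux unit difference come from this message. Note that getrusage's ru_maxrss reports the peak resident set size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

#define SKETCH_DEFAULT_MAX_MEM_MB 512L

/* Parse an ELC_MAX_MEM_MB-style value; fall back to the default when the
 * value is unset, empty, non-numeric, or not positive (assumed policy). */
long sketch_parse_limit_mb(const char *value) {
    if (value == NULL || value[0] == '\0') return SKETCH_DEFAULT_MAX_MEM_MB;
    char *end = NULL;
    long mb = strtol(value, &end, 10);
    if (*end != '\0' || mb <= 0) return SKETCH_DEFAULT_MAX_MEM_MB;
    return mb;
}

/* Peak resident set size in MB, or -1 if getrusage fails. */
long sketch_peak_rss_mb(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
#ifdef __APPLE__
    return (long)(ru.ru_maxrss / (1024L * 1024L));  /* macOS reports bytes */
#else
    return (long)(ru.ru_maxrss / 1024L);            /* Linux reports kilobytes */
#endif
}

/* Print a diagnostic and exit(1) once the limit is exceeded. */
void sketch_mem_check(const char *where) {
    long limit = sketch_parse_limit_mb(getenv("ELC_MAX_MEM_MB"));
    long rss = sketch_peak_rss_mb();
    if (rss > limit) {
        fprintf(stderr,
                "elc: memory limit exceeded at %s: %ld MB > %ld MB (ELC_MAX_MEM_MB)\n",
                where, rss, limit);
        exit(1);
    }
}
```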
The --emit-header path previously called parse() which builds the entire
program AST in memory before writing the .elh file. For checkout.el (~491
lines with HTML template trees and deep BinOp string-concat chains), this
exhausted memory before the header could be written.
Fix: replace parse() + emit_header() with scan_fn_sigs_el() +
emit_header_from_sigs(). The new path tokenises the source once, then
walks the flat token list skipping over function bodies entirely — peak
memory is O(tokens) instead of O(whole-program AST).
New functions in parser.el:
- scan_type_el: reads a type annotation and returns its El source string
- scan_params_el: reads (name: Type, ...) and returns El params string
- scan_fn_sigs_el: token-level scan that collects El-style fn signatures
  without building any expression AST nodes
New function in compiler.el:
- emit_header_from_sigs: writes .elh from scan_fn_sigs_el output
Self-hosting check: elc compiled with the new elc; the outputs are
identical (empty diff).
Smoke test: elc --emit-header checkout.el produces correct three-entry
.elh (previously truncated at two entries due to mid-parse OOM).
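
The idea behind the token-level scan, shown in C for illustration. The token kinds, struct, and function name are hypothetical and much simpler than the El lexer's; the point is only that function bodies are skipped by tracking brace depth, so no AST is ever built.

```c
#include <stddef.h>

/* Hypothetical, simplified token model. */
typedef enum {
    SK_FN, SK_IDENT, SK_LBRACE, SK_RBRACE, SK_OTHER, SK_EOF
} sketch_tok_kind;

typedef struct {
    sketch_tok_kind kind;
    const char *text;
} sketch_tok;

/* Walk a flat, SK_EOF-terminated token list and collect the names of
 * top-level functions. Returns the number of names stored. */
size_t sketch_scan_fn_names(const sketch_tok *toks, const char **names, size_t max_names) {
    size_t i = 0, count = 0;
    while (toks[i].kind != SK_EOF) {
        if (toks[i].kind == SK_FN && toks[i + 1].kind == SK_IDENT) {
            if (count < max_names) names[count++] = toks[i + 1].text;
            i += 2;
            /* Advance past the signature to the body's opening brace. */
            while (toks[i].kind != SK_LBRACE && toks[i].kind != SK_EOF) i++;
            /* Skip the whole body by counting braces. */
            int depth = 0;
            while (toks[i].kind != SK_EOF) {
                if (toks[i].kind == SK_LBRACE) {
                    depth++;
                } else if (toks[i].kind == SK_RBRACE && --depth == 0) {
                    i++;  /* step past the closing brace */
                    break;
                }
                i++;
            }
        } else {
            i++;
        }
    }
    return count;
}
```

Memory use here is the token array plus the collected names, which is where the O(tokens) bound comes from.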
The El lexer silently skips '#', so {#each} lexes as LBrace Ident:"each"
and {#if} lexes as LBrace If ... (using the If keyword token, not Hash).
The existing {#each} check used k2=="Hash" which was dead code.
Parser changes (parser.el):
- Add parse_raw_text_content(): collects all tokens as raw text until
  </tag_name>, bypassing El expression parsing. Used for <style> and
  <script> elements so CSS/JS content isn't parsed as El expressions.
- parse_html_element(): use raw-text mode for <style> and <script> tags.
- parse_html_children(): fix {#each} detection (k2=="Ident", k3=="each"
  instead of dead k2=="Hash" check). Add {#if cond}...{#else}...{/if}
  support generating HtmlIf AST nodes.
Codegen changes (codegen.el):
- Add cg_html_if(): generates if (cond_c) { then_c } else { else_c }
  for HtmlIf nodes.
- cg_html_parts(): dispatch HtmlIf to cg_html_if.
el-install.el explicitly imported runtime/*.el modules (string, env, fs, exec,
json, http), which elb compiled to .c files in the shared dist/bin out_dir.
Linking those alongside el_runtime.c caused multiple definition errors for
every runtime function (http_get, http_patch, etc.). The runtime .el files are
thin wrappers over seed primitives already compiled into el_runtime.c — no
import needed.
Fixes:
- Remove all explicit runtime imports from el-install.el (root cause)
- Add --clean to every elb invocation in sdk-release.yaml so each build
  starts with a clean out_dir (defense-in-depth against stale .c files)
- Add elb build + epm/el-install build steps to ci-dev.yaml and ci-stage.yaml
  so linker errors are caught on every PR, not just stage->main
el_runtime.c uses OpenSSL (EVP_*, RAND_bytes) for AEAD encrypt/decrypt.
elb was only linking -lcurl -lpthread -lm, missing the SSL libs.
Matches the explicit flags used in ci-dev.yaml and ci-stage.yaml.