36 Commits

Author SHA1 Message Date
will.anderson 15ea584671 Merge pull request 'Fix engram_node_full field corruption + add validation' (#52) from fix/engram-node-full-field-corruption into dev
El SDK CI - dev / build-and-test (push) Successful in 7m59s
Fix engram_node_full field corruption + add validation (+ SessionSummary allowlist)
2026-06-10 22:01:41 +00:00
Tim Lingo c2afcbddf5 fix(engram): allow SessionSummary node_type in validation allowlist
El SDK CI - dev / build-and-test (pull_request) Successful in 3m47s
handle_api_consolidate writes a "SessionSummary" node, but engram_valid_node_type
omitted it — so once this validation ships, every consolidate() would be silently
REJECTED at the engram boundary. Add SessionSummary to the allowlist.

Found in Will's PR review of neuron #1 / el #52.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 06:26:25 -05:00
Tim Lingo dfe4e83ed1 Fix engram_node_full wrapper field corruption + add node_type/tier validation
El SDK Release / build-and-release (pull_request) Failing after 9s
The wrapper signature was stale and didn't match the C primitive
__engram_node_full(content, node_type, label, salience, importance, confidence, tier, tags).
Because el_val_t is an untyped machine word, the compiler coerced caller args to the
wrong declared param types and forwarded them BY POSITION — so tier received an int,
importance/confidence received strings, label received a float, etc. (~100 corrupt nodes).

- Correct the wrapper to match the C contract 1:1 (no coercion, no reorder).
- Add engram_valid_node_type / engram_valid_tier allowlists; engram_node and
  engram_node_full now reject invalid values with __println + return "" (fail loud,
  no silent malformed write).

See neuron repo: HANDOFF-engram-write-corruption.md for the full write-up + deploy runbook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 16:13:43 -05:00
will.anderson c2cd5e01e1 fix: elb macOS OpenSSL + C master declarations header; add ELP missing imports
El SDK CI - dev / build-and-test (pull_request) Successful in 3m34s
elb.el:
- Auto-detect Homebrew OpenSSL (-L$(brew --prefix openssl)/lib) so -lssl
  resolves on macOS without manual flags; no-op on Linux
- Add -include elp-c-decls.h when present in out_dir: resolves undeclared
  cross-module calls in packages like ELP that lack explicit imports

ELP source:
- Add import "morphology.el" to all 29 language morphology modules
- Add language module imports to morphology.el (all langs it dispatches to)
  These were missing since ELP was originally built as a monolithic unit
2026-05-08 19:44:31 -05:00
will.anderson a7e6fbf2d2 feat(elc, runtime): RBrace stop in parse_html_children; html_raw/html_escape; elc.c canonical
El SDK CI - dev / build-and-test (pull_request) Successful in 4m9s
parse_html_children consumed the closing `}` of the outer El function as
HTML text content when a tag was left open across a function boundary
(e.g. `page_open()` opens `<body>` without a closing `</body>`).  Fix:
stop the children loop when the current token is RBrace — that token
belongs to the El function, not the HTML tree.

Add html_raw() and html_escape() builtins to el_runtime so templates
can interpolate trusted raw HTML and safely escape user-supplied content.

Rename elc-new.c → elc.c as the canonical compiler source; rebuild
elc binary from it.
2026-05-08 11:31:50 -05:00
will.anderson 1f4b594ae7 feat(elb): c_source manifest directive + macOS OpenSSL path detection
Add `c_source "path"` in manifest.el build block — lets packages link
extra C files (platform stubs, native glue) without touching elb source.

On macOS, homebrew OpenSSL isn't on the default linker path. Detect it
via `brew --prefix` and inject -L/-I flags; no-op on Linux.

Rebuild elb binary; remove elc-new binary (elc is now canonical).
2026-05-08 11:31:36 -05:00
will.anderson f5dcca0386 build: update dist/platform/elc with OOM fix and memory guard
El SDK CI - dev / build-and-test (pull_request) Successful in 4m16s
Rebuilt from fix/elc-oom-checkout: scan_fn_sigs_el() --emit-header path
+ el_mem_check() guard. Verified on checkout.el: all 3 sigs in .elh,
clean exit under normal load, exit(1) on memory limit exceeded.
2026-05-08 08:23:07 -05:00
will.anderson 53e0b99d5f fix(elc): add el_mem_check() memory guard — abort before OS OOM-kill
Add el_mem_check() to el_runtime.c: reads ELC_MAX_MEM_MB (default 512),
checks RSS via getrusage (macOS bytes / Linux KB normalised to MB), prints
a clear diagnostic to stderr and exits(1) if exceeded.

Wire it into two places:
- compiler.el: upfront check at --emit-header entry point
- codegen.el: per-function check in the streaming loop after each
  el_arena_pop, so runaway growth is caught at the earliest function
  boundary rather than after the machine is already dying.
2026-05-08 08:21:38 -05:00
will.anderson 5f9cad5908 fix(elc): eliminate OOM in --emit-header by using token-level signature scan
The --emit-header path previously called parse() which builds the entire
program AST in memory before writing the .elh file. For checkout.el (~491
lines with HTML template trees and deep BinOp string-concat chains), this
exhausted memory before the header could be written.

Fix: replace parse() + emit_header() with scan_fn_sigs_el() +
emit_header_from_sigs(). The new path tokenises the source once, then
walks the flat token list skipping over function bodies entirely — peak
memory is O(tokens) instead of O(whole-program AST).

New functions in parser.el:
- scan_type_el: reads a type annotation and returns its El source string
- scan_params_el: reads (name: Type, ...) and returns El params string
- scan_fn_sigs_el: token-level scan that collects El-style fn signatures
  without building any expression AST nodes

New function in compiler.el:
- emit_header_from_sigs: writes .elh from scan_fn_sigs_el output

Self-hosting check: elc compiled with new elc, diff of outputs is
identical (zero difference).

Smoke test: elc --emit-header checkout.el produces correct three-entry
.elh (previously truncated at two entries due to mid-parse OOM).
2026-05-08 08:20:13 -05:00
will.anderson f971e96dd5 fix(parser): str_join separator '' not ' ' — CSS selectors were emitting spaces between tokens
El SDK CI - dev / build-and-test (pull_request) Successful in 3m45s
2026-05-07 15:53:19 -05:00
will.anderson a3732a1e9a fix(parser): add {#if}/{#else}/{/if} support and raw-text <style>/<script> in HTML templates
El SDK CI - dev / build-and-test (pull_request) Failing after 18m3s
The El lexer silently skips '#', so {#each} lexes as LBrace Ident:"each"
and {#if} lexes as LBrace If ... (using the If keyword token, not Hash).
The existing {#each} check used k2=="Hash" which was dead code.

Parser changes (parser.el):
- Add parse_raw_text_content(): collects all tokens as raw text until
  </tag_name>, bypassing El expression parsing. Used for <style> and
  <script> elements so CSS/JS content isn't parsed as El expressions.
- parse_html_element(): use raw-text mode for <style> and <script> tags.
- parse_html_children(): fix {#each} detection (k2=="Ident", k3=="each"
  instead of dead k2=="Hash" check). Add {#if cond}...{#else}...{/if}
  support generating HtmlIf AST nodes.

Codegen changes (codegen.el):
- Add cg_html_if(): generates if (cond_c) { then_c } else { else_c }
  for HtmlIf nodes.
- cg_html_parts(): dispatch HtmlIf to cg_html_if.
2026-05-07 13:39:12 -05:00
will.anderson 027ad82db2 fix elb linker: remove runtime imports from el-install, add --clean, catch in dev/stage CI
El SDK CI - dev / build-and-test (pull_request) Successful in 3m35s
el-install.el explicitly imported runtime/*.el modules (string, env, fs, exec,
json, http), which elb compiled to .c files in the shared dist/bin out_dir.
Linking those alongside el_runtime.c caused multiple definition errors for
every runtime function (http_get, http_patch, etc.). The runtime .el files are
thin wrappers over seed primitives already compiled into el_runtime.c — no
import needed.

Fixes:
- Remove all explicit runtime imports from el-install.el (root cause)
- Add --clean to every elb invocation in sdk-release.yaml so each build
  starts with a clean out_dir (defense-in-depth against stale .c files)
- Add elb build + epm/el-install build steps to ci-dev.yaml and ci-stage.yaml
  so linker errors are caught on every PR, not just stage->main
2026-05-07 03:20:44 -05:00
will.anderson 05d717744b fix(elb): add -lssl -lcrypto to link_binary flags
El SDK CI - dev / build-and-test (pull_request) Successful in 3m24s
el_runtime.c uses OpenSSL (EVP_*, RAND_bytes) for AEAD encrypt/decrypt.
elb was only linking -lcurl -lpthread -lm, missing the SSL libs.
Matches the explicit flags used in ci-dev.yaml and ci-stage.yaml.
2026-05-07 03:03:21 -05:00
will.anderson 6f634ae432 fix(elb): use clang-only -fbracket-depth flag conditionally
El SDK CI - dev / build-and-test (pull_request) Successful in 3m26s
gcc rejects -fbracket-depth=1024 with 'unrecognized command-line option'.
Use shell subshell to probe cc --version and only pass the flag when
the compiler is clang.
2026-05-07 02:53:42 -05:00
will.anderson 61bf501b84 fix: add __http_do_map_to_file runtime primitive
El SDK CI - dev / build-and-test (pull_request) Successful in 3m41s
el-install.el generates calls to __http_do_map_to_file (HTTP request
with JSON headers map, streaming response to file). Add it to both
the HAVE_CURL implementation and the no-curl stub section.
2026-05-06 21:01:46 -05:00
will.anderson 254cbe0ac2 fix: add __-prefixed runtime primitives expected by El compiler
El SDK CI - dev / build-and-test (pull_request) Successful in 3m22s
The El compiler generates calls to __-prefixed C primitives from within
El stdlib compiled code (e.g. __println, __str_len, __json_get, etc).
These were absent from el_runtime.c, causing linker failures when
building el-install, elb, or epm with the current compiler.

Add 46 __-prefixed aliases/implementations in el_runtime.c covering:
- I/O: __println, __print, __readline
- String: __str_len, __str_cmp, __str_ncmp, __str_alloc, __str_set_char,
  __str_concat_raw, __str_slice_raw, __str_char_at, plus numeric converters
- FS: __fs_read, __fs_write, __fs_exists, __fs_mkdir, __fs_list_raw, etc
- HTTP: __http_do, __http_do_map, __http_serve, __http_serve_v2,
  __http_response, __http_sse_* (weak stubs)
- JSON: __json_get, __json_set, __json_parse_map, __json_stringify_val, etc
- State, env, exec, uuid, sha256, args
2026-05-06 20:36:49 -05:00
will.anderson 60ad7f2f6b fix: align runtime function return types with El compiler output
El SDK CI - dev / build-and-test (pull_request) Successful in 3m16s
El compiler generates calls to println, print, exit_program,
http_set_handler, http_serve, http_set_handler_v2, and http_serve_v2
as el_val_t-returning functions. The runtime declared them void,
causing conflicting-type errors when el-install.c was compiled.

Change all seven to return el_val_t (side-effect functions return 0).
Also update el_runtime.h declarations to match.
2026-05-06 20:00:40 -05:00
will.anderson 54de7d3f3f fix: add missing runtime functions for epm.el
El SDK CI - dev / build-and-test (pull_request) Successful in 3m19s
Add native_str_to_int (El compiler alias for str_to_int) and
http_post_json_with_headers (JSON POST with additional headers map)
which epm.el generates calls to but were absent from el_runtime.c.
2026-05-06 19:31:35 -05:00
will.anderson 8b074d2e39 fix: normalize NaN to 'nan' in float_to_str regardless of sign bit
El SDK CI - dev / build-and-test (pull_request) Successful in 3m18s
0.0/0.0 can produce -nan on Linux/x86_64 (%g gives '-nan'),
causing the no-cycle calendar test to fail. Explicitly check isnan()
and emit 'nan' so behavior is platform-independent.
2026-05-06 17:49:35 -05:00
will.anderson 702093e043 fix: add -lssl -lcrypto -lm to all test runner gcc commands
El SDK CI - dev / build-and-test (pull_request) Failing after 1m7s
Same OpenSSL/math linker flags needed everywhere el_runtime.c is linked.
2026-05-06 17:41:55 -05:00
Will Anderson fc6d496937 fix: correct if-stmt parser test assertions — if is an expression in El 2026-05-06 16:45:01 -05:00
Will Anderson ec889e1e53 Add --test mode to elc with Assert stmt and full native test suite passing
Implement compile_test() entry point that emits a C test harness instead
of a normal program. Test blocks (previously skipped) now compile to
static functions with per-assertion pass/fail tracking. Assert statement
added to parser and codegen. Runtime extended with now_ns, fs_list_json,
json_build_object, json_build_array, json_escape_string, state_has,
state_get_or. Fix float negation codegen, float equality comparisons,
time_to_parts return type (JSON string), time_format empty-fmt, json_set
raw-value semantics, state_keys JSON array return. All 310 native tests
pass across 9 suites (core, text, string, math, env, state, json, time, fs).
2026-05-06 14:33:47 -05:00
Will Anderson 6ced0f8009 fix: double-free in engram_neighbors_json BFS + rebuild engram.c
el_strdup tracks pointers in the arena. The BFS arrays in
engram_neighbors_json are manually freed — using el_strdup caused a
double-free when the arena was later popped. Changed to plain strdup
for those allocations.

engram/dist/engram.c rebuilt from engram/src/server.el with current
elc (minor codegen diff: parenthesisation and _argc/_argv rename).
2026-05-06 14:11:40 -05:00
Will Anderson bd7303447b fix: skip test blocks in codegen to prevent OOM on test files
test "name" { ... } blocks were not recognized by the self-hosted
compiler. The body { } was parsed as a Map literal, creating a huge
AST with O(n²) string concatenation in the toplevel_exec_stmts loop
(which had no arena scope). A 272-line test file would consume 400MB+
and a 720-line file importing the full compiler source caused 150GB
usage and crashed the machine.

Two fixes:
1. Skip Test tokens in codegen_streaming before parse_one() —
   advance past "name" and skip_to_rbrace on the body block.
   Test blocks are never compiled; self-hosted compiler has no test runner.

2. Add per-statement arena scope to toplevel_exec_stmts emission loop,
   matching the el_main_body loop. Frees intermediate strings after
   each statement to prevent O(n²) accumulation from any unrecognized
   construct that reaches that path.

Result: test_string.el (272 lines, 27 test blocks): 0MB peak (was 400MB+).
        test_compiler.el (720 lines + 8728 imported): 15MB peak (was 150GB).
2026-05-06 13:34:03 -05:00
Will Anderson e8f6765750 fix: arena leak in compile() — token/sig strings now tracked
Wrapped compile() body in el_arena_push/pop so the arena is active
before lex() and scan_fn_sigs(). Previously both ran with
_tl_arena_active=0, leaking all token and signature strings permanently.
Also prevents inner pop(mark=0) calls from deactivating the arena
between per-function scopes. Verified: self-host PASS, RSS stable.
2026-05-06 10:53:12 -05:00
Will Anderson 3726f69435 perf: 81% RSS reduction — el_release, arena scoping, streaming codegen, libcurl stub
Chain of optimizations from swarm rounds 4-7:
- Flat stride-2 token list: eliminate per-token Map allocation (~112B each × N tokens)
- Systematic el_release() in parser.el: eagerly free intermediate parse result maps
- Per-function and per-statement arena scoping in codegen_streaming()
- Streaming codegen pipeline: parse one fn at a time, emit C, discard AST
- HAVE_CURL guard: elc CLI binary drops libcurl, eliminating SSL/TLS init overhead
- HTML codegen parts-list: O(n) instead of O(n²) string growth for nested templates
- Batch c_escape: str_slice clean runs instead of char-at per byte

Result: 33.4MB → 6.5MB RSS on web/src/main.el (-81%). Self-host: PASS.
2026-05-05 20:39:38 -05:00
Will Anderson ee86736eab merge round-4-delta: flat stride-2 token list + str_char_code dispatch + batch c_escape
- Flat token list: lexer emits [kind0, val0, kind1, val1, ...] instead of [{kind,val}, ...]
  Eliminates per-token ElMap allocation (~112B × N tokens)
- str_char_code hot loop: char classification via Int codes, no strdup per char
- Batch c_escape: str_slice clean runs instead of char-at per byte
- Parser updated to use tok_at/tok_kind/tok_value stride-2 accessors
2026-05-05 20:29:35 -05:00
Will Anderson eb52be4ade runtime: add EL_TRUE/EL_FALSE macros and scoped arena for CLI
Adds EL_TRUE/EL_FALSE convenience macros to el_runtime.h alongside the
existing EL_NULL, making boolean-returning builtins readable without
raw (el_val_t) casts. Documents all value macros in the header comment.

Also lands el_arena_push/el_arena_pop — a scoped string arena for CLI
programs that never call el_request_start/end. The compiler can push a
mark before a compilation unit and pop it after to free intermediate
strings, reducing peak RSS during long compile runs.
2026-05-05 19:15:49 -05:00
Will Anderson e587bedf30 round-3-gamma: combine c_escape + scan_interp_string batching — max round-3 savings
Combines two orthogonal optimizations:
1. c_escape batching (from alpha): ASCII runs emitted as str_slice segments instead
   of one str_char_at string per byte. O(N) allocs → O(K) where K = special chars.

2. scan_interp_string batching (from beta): char dispatch via str_char_code (Int)
   + clean_start tracking to flush plain runs as str_slice. Eliminates per-char
   string allocations in the string-literal scanning hot path.

Result on web/src/main.el: 14.5MB -> 13.4MB peak RSS (-7.6%).
Self-hosting: PASS.
2026-05-05 16:01:05 -05:00
Will Anderson 1eef9928f4 round-2-gamma: combine flat token list + char code dispatch — max round-2 savings
Combines two orthogonal optimizations:
1. Flat token list (from beta): lex() returns [Any] with alternating kind/value
   pairs instead of [Map], eliminating one ElMap per token (~3 mallocs each).
   Parser updated: tok_kind(t,i) = t[2*i], tok_value(t,i) = t[2*i+1].

2. Char code dispatch (from alpha): lex() hot loop uses str_char_code -> Int
   instead of str_char_at -> strdup String for all character classification.
   Eliminates ~400K x 16B = 6.4MB of temporary string allocations.

scan_digits and scan_ident also updated to use str_char_code.

Result on main.el: 17.1MB -> 14.4MB peak RSS (-16%).
Self-hosting: PASS.
2026-05-05 15:46:20 -05:00
Will Anderson 1e67544c88 round-2-alpha: char code ops in lex() hot loop — eliminate str_char_at allocations
Replace str_char_at (returns strdup String) with str_char_code (returns Int)
in the main lex() while loop and scan_digits/scan_ident helpers.

For a 400KB combined source, str_char_at was allocating ~400K x 16B = 6.4MB
of transient 2-byte strings for the ch variable alone. str_char_code returns
an integer directly — zero allocation.

Add Int-based helpers: is_digit_code, is_alpha_code, is_ws_code,
is_alnum_or_underscore_code. Rewrite lex() operator dispatch using char
code constants (e.g. '/'=47, '"'=34, '='=61).

Result on main.el: 17.1MB -> 15.4MB peak RSS (-10%).
Self-hosting: PASS.
2026-05-05 15:43:29 -05:00
Will Anderson 2ac11a67b1 beta: replace native_string_chars with str_char_at/str_slice in lexer — 49% memory reduction on large files 2026-05-05 15:19:59 -05:00
Will Anderson 7f295bffe9 fix: codegen O(n²) HTML memory leak + elb stderr surface + runtime dir path 2026-05-05 14:40:15 -05:00
Will Anderson 962c8cbe57 dist: add linux/amd64 binaries and el_runtime.js 2026-05-05 09:44:25 -05:00
Will Anderson 592f8f482a add el-install binary and SDK bundle to release pipeline
- lang/tools/install/el-install.el: El program that fetches the latest
  release from the Gitea API, downloads el-sdk-latest.tar.gz, and
  extracts it into ~/.el (or a custom prefix passed as argv[1])
- lang/tools/install/manifest.el: build manifest for the el-install package
- .gitea/workflows/sdk-release.yaml: build elb, epm, and el-install
  binaries; bundle elc + elb + epm + runtime files into el-sdk-latest.tar.gz;
  attach both the tarball and el-install binary to the Gitea release
  alongside the existing per-file GCP uploads
2026-05-05 03:02:56 -05:00
Will Anderson 1ae68962cf restructure: move el compiler content into lang/ 2026-05-05 01:38:51 -05:00