14 Commits

Author SHA1 Message Date
Will Anderson 6ced0f8009 fix: double-free in engram_neighbors_json BFS + rebuild engram.c
el_strdup tracks pointers in the arena. The BFS arrays in
engram_neighbors_json are manually freed — using el_strdup caused a
double-free when the arena was later popped. Changed to plain strdup
for those allocations.

engram/dist/engram.c rebuilt from engram/src/server.el with current
elc (minor codegen diff: parenthesisation and _argc/_argv rename).
2026-05-06 14:11:40 -05:00
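
The ownership rule behind 6ced0f8009 can be sketched as follows. This is a minimal illustration, assuming an arena that records every pointer it hands out and frees them all on pop; `plain_dup`, `arena_dup`, `arena_push`, `arena_pop` and `visit_all` are hypothetical stand-ins, not the El runtime's actual functions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical arena: remembers every pointer it hands out so that
 * arena_pop() can free everything allocated since a mark.
 * Illustrative only; this is not the El runtime's implementation. */
#define ARENA_CAP 256
static char  *arena_ptrs[ARENA_CAP];
static size_t arena_len = 0;

/* Plain heap copy, standing in for strdup(). The caller owns the result. */
static char *plain_dup(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

/* Tracked copy, standing in for el_strdup(). The arena owns the result. */
static char *arena_dup(const char *s) {
    if (arena_len == ARENA_CAP) return NULL;
    char *copy = plain_dup(s);
    if (copy != NULL) arena_ptrs[arena_len++] = copy;
    return copy;
}

static size_t arena_push(void) { return arena_len; }

static void arena_pop(size_t mark) {
    while (arena_len > mark) free(arena_ptrs[--arena_len]);
}

/* A work queue that frees its own entries must use plain_dup().
 * With arena_dup() here, the free() below and the later arena_pop()
 * would both release the same pointer. */
static size_t visit_all(const char **names, size_t n) {
    size_t mark = arena_push();
    size_t visited = 0;
    for (size_t i = 0; i < n; i++) {
        char *queued = plain_dup(names[i]);
        if (queued == NULL) continue;
        char *label = arena_dup(queued);  /* scratch string, freed by pop */
        if (label != NULL) visited++;
        free(queued);                     /* manual free: arena never saw it */
    }
    arena_pop(mark);                      /* frees only the tracked labels */
    return visited;
}
```

Each pointer has exactly one owner: either the code that calls `free` on it, or the arena that releases it on pop.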
Will Anderson bd7303447b fix: skip test blocks in codegen to prevent OOM on test files
test "name" { ... } blocks were not recognized by the self-hosted
compiler. The body { } was parsed as a Map literal, creating a huge
AST with O(n²) string concatenation in the toplevel_exec_stmts loop
(which had no arena scope). A 272-line test file would consume 400MB+
and a 720-line file importing the full compiler source caused 150GB
usage and crashed the machine.

Two fixes:
1. Skip Test tokens in codegen_streaming before parse_one() —
   advance past "name" and skip_to_rbrace on the body block.
   Test blocks are never compiled; the self-hosted compiler has no test runner.

2. Add per-statement arena scope to toplevel_exec_stmts emission loop,
   matching the el_main_body loop. Frees intermediate strings after
   each statement to prevent O(n²) accumulation from any unrecognized
   construct that reaches that path.

Result: test_string.el (272 lines, 27 test blocks): 0MB peak (was 400MB+).
        test_compiler.el (720 lines + 8728 imported): 15MB peak (was 150GB).
2026-05-06 13:34:03 -05:00
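
The first fix in bd7303447b, skipping a `test "name" { ... }` block before it reaches the parser, could look roughly like this. It is a sketch under stated assumptions: the token kinds and both function names are hypothetical, and the token stream is reduced to an array of kinds.

```c
#include <stddef.h>

/* Hypothetical token kinds; the real compiler's token set is larger. */
enum tok_kind { TOK_TEST, TOK_STRING, TOK_LBRACE, TOK_RBRACE, TOK_OTHER, TOK_EOF };

/* Given pos at a TOK_TEST token, return the index of the first token
 * after the block: test "name" { ... }. Nested braces are balanced by
 * counting depth. Returns n if the block is unterminated. */
static size_t skip_test_block(const enum tok_kind *toks, size_t n, size_t pos) {
    pos++;                                               /* past `test` */
    if (pos < n && toks[pos] == TOK_STRING) pos++;       /* past "name" */
    if (pos >= n || toks[pos] != TOK_LBRACE) return pos; /* no body */
    size_t depth = 0;
    while (pos < n) {
        if (toks[pos] == TOK_LBRACE) depth++;
        else if (toks[pos] == TOK_RBRACE && --depth == 0) return pos + 1;
        pos++;
    }
    return n;
}

/* Count the top-level items a streaming pass would actually parse. */
static size_t count_parsed_items(const enum tok_kind *toks, size_t n) {
    size_t parsed = 0, pos = 0;
    while (pos < n && toks[pos] != TOK_EOF) {
        if (toks[pos] == TOK_TEST) {
            pos = skip_test_block(toks, n, pos);
            continue;
        }
        parsed++;   /* stand-in for parse_one() consuming one token */
        pos++;
    }
    return parsed;
}
```

Because the skip happens on tokens, the block body never becomes an AST, so it cannot be misread as a Map literal.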
Will Anderson e8f6765750 fix: arena leak in compile() — token/sig strings now tracked
Wrapped compile() body in el_arena_push/pop so the arena is active
before lex() and scan_fn_sigs(). Previously both ran with
_tl_arena_active=0, leaking all token and signature strings permanently.
Also prevents inner pop(mark=0) calls from deactivating the arena
between per-function scopes. Verified: self-host PASS, RSS stable.
2026-05-06 10:53:12 -05:00
Will Anderson 3726f69435 perf: 81% RSS reduction — el_release, arena scoping, streaming codegen, libcurl stub
Chain of optimizations from swarm rounds 4-7:
- Flat stride-2 token list: eliminate per-token Map allocation (~112B each × N tokens)
- Systematic el_release() in parser.el: eagerly free intermediate parse result maps
- Per-function and per-statement arena scoping in codegen_streaming()
- Streaming codegen pipeline: parse one fn at a time, emit C, discard AST
- HAVE_CURL guard: elc CLI binary drops libcurl, eliminating SSL/TLS init overhead
- HTML codegen parts-list: O(n) instead of O(n²) string growth for nested templates
- Batch c_escape: str_slice clean runs instead of char-at per byte

Result: 33.4MB → 6.5MB RSS on web/src/main.el (-81%). Self-host: PASS.
2026-05-05 20:39:38 -05:00
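
The parts-list idea from 3726f69435 (collect fragments, then join once) can be illustrated in C. This is a generic sketch, not the compiler's HTML codegen; `join_parts` is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Join `count` fragments into one newly allocated string with a single
 * allocation: measure first, then copy each fragment exactly once.
 * Repeatedly concatenating onto a growing string would instead recopy
 * the accumulated prefix on every step. Caller frees the result. */
static char *join_parts(const char **parts, size_t count) {
    size_t total = 0;
    for (size_t i = 0; i < count; i++) total += strlen(parts[i]);

    char *out = malloc(total + 1);
    if (out == NULL) return NULL;

    char *cursor = out;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);
        memcpy(cursor, parts[i], len);
        cursor += len;
    }
    *cursor = '\0';
    return out;
}
```

Total copying is proportional to the output length, whereas appending fragment by fragment copies the prefix again each time and grows quadratically.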
Will Anderson ee86736eab merge round-4-delta: flat stride-2 token list + str_char_code dispatch + batch c_escape
- Flat token list: lexer emits [kind0, val0, kind1, val1, ...] instead of [{kind,val}, ...]
  Eliminates per-token ElMap allocation (~112B × N tokens)
- str_char_code hot loop: char classification via Int codes, no strdup per char
- Batch c_escape: str_slice clean runs instead of char-at per byte
- Parser updated to use tok_at/tok_kind/tok_value stride-2 accessors
2026-05-05 20:29:35 -05:00
Will Anderson eb52be4ade runtime: add EL_TRUE/EL_FALSE macros and scoped arena for CLI
Adds EL_TRUE/EL_FALSE convenience macros to el_runtime.h alongside the
existing EL_NULL, making boolean-returning builtins readable without
raw (el_val_t) casts. Documents all value macros in the header comment.

Also lands el_arena_push/el_arena_pop — a scoped string arena for CLI
programs that never call el_request_start/end. The compiler can push a
mark before a compilation unit and pop it after to free intermediate
strings, reducing peak RSS during long compile runs.
2026-05-05 19:15:49 -05:00
Will Anderson e587bedf30 round-3-gamma: combine c_escape + scan_interp_string batching — max round-3 savings
Combines two orthogonal optimizations:
1. c_escape batching (from alpha): ASCII runs emitted as str_slice segments instead
   of one str_char_at string per byte. O(N) allocs → O(K) where K = special chars.

2. scan_interp_string batching (from beta): char dispatch via str_char_code (Int)
   + clean_start tracking to flush plain runs as str_slice. Eliminates per-char
   string allocations in the string-literal scanning hot path.

Result on web/src/main.el: 14.5MB -> 13.4MB peak RSS (-7.6%).
Self-hosting: PASS.
2026-05-05 16:01:05 -05:00
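
The run-batching technique described in e587bedf30 can be sketched in C. This is an illustration only: it handles a handful of escapes, `escape_for` and `c_escape_batched` are hypothetical names, and the `segments` counter exists purely to show how many append operations occur.

```c
#include <stdlib.h>
#include <string.h>

/* Return the C escape for a byte, or NULL if it can be copied as-is.
 * Only a few escapes are handled in this sketch. */
static const char *escape_for(char c) {
    switch (c) {
    case '"':  return "\\\"";
    case '\\': return "\\\\";
    case '\n': return "\\n";
    case '\t': return "\\t";
    default:   return NULL;
    }
}

/* Escape `src` into a new string. Clean runs are copied with one memcpy
 * each; *segments counts the append operations performed.
 * Caller frees the result. */
static char *c_escape_batched(const char *src, size_t *segments) {
    size_t len = strlen(src);
    char *out = malloc(2 * len + 1);   /* every escape here is 2 bytes */
    if (out == NULL) return NULL;

    size_t w = 0, clean_start = 0, appended = 0;
    for (size_t i = 0; i <= len; i++) {
        const char *esc = (i < len) ? escape_for(src[i]) : NULL;
        if (i < len && esc == NULL) continue;        /* extend clean run */
        if (i > clean_start) {                        /* flush clean run */
            memcpy(out + w, src + clean_start, i - clean_start);
            w += i - clean_start;
            appended++;
        }
        if (esc != NULL) {                            /* emit the escape */
            memcpy(out + w, esc, 2);
            w += 2;
            appended++;
        }
        clean_start = i + 1;
    }
    out[w] = '\0';
    if (segments != NULL) *segments = appended;
    return out;
}
```

The number of appends depends on how many special characters appear, not on the input length, which is the O(N) to O(K) change the commit describes.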
Will Anderson 1eef9928f4 round-2-gamma: combine flat token list + char code dispatch — max round-2 savings
Combines two orthogonal optimizations:
1. Flat token list (from beta): lex() returns [Any] with alternating kind/value
   pairs instead of [Map], eliminating one ElMap per token (~3 mallocs each).
   Parser updated: tok_kind(t,i) = t[2*i], tok_value(t,i) = t[2*i+1].

2. Char code dispatch (from alpha): lex() hot loop uses str_char_code -> Int
   instead of str_char_at -> strdup String for all character classification.
   Eliminates ~400K x 16B = 6.4MB of temporary string allocations.

scan_digits and scan_ident also updated to use str_char_code.

Result on main.el: 17.1MB -> 14.4MB peak RSS (-16%).
Self-hosting: PASS.
2026-05-05 15:46:20 -05:00
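
The stride-2 layout from 1eef9928f4 can be illustrated with a C sketch. The accessor names follow the commit message; `val_t`, `tok_count`, `lex_demo` and the kind constants are assumptions made for the example and do not reflect the El runtime's value type.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a dynamically typed slot: holds either a small integer
 * (token kind) or a pointer (token text). */
typedef intptr_t val_t;

/* Hypothetical token kinds. */
enum { KIND_IDENT = 1, KIND_EQ = 2, KIND_INT = 3 };

/* Layout: [kind0, val0, kind1, val1, ...]; token i lives in slots
 * 2*i and 2*i + 1, so no per-token object is allocated. */
static size_t tok_count(size_t slots) { return slots / 2; }

static int tok_kind(const val_t *t, size_t i) {
    return (int)t[2 * i];
}

static const char *tok_value(const val_t *t, size_t i) {
    return (const char *)t[2 * i + 1];
}

/* Fill `out` (at least 6 slots) with the tokens of `x = 42`.
 * Returns the number of slots written. */
static size_t lex_demo(val_t *out) {
    size_t n = 0;
    out[n++] = KIND_IDENT; out[n++] = (val_t)"x";
    out[n++] = KIND_EQ;    out[n++] = (val_t)"=";
    out[n++] = KIND_INT;   out[n++] = (val_t)"42";
    return n;
}
```

A single contiguous array replaces one map per token, at the cost of callers having to go through the accessors rather than named fields.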
Will Anderson 1e67544c88 round-2-alpha: char code ops in lex() hot loop — eliminate str_char_at allocations
Replace str_char_at (returns strdup String) with str_char_code (returns Int)
in the main lex() while loop and scan_digits/scan_ident helpers.

For a 400KB combined source, str_char_at was allocating ~400K x 16B = 6.4MB
of transient 2-byte strings for the ch variable alone. str_char_code returns
an integer directly — zero allocation.

Add Int-based helpers: is_digit_code, is_alpha_code, is_ws_code,
is_alnum_or_underscore_code. Rewrite lex() operator dispatch using char
code constants (e.g. '/'=47, '"'=34, '='=61).

Result on main.el: 17.1MB -> 15.4MB peak RSS (-10%).
Self-hosting: PASS.
2026-05-05 15:43:29 -05:00
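
The Int-based helpers named in 1e67544c88 might look like this in C. The helper names come from the commit message; their bodies, `skip_ws`, and this `scan_ident` signature are assumptions for illustration, and only ASCII is handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Classify by integer character code instead of by one-character
 * strings, so the scanning loop allocates nothing. ASCII only. */
static bool is_digit_code(int c) { return c >= 48 && c <= 57; }

static bool is_alpha_code(int c) {
    return (c >= 65 && c <= 90) || (c >= 97 && c <= 122);
}

static bool is_ws_code(int c) {
    return c == 32 || c == 9 || c == 10 || c == 13;
}

static bool is_alnum_or_underscore_code(int c) {
    return is_digit_code(c) || is_alpha_code(c) || c == 95;
}

/* Return the index of the first non-whitespace byte at or after pos. */
static size_t skip_ws(const char *src, size_t pos) {
    while (is_ws_code((unsigned char)src[pos])) pos++;
    return pos;
}

/* Return the index one past the identifier starting at pos.
 * Stops at the terminating NUL because code 0 matches no class. */
static size_t scan_ident(const char *src, size_t pos) {
    while (is_alnum_or_underscore_code((unsigned char)src[pos])) pos++;
    return pos;
}
```

Every comparison works on an integer read from the source buffer, so classifying a character creates no temporary string.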
Will Anderson 2ac11a67b1 beta: replace native_string_chars with str_char_at/str_slice in lexer — 49% memory reduction on large files 2026-05-05 15:19:59 -05:00
Will Anderson 7f295bffe9 fix: codegen O(n²) HTML memory leak + elb stderr surface + runtime dir path 2026-05-05 14:40:15 -05:00
Will Anderson 962c8cbe57 dist: add linux/amd64 binaries and el_runtime.js 2026-05-05 09:44:25 -05:00
Will Anderson 592f8f482a add el-install binary and SDK bundle to release pipeline
- lang/tools/install/el-install.el: El program that fetches the latest
  release from the Gitea API, downloads el-sdk-latest.tar.gz, and
  extracts it into ~/.el (or a custom prefix passed as argv[1])
- lang/tools/install/manifest.el: build manifest for the el-install package
- .gitea/workflows/sdk-release.yaml: build elb, epm, and el-install
  binaries; bundle elc + elb + epm + runtime files into el-sdk-latest.tar.gz;
  attach both the tarball and el-install binary to the Gitea release
  alongside the existing per-file GCP uploads
2026-05-05 03:02:56 -05:00
Will Anderson 1ae68962cf restructure: move el compiler content into lang/ 2026-05-05 01:38:51 -05:00