fix(elc): eliminate OOM in --emit-header; add memory guard #47

Merged
will.anderson merged 3 commits from fix/elc-oom-checkout into dev 2026-05-08 16:16:04 +00:00
Owner

Root cause

--emit-header called parse() which builds the full program AST (all function bodies) before writing the .elh. For checkout.el (~491 lines with a large HTML template tree and deep BinOp string-concat chains), this exhausted memory. The streaming codegen path was irrelevant — --emit-header never used it.

Fix

Replace parse() + emit_header() with scan_fn_sigs_el() + emit_header_from_sigs():

  • scan_fn_sigs_el() — token-level pre-scan, skips all function bodies entirely. Peak memory is O(tokens), not O(whole-program AST).
  • emit_header_from_sigs() — writes .elh directly from the scanned signatures.

The .elh only needs function signatures — there was never a reason to parse bodies for header emission.
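
As a rough illustration of the pre-scan idea, here is a minimal C sketch that walks a flat token array, records where each signature starts and ends, and skips bodies by counting brace depth. The token kinds, the `SigSpan` struct, the name `scan_fn_sigs_sketch`, and the assumption that bodies are brace-delimited are all hypothetical; the real `scan_fn_sigs_el()` is written in El and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token model; not the real elc lexer output. */
typedef enum {
    TK_FN, TK_IDENT, TK_LPAREN, TK_RPAREN,
    TK_LBRACE, TK_RBRACE, TK_OTHER, TK_EOF
} TokKind;

typedef struct { TokKind kind; const char *text; } Token;

/* Token index range [start, end) of one signature: from `fn` up to,
   but not including, the opening brace of the body. */
typedef struct { size_t start, end; } SigSpan;

/* Returns the number of signatures found. At most max_out spans are stored.
   No AST nodes are built: memory use is the token array plus the spans. */
size_t scan_fn_sigs_sketch(const Token *toks, size_t ntoks,
                           SigSpan *out, size_t max_out) {
    assert(toks != NULL || ntoks == 0);
    size_t count = 0, i = 0;
    while (i < ntoks && toks[i].kind != TK_EOF) {
        if (toks[i].kind != TK_FN) { i++; continue; }
        size_t start = i;
        /* Advance to the opening brace of the body (assumed syntax). */
        while (i < ntoks && toks[i].kind != TK_LBRACE && toks[i].kind != TK_EOF) i++;
        if (i >= ntoks || toks[i].kind != TK_LBRACE) break; /* no body found */
        if (count < max_out) { out[count].start = start; out[count].end = i; }
        count++;
        /* Skip the whole body by tracking brace depth; nested `fn`
           tokens inside the body are ignored. */
        size_t depth = 0;
        do {
            if (toks[i].kind == TK_LBRACE) depth++;
            else if (toks[i].kind == TK_RBRACE) depth--;
            i++;
        } while (i < ntoks && depth > 0 && toks[i].kind != TK_EOF);
    }
    return count;
}
```

A header writer would then print the tokens in each `[start, end)` range, which is roughly the role `emit_header_from_sigs()` plays in this PR.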

Memory guard

Add el_mem_check() to el_runtime.c: it reads ELC_MAX_MEM_MB (default 512 MB), checks RSS via getrusage (macOS reports bytes, Linux reports KB; both are normalised to MB), and, if the limit is exceeded, prints a diagnostic to stderr and calls exit(1). Wired into:

  • compiler.el: upfront check at --emit-header entry
  • codegen.el: per-function check in the streaming loop after each el_arena_pop

Prevents the process from taking the whole machine down on runaway growth.
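
For readers unfamiliar with the getrusage unit difference, a minimal C sketch of such a guard might look like the following. The helper names (`parse_limit_mb`, `peak_rss_mb`, `el_mem_check_sketch`) and the diagnostic wording are hypothetical and are not taken from el_runtime.c. Note that `ru_maxrss` is the peak resident set size rather than the current one.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

#define DEFAULT_LIMIT_MB 512L

/* Hypothetical helper: parse a limit string, falling back to the
   default when it is missing, non-numeric, or not positive. */
long parse_limit_mb(const char *s) {
    if (s == NULL || *s == '\0') return DEFAULT_LIMIT_MB;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v <= 0) return DEFAULT_LIMIT_MB;
    return v;
}

/* Hypothetical helper: peak RSS of this process in MB, or -1 on failure. */
long peak_rss_mb(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
#ifdef __APPLE__
    return (long)(ru.ru_maxrss / (1024L * 1024L)); /* macOS: bytes */
#else
    return (long)(ru.ru_maxrss / 1024L);           /* Linux: kilobytes */
#endif
}

/* Sketch of the guard: print a diagnostic and exit(1) when over the limit.
   `where` labels the call site, e.g. "emit-header" or a function name. */
void el_mem_check_sketch(const char *where) {
    assert(where != NULL);
    long limit = parse_limit_mb(getenv("ELC_MAX_MEM_MB"));
    long rss = peak_rss_mb();
    if (rss >= 0 && rss > limit) {
        fprintf(stderr,
                "elc: memory limit exceeded at %s: RSS %ld MB > ELC_MAX_MEM_MB=%ld\n",
                where, rss, limit);
        exit(1);
    }
}
```

Calling the check at function boundaries, as the streaming loop does after each el_arena_pop, keeps the cost to one syscall per function.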

Verification

  • elc --emit-header checkout.el → all 3 function sigs in .elh (previously only 2 of 3, truncated by the mid-parse OOM)
  • ELC_MAX_MEM_MB=1 elc --emit-header checkout.el → exits 1 with clear diagnostic
  • Self-hosting check: elc compiled with the new elc; diff of outputs is identical
will.anderson added 3 commits 2026-05-08 13:23:27 +00:00
The --emit-header path previously called parse() which builds the entire
program AST in memory before writing the .elh file. For checkout.el (~491
lines with HTML template trees and deep BinOp string-concat chains), this
exhausted memory before the header could be written.

Fix: replace parse() + emit_header() with scan_fn_sigs_el() +
emit_header_from_sigs(). The new path tokenises the source once, then
walks the flat token list skipping over function bodies entirely — peak
memory is O(tokens) instead of O(whole-program AST).

New functions in parser.el:
- scan_type_el: reads a type annotation and returns its El source string
- scan_params_el: reads (name: Type, ...) and returns El params string
- scan_fn_sigs_el: token-level scan that collects El-style fn signatures
  without building any expression AST nodes

New function in compiler.el:
- emit_header_from_sigs: writes .elh from scan_fn_sigs_el output

Self-hosting check: elc compiled with new elc, diff of outputs is
identical (zero difference).

Smoke test: elc --emit-header checkout.el produces correct three-entry
.elh (previously truncated at two entries due to mid-parse OOM).
Add el_mem_check() to el_runtime.c: reads ELC_MAX_MEM_MB (default 512),
checks RSS via getrusage (macOS bytes / Linux KB normalised to MB), prints
a clear diagnostic to stderr and calls exit(1) if exceeded.

Wire it into two places:
- compiler.el: upfront check at --emit-header entry point
- codegen.el: per-function check in the streaming loop after each
  el_arena_pop, so runaway growth is caught at the earliest function
  boundary rather than after the machine is already dying.
build: update dist/platform/elc with OOM fix and memory guard
El SDK CI - dev / build-and-test (pull_request) Successful in 4m16s
f5dcca0386
Rebuilt from fix/elc-oom-checkout: scan_fn_sigs_el() --emit-header path
+ el_mem_check() guard. Verified on checkout.el: all 3 sigs in .elh,
clean exit under normal load, exit(1) on memory limit exceeded.
will.anderson merged commit cff7ce072d into dev 2026-05-08 16:16:04 +00:00
Reference: neuron-technologies/el#47