parse_html_children consumed the closing `}` of the outer El function as
HTML text content when a tag was left open across a function boundary
(e.g. `page_open()` opens `<body>` without a closing `</body>`). Fix:
stop the children loop when the current token is RBrace — that token
belongs to the El function, not the HTML tree.
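The stop condition can be sketched as follows. This is a simplified illustration, not the parser's code: `sketch_children_end` and the `CloseTag` token kind are hypothetical, and a real children loop would hand `{expr}` interpolations to a child parser before ever seeing their braces here.

```c
#include <string.h>

/* Hypothetical sketch: returns the index at which an HTML children loop
 * starting at token `i` should stop. `kinds` holds token kind names. */
int sketch_children_end(const char **kinds, int n, int i) {
    while (i < n) {
        if (strcmp(kinds[i], "CloseTag") == 0) break; /* normal end of the element */
        if (strcmp(kinds[i], "RBrace") == 0) break;   /* belongs to the enclosing El function */
        i++;                                          /* otherwise: part of the HTML children */
    }
    return i;
}
```

With the RBrace check in place, an unclosed tag ends at the function boundary and the caller still sees the `}` it expects.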
Add html_raw() and html_escape() builtins to el_runtime so templates
can interpolate trusted raw HTML and safely escape user-supplied content.
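A minimal sketch of what an escaping builtin of this kind might do, assuming it replaces the five HTML-significant characters. The name `sketch_html_escape` and the malloc'd-return convention are illustrative assumptions, not the runtime's actual signature.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'd copy of `s` with
 * & < > " ' replaced by entities. The caller frees the result. */
char *sketch_html_escape(const char *s) {
    size_t n = 0;
    for (const char *p = s; *p; p++) {      /* first pass: size the output */
        switch (*p) {
        case '&':  n += 5; break;           /* &amp;  */
        case '<':
        case '>':  n += 4; break;           /* &lt; or &gt; */
        case '"':  n += 6; break;           /* &quot; */
        case '\'': n += 5; break;           /* &#39;  */
        default:   n += 1; break;
        }
    }
    char *out = malloc(n + 1);
    if (!out) return NULL;
    char *w = out;
    for (const char *p = s; *p; p++) {      /* second pass: write the output */
        const char *rep = NULL;
        switch (*p) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        default:   break;
        }
        if (rep) {
            size_t k = strlen(rep);
            memcpy(w, rep, k);
            w += k;
        } else {
            *w++ = *p;
        }
    }
    *w = '\0';
    return out;
}
```

A raw-HTML counterpart would simply pass its argument through unchanged, which is why it should only ever receive trusted markup.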
Rename elc-new.c → elc.c as the canonical compiler source; rebuild
elc binary from it.
Add el_mem_check() to el_runtime.c: reads ELC_MAX_MEM_MB (default 512),
checks RSS via getrusage (macOS bytes / Linux KB normalised to MB), prints
a clear diagnostic to stderr and calls exit(1) if the limit is exceeded.
Wire it into two places:
- compiler.el: upfront check at --emit-header entry point
- codegen.el: per-function check in the streaming loop after each
el_arena_pop, so runaway growth is caught at the earliest function
boundary rather than after the machine is already dying.
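A rough sketch of such a check, assuming POSIX getrusage. The `sketch_` names and the wording of the diagnostic are hypothetical; only ELC_MAX_MEM_MB, the default of 512, and the macOS-bytes / Linux-KB distinction come from the description above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Normalise ru_maxrss (peak resident set size) to MB:
 * macOS reports bytes, Linux reports kilobytes. */
long sketch_rss_mb(long ru_maxrss) {
#ifdef __APPLE__
    return ru_maxrss / (1024L * 1024L);
#else
    return ru_maxrss / 1024L;
#endif
}

/* Read the limit from ELC_MAX_MEM_MB; fall back to 512 when the
 * variable is unset, empty, or not a positive number. */
long sketch_mem_limit_mb(void) {
    const char *v = getenv("ELC_MAX_MEM_MB");
    if (v && *v) {
        long n = strtol(v, NULL, 10);
        if (n > 0) return n;
    }
    return 512;
}

/* Hypothetical stand-in for the runtime check: `where` names the
 * call site so the diagnostic says which boundary tripped. */
void sketch_mem_check(const char *where) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return; /* cannot measure: do nothing */
    long used = sketch_rss_mb(ru.ru_maxrss);
    long limit = sketch_mem_limit_mb();
    if (used > limit) {
        fprintf(stderr, "elc: memory limit exceeded at %s: %ld MB > %ld MB "
                        "(set ELC_MAX_MEM_MB to raise)\n", where, used, limit);
        exit(1);
    }
}
```

Calling a check like this after each arena pop keeps its cost to one syscall per function while bounding how far memory can grow between checks.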
The --emit-header path previously called parse() which builds the entire
program AST in memory before writing the .elh file. For checkout.el (~491
lines with HTML template trees and deep BinOp string-concat chains), this
exhausted memory before the header could be written.
Fix: replace parse() + emit_header() with scan_fn_sigs_el() +
emit_header_from_sigs(). The new path tokenises the source once, then
walks the flat token list skipping over function bodies entirely — peak
memory is O(tokens) instead of O(whole-program AST).
New functions in parser.el:
- scan_type_el: reads a type annotation and returns its El source string
- scan_params_el: reads (name: Type, ...) and returns El params string
- scan_fn_sigs_el: token-level scan that collects El-style fn signatures
without building any expression AST nodes
New function in compiler.el:
- emit_header_from_sigs: writes .elh from scan_fn_sigs_el output
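The body-skipping idea can be sketched in C as below. This is an illustration under assumptions, not a port of scan_fn_sigs_el: the `Fn` token kind and the `sketch_` function names are hypothetical, and the sketch records only the position of each function name rather than a full signature string.

```c
#include <string.h>

/* Given the index of an LBrace, return the index just past its matching
 * RBrace. No AST nodes are built; only a depth counter is kept. */
int sketch_skip_body(const char **kinds, int n, int i) {
    int depth = 0;
    while (i < n) {
        if (strcmp(kinds[i], "LBrace") == 0) {
            depth++;
        } else if (strcmp(kinds[i], "RBrace") == 0) {
            depth--;
            if (depth == 0) return i + 1;
        }
        i++;
    }
    return n; /* unbalanced input: stop at end of tokens */
}

/* Walk the token kinds once, storing the token index of each top-level
 * function name in name_idx (at most `max` entries). Returns the count. */
int sketch_scan_fn_sigs(const char **kinds, int n, int *name_idx, int max) {
    int count = 0;
    int i = 0;
    while (i < n) {
        if (strcmp(kinds[i], "Fn") == 0 && i + 1 < n) {
            if (count < max) name_idx[count++] = i + 1;
            int j = i + 2;                       /* scan past params and return type */
            while (j < n && strcmp(kinds[j], "LBrace") != 0) j++;
            i = sketch_skip_body(kinds, n, j);   /* jump over the whole body */
        } else {
            i++;
        }
    }
    return count;
}
```

Because the body is skipped with a counter instead of being parsed, memory stays proportional to the token list regardless of how deep the expressions inside the body are.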
Self-hosting check: elc rebuilt with the new elc produces identical
output (empty diff).
Smoke test: elc --emit-header checkout.el produces the correct three-entry
.elh (previously truncated at two entries due to mid-parse OOM).
The El lexer silently skips '#', so {#each} lexes as LBrace Ident:"each"
and {#if} lexes as LBrace If ... (using the If keyword token, not Hash).
The existing {#each} check used k2=="Hash" which was dead code.
Parser changes (parser.el):
- Add parse_raw_text_content(): collects all tokens as raw text until
</tag_name>, bypassing El expression parsing. Used for <style> and
<script> elements so CSS/JS content isn't parsed as El expressions.
- parse_html_element(): use raw-text mode for <style> and <script> tags.
- parse_html_children(): fix {#each} detection (k2=="Ident", k3=="each"
instead of dead k2=="Hash" check). Add {#if cond}...{#else}...{/if}
support generating HtmlIf AST nodes.
Codegen changes (codegen.el):
- Add cg_html_if(): generates if (cond_c) { then_c } else { else_c }
for HtmlIf nodes.
- cg_html_parts(): dispatch HtmlIf to cg_html_if.
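The corrected detection can be sketched as follows, assuming parallel arrays of token kinds and values. The enum and function names are hypothetical; the token shapes (LBrace followed by Ident "each", or LBrace followed by the If keyword token) are the ones described above.

```c
#include <string.h>

enum sketch_directive { SKETCH_NONE, SKETCH_EACH, SKETCH_IF };

/* Classify the template directive starting at token `i`, if any.
 * Because the lexer drops '#', no Hash token ever appears here. */
enum sketch_directive sketch_classify(const char **kinds, const char **vals,
                                      int n, int i) {
    if (i + 1 >= n || strcmp(kinds[i], "LBrace") != 0) return SKETCH_NONE;
    if (strcmp(kinds[i + 1], "Ident") == 0 && strcmp(vals[i + 1], "each") == 0)
        return SKETCH_EACH;                 /* {#each ...} */
    if (strcmp(kinds[i + 1], "If") == 0)
        return SKETCH_IF;                   /* {#if ...} uses the keyword token */
    return SKETCH_NONE;                     /* ordinary {expr} interpolation */
}
```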
el-install.el generates calls to __http_do_map_to_file (HTTP request
with JSON headers map, streaming response to file). Add it to both
the HAVE_CURL implementation and the no-curl stub section.
The El compiler generates calls to println, print, exit_program,
http_set_handler, http_serve, http_set_handler_v2, and http_serve_v2
as el_val_t-returning functions. The runtime declared them void,
causing conflicting-type errors when el-install.c was compiled.
Change all seven to return el_val_t (side-effect functions return 0).
Also update el_runtime.h declarations to match.
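The shape of the change, sketched with a stand-in type. `sketch_val_t` is an assumption standing in for el_val_t, whose real definition is not shown here, and `sketch_println` is a hypothetical name.

```c
#include <stdint.h>
#include <stdio.h>

typedef int64_t sketch_val_t; /* assumption: stand-in for el_val_t */

/* Before: `void sketch_println(const char *s);`
 * Generated code that uses the call as a value-returning function then
 * conflicts with that declaration. After: return a value, with 0 meaning
 * "no meaningful result" for side-effect-only builtins. */
sketch_val_t sketch_println(const char *s) {
    puts(s);
    return 0;
}
```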
Add native_str_to_int (El compiler alias for str_to_int) and
http_post_json_with_headers (JSON POST with additional headers map).
epm.el generates calls to both, but they were absent from el_runtime.c.
0.0/0.0 can produce -nan on Linux/x86_64 (%g gives '-nan'),
causing the no-cycle calendar test to fail. Explicitly check isnan()
and emit 'nan' so behavior is platform-independent.
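A minimal sketch of the check, assuming the float formatter otherwise uses %g. The function name and buffer-based interface are hypothetical.

```c
#include <math.h>
#include <stdio.h>

/* Format `v` into `buf`. NaN is written as "nan" regardless of its sign
 * bit, so the text does not depend on the platform's printf. */
void sketch_format_float(double v, char *buf, size_t cap) {
    if (isnan(v)) {
        snprintf(buf, cap, "nan");
    } else {
        snprintf(buf, cap, "%g", v);
    }
}
```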
Implement compile_test() entry point that emits a C test harness instead
of a normal program. Test blocks (previously skipped) now compile to
static functions with per-assertion pass/fail tracking. Assert statement
added to parser and codegen. Runtime extended with now_ns, fs_list_json,
json_build_object, json_build_array, json_escape_string, state_has,
state_get_or. Fix float negation codegen, float equality comparisons,
time_to_parts return type (JSON string), time_format empty-fmt, json_set
raw-value semantics, state_keys JSON array return. All 310 native tests
pass across 9 suites (core, text, string, math, env, state, json, time, fs).
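Per-assertion tracking in a generated harness might look roughly like this. All names and the failure message format are hypothetical; the sketch only shows the counting idea, not the code that compile_test() emits.

```c
#include <stdio.h>

static int sketch_pass_count = 0;
static int sketch_fail_count = 0;

/* Record one assertion result; failures are reported but do not abort,
 * so the remaining assertions in the test still run. */
void sketch_assert(int cond, const char *test_name, const char *expr_text) {
    if (cond) {
        sketch_pass_count++;
    } else {
        sketch_fail_count++;
        fprintf(stderr, "FAIL %s: %s\n", test_name, expr_text);
    }
}

int sketch_passed(void) { return sketch_pass_count; }
int sketch_failed(void) { return sketch_fail_count; }
```

A harness built this way can print a summary and derive its exit status from the failure count after all test functions have run.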
el_strdup tracks pointers in the arena. The BFS arrays in
engram_neighbors_json are manually freed — using el_strdup caused a
double-free when the arena was later popped. Changed to plain strdup
for those allocations.
engram/dist/engram.c rebuilt from engram/src/server.el with current
elc (minor codegen diff: parenthesisation and _argc/_argv rename).
test "name" { ... } blocks were not recognized by the self-hosted
compiler. The body { } was parsed as a Map literal, creating a huge
AST with O(n²) string concatenation in the toplevel_exec_stmts loop
(which had no arena scope). A 272-line test file consumed 400MB+,
and a 720-line file importing the full compiler source used 150GB
and crashed the machine.
Two fixes:
1. Skip Test tokens in codegen_streaming before parse_one() —
advance past "name" and skip_to_rbrace on the body block.
Test blocks are never compiled; the self-hosted compiler has no test runner.
2. Add per-statement arena scope to toplevel_exec_stmts emission loop,
matching the el_main_body loop. Frees intermediate strings after
each statement to prevent O(n²) accumulation from any unrecognized
construct that reaches that path.
Result: test_string.el (272 lines, 27 test blocks): 0MB peak (was 400MB+).
test_compiler.el (720 lines + 8728 imported): 15MB peak (was 150GB).
Wrapped compile() body in el_arena_push/pop so the arena is active
before lex() and scan_fn_sigs(). Previously both ran with
_tl_arena_active=0, leaking all token and signature strings permanently.
Also prevents inner pop(mark=0) calls from deactivating the arena
between per-function scopes. Verified: self-host PASS, RSS stable.
- Flat token list: lexer emits [kind0, val0, kind1, val1, ...] instead of [{kind,val}, ...]
Eliminates per-token ElMap allocation (~112B × N tokens)
- str_char_code hot loop: char classification via Int codes, no strdup per char
- Batch c_escape: str_slice clean runs instead of char-at per byte
- Parser updated to use tok_at/tok_kind/tok_value stride-2 accessors
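The stride-2 layout can be sketched in C as below. The `sketch_` accessors mirror the tok_kind/tok_value idea but are illustrative; the real accessors are El functions operating on an El list.

```c
#include <stddef.h>

/* Flat layout: element 2*i is the kind of token i, element 2*i+1 its value.
 * One array holds everything, so no per-token map object is allocated. */
const char *sketch_tok_kind(const char **flat, size_t i)  { return flat[2 * i]; }
const char *sketch_tok_value(const char **flat, size_t i) { return flat[2 * i + 1]; }

/* Number of tokens in a flat list of `flat_len` elements. */
size_t sketch_tok_count(size_t flat_len) { return flat_len / 2; }
```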
Adds EL_TRUE/EL_FALSE convenience macros to el_runtime.h alongside the
existing EL_NULL, making boolean-returning builtins readable without
raw (el_val_t) casts. Documents all value macros in the header comment.
Also lands el_arena_push/el_arena_pop — a scoped string arena for CLI
programs that never call el_request_start/end. The compiler can push a
mark before a compilation unit and pop it after to free intermediate
strings, reducing peak RSS during long compile runs.
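A mark-based string arena of this kind can be sketched as follows. This is a simplified illustration with a fixed-size pointer table; the `sketch_` names, the capacity, and the overflow behaviour are assumptions, not the runtime's implementation.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_ARENA_CAP 1024

static char *sketch_ptrs[SKETCH_ARENA_CAP]; /* every tracked allocation */
static size_t sketch_top = 0;               /* number of live tracked pointers */

/* Push: remember how many pointers were live when the scope began. */
size_t sketch_arena_push(void) { return sketch_top; }

/* Duplicate a string and track it. If the table is full, the string is
 * still returned but is no longer tracked (a limitation of this sketch). */
char *sketch_arena_strdup(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (!p) return NULL;
    memcpy(p, s, n);
    if (sketch_top < SKETCH_ARENA_CAP) sketch_ptrs[sketch_top++] = p;
    return p;
}

/* Pop: free everything allocated since the matching push. */
void sketch_arena_pop(size_t mark) {
    while (sketch_top > mark) free(sketch_ptrs[--sketch_top]);
}

size_t sketch_arena_live(void) { return sketch_top; }
```

Since pop frees every tracked pointer, code that also frees such a pointer by hand would free it twice; allocations with a manual lifetime need a plain, untracked allocator.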
Combines two orthogonal optimizations:
1. c_escape batching (from alpha): ASCII runs emitted as str_slice segments instead
of one str_char_at string per byte. O(N) allocs → O(K) where K = special chars.
2. scan_interp_string batching (from beta): char dispatch via str_char_code (Int)
+ clean_start tracking to flush plain runs as str_slice. Eliminates per-char
string allocations in the string-literal scanning hot path.
Result on web/src/main.el: 14.5MB -> 13.4MB peak RSS (-7.6%).
Self-hosting: PASS.
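The clean-run batching can be sketched in C as below. The set of escaped characters (quote, backslash, newline, tab) and the `segments` counter are assumptions made for illustration; the real c_escape is written in El and may escape more.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd C-escaped copy of `s`. Runs of ordinary characters
 * are copied with a single memcpy each; *segments (if non-NULL) receives
 * the number of such runs, i.e. the number of batch copies performed. */
char *sketch_c_escape(const char *s, int *segments) {
    size_t len = strlen(s);
    char *out = malloc(2 * len + 1);   /* worst case: every byte doubles */
    if (!out) return NULL;
    size_t w = 0, clean_start = 0;
    int segs = 0;
    for (size_t i = 0; i <= len; i++) {
        char c = s[i];                 /* s[len] is the terminating NUL */
        int special = (c == '"' || c == '\\' || c == '\n' || c == '\t');
        if (special || i == len) {
            if (i > clean_start) {     /* flush the pending clean run */
                memcpy(out + w, s + clean_start, i - clean_start);
                w += i - clean_start;
                segs++;
            }
            if (special) {
                out[w++] = '\\';
                out[w++] = (c == '\n') ? 'n' : (c == '\t') ? 't' : c;
            }
            clean_start = i + 1;
        }
    }
    out[w] = '\0';
    if (segments) *segments = segs;
    return out;
}
```

The number of copies tracks the number of special characters rather than the input length, which is the O(N) to O(K) reduction described above.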
Combines two orthogonal optimizations:
1. Flat token list (from beta): lex() returns [Any] with alternating kind/value
pairs instead of [Map], eliminating one ElMap per token (~3 mallocs each).
Parser updated: tok_kind(t,i) = t[2*i], tok_value(t,i) = t[2*i+1].
2. Char code dispatch (from alpha): lex() hot loop uses str_char_code -> Int
instead of str_char_at -> strdup String for all character classification.
Eliminates ~400K x 16B = 6.4MB of temporary string allocations.
scan_digits and scan_ident also updated to use str_char_code.
Result on main.el: 17.1MB -> 14.4MB peak RSS (-16%).
Self-hosting: PASS.
Replace str_char_at (returns strdup String) with str_char_code (returns Int)
in the main lex() while loop and scan_digits/scan_ident helpers.
For a 400KB combined source, str_char_at was allocating ~400K x 16B = 6.4MB
of transient 2-byte strings for the ch variable alone. str_char_code returns
an integer directly — zero allocation.
Add Int-based helpers: is_digit_code, is_alpha_code, is_ws_code,
is_alnum_or_underscore_code. Rewrite lex() operator dispatch using char
code constants (e.g. '/'=47, '"'=34, '='=61).
Result on main.el: 17.1MB -> 15.4MB peak RSS (-10%).
Self-hosting: PASS.
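Integer-based classification of this kind can be sketched in C as follows. The helpers mirror the names listed above with a `sketch_` prefix; the exact whitespace set and the byte-indexing semantics of `sketch_str_char_code` are assumptions.

```c
#include <stddef.h>

/* Return the byte at index i as an integer code: no allocation at all. */
int sketch_str_char_code(const char *s, size_t i) {
    return (unsigned char)s[i];
}

int sketch_is_digit_code(int c) { return c >= 48 && c <= 57; }      /* '0'..'9' */

int sketch_is_alpha_code(int c) {
    return (c >= 65 && c <= 90) || (c >= 97 && c <= 122);           /* A-Z, a-z */
}

int sketch_is_ws_code(int c) {
    return c == 32 || c == 9 || c == 10 || c == 13;                 /* space, \t, \n, \r */
}

int sketch_is_alnum_or_underscore_code(int c) {
    return sketch_is_alpha_code(c) || sketch_is_digit_code(c) || c == 95; /* '_' */
}
```

Operator dispatch then becomes a comparison against constants such as 47 for '/' or 61 for '=', with no temporary one-character string per input byte.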