0.0/0.0 can produce a NaN with the sign bit set on Linux/x86_64, which %g
prints as '-nan', causing the no-cycle calendar test to fail. Explicitly
check isnan() and emit 'nan' so the output is platform-independent.
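
The check described above can be sketched roughly as follows. This is a minimal illustration, not the runtime's actual code: format_float is a hypothetical helper name, and the real fix may sit in a different function with a different signature.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: format a double with %g, but normalize every NaN
 * (positive or negative sign bit) to the literal string "nan". */
static int format_float(char *buf, size_t size, double v) {
    if (isnan(v)) {
        return snprintf(buf, size, "nan");
    }
    return snprintf(buf, size, "%g", v);
}
```

Checking isnan() before formatting avoids depending on how a given libc renders the NaN sign bit.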
elc-bootstrap.c is stale and produces a broken gen2 that can't correctly
compile current El source (generates C without main() for user programs).
The committed elc-linux-amd64 binary is the current, correct seed for
linux CI. This removes the gen2-from-C step entirely.
compiler.el imports lexer.el/parser.el/codegen.el with bare names; those
resolve relative to the source file's directory only when the entry point
is elc-cli.el (which imports el-compiler/src/compiler.el by full path).
Compiling compiler.el directly leaves lex/parse/codegen as undefined refs.
- ci-dev.yaml: push to dev only (remove stale PR trigger)
- ci-stage.yaml: PR from dev validates, push to stage publishes to foundation-stage;
  add -lm/-Wl,--allow-multiple-definition flags and all 9 native --test suites
- sdk-release.yaml: add PR to main trigger for validation, gate publish/release/dispatch
  on push (post-merge) only; add -lm flags and all 9 native --test suites to main as well
Implement a compile_test() entry point that emits a C test harness instead
of a normal program. Test blocks (previously skipped) now compile to
static functions with per-assertion pass/fail tracking. An assert statement
is added to the parser and codegen. Runtime extended with now_ns, fs_list_json,
json_build_object, json_build_array, json_escape_string, state_has,
state_get_or. Fix float negation codegen, float equality comparisons,
time_to_parts return type (JSON string), time_format empty-fmt, json_set
raw-value semantics, state_keys JSON array return. All 310 native tests
pass across 9 suites (core, text, string, math, env, state, json, time, fs).
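
For illustration only, a harness of the kind described above (one static function per test block, with per-assertion pass/fail counters) might look roughly like this. Every name here (check, asserts_passed, asserts_failed, run_all_tests) is hypothetical, and the C that compile_test() actually emits may differ.

```c
#include <stdio.h>

/* Hypothetical per-assertion counters. */
static int asserts_passed = 0;
static int asserts_failed = 0;

/* Record one assertion result and report failures with their test name. */
static void check(int cond, const char *test_name, int line) {
    if (cond) {
        asserts_passed++;
    } else {
        asserts_failed++;
        fprintf(stderr, "FAIL %s (line %d)\n", test_name, line);
    }
}

/* Each test block becomes one static function. */
static void test_addition(void) {
    check(1 + 2 == 3, "addition", 1);
}

static void test_float_negation(void) {
    double x = 2.5;
    check(-x == -2.5, "float negation", 2);
}

/* Runs every test function; returns 0 only if no assertion failed. */
static int run_all_tests(void) {
    test_addition();
    test_float_negation();
    printf("%d passed, %d failed\n", asserts_passed, asserts_failed);
    return asserts_failed == 0 ? 0 : 1;
}
```

Counting per assertion rather than per test lets a single run report every failing assertion instead of stopping at the first one.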
el_strdup tracks pointers in the arena. The BFS arrays in
engram_neighbors_json are freed manually, so allocating them with
el_strdup caused a double-free when the arena was later popped. Changed
those allocations to plain strdup.
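
The ownership rule behind this fix can be sketched with a toy arena. This is a simplified model under stated assumptions, not the el_runtime implementation: tracked_dup stands in for an arena-tracked el_strdup, plain_dup stands in for plain strdup, and all names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define ARENA_CAP 1024

static char *arena_ptrs[ARENA_CAP];
static size_t arena_len = 0;

/* Untracked copy: the caller owns it and must free() it. */
static char *plain_dup(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

/* Tracked copy: the arena owns it and frees it on pop.
 * (If the toy arena is full the copy is simply left untracked.) */
static char *tracked_dup(const char *s) {
    char *copy = plain_dup(s);
    if (copy != NULL && arena_len < ARENA_CAP) {
        arena_ptrs[arena_len++] = copy;
    }
    return copy;
}

/* Free every pointer tracked since `mark`; returns how many were freed. */
static size_t arena_pop_to(size_t mark) {
    size_t freed = 0;
    while (arena_len > mark) {
        free(arena_ptrs[--arena_len]);
        freed++;
    }
    return freed;
}

/* Scratch data that is freed by hand must come from plain_dup. */
static size_t demo(void) {
    size_t mark = arena_len;
    char *scratch = plain_dup("bfs-queue-entry"); /* owned by this function */
    char *result = tracked_dup("[]");             /* owned by the arena */
    (void)result;
    free(scratch);             /* safe: the arena never saw this pointer */
    return arena_pop_to(mark); /* frees only `result` */
}
```

Had scratch come from tracked_dup, the explicit free() followed by arena_pop_to() would release the same pointer twice.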
engram/dist/engram.c rebuilt from engram/src/server.el with current
elc (minor codegen diff: parenthesisation and _argc/_argv rename).
test "name" { ... } blocks were not recognized by the self-hosted
compiler. The body { } was parsed as a Map literal, creating a huge
AST with O(n²) string concatenation in the toplevel_exec_stmts loop
(which had no arena scope). A 272-line test file would consume 400MB+
and a 720-line file importing the full compiler source caused 150GB
usage and crashed the machine.
Two fixes:
1. Skip Test tokens in codegen_streaming before parse_one():
   advance past "name" and skip_to_rbrace on the body block.
   Test blocks are never compiled; the self-hosted compiler has no test runner.
2. Add per-statement arena scope to the toplevel_exec_stmts emission loop,
   matching the el_main_body loop. Frees intermediate strings after
   each statement to prevent O(n²) accumulation from any unrecognized
   construct that reaches that path.
Result: test_string.el (272 lines, 27 test blocks): 0MB peak (was 400MB+).
test_compiler.el (720 lines + 8728 imported): 15MB peak (was 150GB).
Wrapped compile() body in el_arena_push/pop so the arena is active
before lex() and scan_fn_sigs(). Previously both ran with
_tl_arena_active=0, leaking all token and signature strings permanently.
Also prevents inner pop(mark=0) calls from deactivating the arena
between per-function scopes. Verified: self-host PASS, RSS stable.
- Flat token list: lexer emits [kind0, val0, kind1, val1, ...] instead of [{kind,val}, ...]
  Eliminates per-token ElMap allocation (~112B × N tokens)
- str_char_code hot loop: char classification via Int codes, no strdup per char
- Batch c_escape: str_slice clean runs instead of char-at per byte
- Parser updated to use tok_at/tok_kind/tok_value stride-2 accessors
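
As a rough C analogue of the stride-2 layout above: kinds sit at even indices and values at odd indices of one flat array. The TokenList struct and tok_count are hypothetical; the real tokens are El values, and tok_at is omitted because its exact semantics are not stated here.

```c
#include <stddef.h>

/* Flat layout: [kind0, val0, kind1, val1, ...] in a single array. */
typedef struct {
    const char **items; /* alternating kind/value strings */
    size_t len;         /* number of slots, always even */
} TokenList;

/* Number of tokens is half the number of slots. */
static size_t tok_count(const TokenList *t) {
    return t->len / 2;
}

/* Kind of token i lives at slot 2*i. */
static const char *tok_kind(const TokenList *t, size_t i) {
    return t->items[2 * i];
}

/* Value of token i lives at slot 2*i + 1. */
static const char *tok_value(const TokenList *t, size_t i) {
    return t->items[2 * i + 1];
}
```

The accessors keep the parser readable while the storage stays one array, with no per-token map object.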
Adds EL_TRUE/EL_FALSE convenience macros to el_runtime.h alongside the
existing EL_NULL, making boolean-returning builtins readable without
raw (el_val_t) casts. Documents all value macros in the header comment.
Also lands el_arena_push/el_arena_pop — a scoped string arena for CLI
programs that never call el_request_start/end. The compiler can push a
mark before a compilation unit and pop it after to free intermediate
strings, reducing peak RSS during long compile runs.
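
The push/pop pattern described above can be modelled in a few lines. This is a toy sketch, not the el_runtime.h API: arena_push, arena_pop, arena_str, and compile_unit are hypothetical stand-ins, and the real el_arena_push/el_arena_pop signatures may differ.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TRACKED 4096

static char *tracked[MAX_TRACKED];
static size_t live = 0; /* strings currently owned by the arena */
static size_t peak = 0; /* high-water mark of `live` */

/* Allocate a string owned by the arena. */
static char *arena_str(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy;
    if (live == MAX_TRACKED) return NULL;
    copy = malloc(n);
    if (copy == NULL) return NULL;
    memcpy(copy, s, n);
    tracked[live++] = copy;
    if (live > peak) peak = live;
    return copy;
}

/* Push returns a mark; pop frees everything allocated after that mark. */
static size_t arena_push(void) { return live; }

static void arena_pop(size_t mark) {
    while (live > mark) free(tracked[--live]);
}

/* Stand-in for compiling one unit: creates some intermediate strings. */
static void compile_unit(void) {
    (void)arena_str("token text");
    (void)arena_str("signature");
    (void)arena_str("emitted C");
}

/* With one scope per unit, peak usage does not grow with the unit count. */
static size_t compile_all(int units) {
    int i;
    peak = 0;
    for (i = 0; i < units; i++) {
        size_t mark = arena_push();
        compile_unit();
        arena_pop(mark);
    }
    return peak;
}
```

In this model, compiling 100 units keeps at most three strings alive at once, because each scope releases its intermediates before the next unit starts.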
Combines two orthogonal optimizations:
1. c_escape batching (from alpha): ASCII runs emitted as str_slice segments instead
   of one str_char_at string per byte. O(N) allocs → O(K) where K = special chars.
2. scan_interp_string batching (from beta): char dispatch via str_char_code (Int)
   + clean_start tracking to flush plain runs as str_slice. Eliminates per-char
   string allocations in the string-literal scanning hot path.
Result on web/src/main.el: 14.5MB -> 13.4MB peak RSS (-7.6%).
Self-hosting: PASS.
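
The run-batching idea behind c_escape can be sketched in C as below. This is an illustration under assumptions, not the El source: c_escape_batched is a hypothetical name, the set of escaped characters is a guess, and the segments counter exists only to show how few copies are made.

```c
#include <stddef.h>
#include <string.h>

/* Escape src into out as C string-literal content. Clean runs are copied
 * with one memcpy each instead of byte by byte. Returns the number of bytes
 * written (excluding the NUL), or (size_t)-1 if out is too small.
 * *segments receives the number of clean runs that were flushed. */
static size_t c_escape_batched(const char *src, char *out, size_t cap,
                               size_t *segments) {
    size_t n = strlen(src), w = 0, clean_start = 0, segs = 0, i;
    for (i = 0; i <= n; i++) {
        const char *esc = NULL;
        size_t run;
        if (i < n) {
            switch (src[i]) {
                case '"':  esc = "\\\""; break;
                case '\\': esc = "\\\\"; break;
                case '\n': esc = "\\n";  break;
                case '\t': esc = "\\t";  break;
                default: continue; /* still inside a clean run */
            }
        }
        /* Flush the clean run [clean_start, i) in a single copy. */
        run = i - clean_start;
        if (run > 0) {
            if (w + run >= cap) return (size_t)-1;
            memcpy(out + w, src + clean_start, run);
            w += run;
            segs++;
        }
        if (esc != NULL) {
            size_t el = strlen(esc);
            if (w + el >= cap) return (size_t)-1;
            memcpy(out + w, esc, el);
            w += el;
        }
        clean_start = i + 1;
    }
    if (w >= cap) return (size_t)-1;
    out[w] = '\0';
    if (segments != NULL) *segments = segs;
    return w;
}
```

The number of copies scales with the count of special characters rather than with the input length, which is the O(N) to O(K) change noted above.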
Combines two orthogonal optimizations:
1. Flat token list (from beta): lex() returns [Any] with alternating kind/value
   pairs instead of [Map], eliminating one ElMap per token (~3 mallocs each).
   Parser updated: tok_kind(t,i) = t[2*i], tok_value(t,i) = t[2*i+1].
2. Char code dispatch (from alpha): lex() hot loop uses str_char_code -> Int
   instead of str_char_at -> strdup String for all character classification.
   Eliminates ~400K x 16B = 6.4MB of temporary string allocations.
   scan_digits and scan_ident also updated to use str_char_code.
Result on main.el: 17.1MB -> 14.4MB peak RSS (-16%).
Self-hosting: PASS.
Replace str_char_at (returns strdup String) with str_char_code (returns Int)
in the main lex() while loop and scan_digits/scan_ident helpers.
For a 400KB combined source, str_char_at was allocating ~400K x 16B = 6.4MB
of transient 2-byte strings for the ch variable alone. str_char_code returns
an integer directly — zero allocation.
Add Int-based helpers: is_digit_code, is_alpha_code, is_ws_code,
is_alnum_or_underscore_code. Rewrite lex() operator dispatch using char
code constants (e.g. '/'=47, '"'=34, '='=61).
Result on main.el: 17.1MB -> 15.4MB peak RSS (-10%).
Self-hosting: PASS.
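
A C rendering of the Int-based helpers named above might look like the following. The helper names come from this commit, but their exact definitions are assumptions (in particular which codes count as whitespace); char_code_at and scan_ident_end are hypothetical stand-ins for str_char_code and scan_ident.

```c
#include <stddef.h>

/* Classification on integer char codes: no string is allocated per char. */
static int is_digit_code(int c) { return c >= 48 && c <= 57; } /* '0'..'9' */

static int is_alpha_code(int c) {
    return (c >= 65 && c <= 90) || (c >= 97 && c <= 122); /* A-Z, a-z */
}

static int is_ws_code(int c) {
    return c == 32 || c == 9 || c == 10 || c == 13; /* space, \t, \n, \r */
}

static int is_alnum_or_underscore_code(int c) {
    return is_alpha_code(c) || is_digit_code(c) || c == 95; /* '_' = 95 */
}

/* Stand-in for str_char_code: read the byte at index i as an int. */
static int char_code_at(const char *s, size_t i) {
    return (unsigned char)s[i];
}

/* Return the index just past the identifier that starts at i. */
static size_t scan_ident_end(const char *s, size_t len, size_t i) {
    while (i < len && is_alnum_or_underscore_code(char_code_at(s, i))) {
        i++;
    }
    return i;
}
```

Comparing integers against constants such as 47 for '/' keeps the hot loop free of temporary strings.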
Brings the remaining foundation repos that were not included in the
original monorepo consolidation:
- arbor/vessels/: 6 vessels (arbor-cli, arbor-core, arbor-diagram,
  arbor-layout, arbor-parse, arbor-render) with manifests + src/main.el
- dharma/: CGI Provenance Registry package (flat layout, 14 .el files
  across registry/, sandbox/, training/, validation/, tests/)
- forge/: consciousness channel tool (8 src .el files + new manifest.el)
- elp/src/: 36 test fixture files not carried over in original merge
  (dedup_*, realizer_*, semantics_*, morph_*, ext_*, one_extern_* helpers)
el-ide, engram, elql are already complete in ide/, engram/, ql/.
el-html is a standalone atomic emit layer; it has no runtime dependency on
el-style or el-layout (those vessels depend on el-html for SSR, not the other
way around).
- add -lm (el_runtime.c uses pow/sqrt/log/sin/cos/exp)
- add -Wl,--allow-multiple-definition to gen2 (is_digit/is_whitespace
  defined in both elc-bootstrap.c and el_runtime.c; bootstrap predates
  the text-processing primitives commit)
- remove colon from Self-host step name (Gitea YAML parser rejects it)
- replace em dashes in step names with hyphens
- lang/tools/install/el-install.el: El program that fetches the latest
  release from the Gitea API, downloads el-sdk-latest.tar.gz, and
  extracts it into ~/.el (or a custom prefix passed as argv[1])
- lang/tools/install/manifest.el: build manifest for the el-install package
- .gitea/workflows/sdk-release.yaml: build elb, epm, and el-install
  binaries; bundle elc + elb + epm + runtime files into el-sdk-latest.tar.gz;
  attach both the tarball and el-install binary to the Gitea release
  alongside the existing per-file GCP uploads