Lexer gains scan_interp_string which replaces scan_string in the main
lex loop. When no ${ is found it behaves identically to before (single
Str token). When interpolations are present it emits a flat token
sequence — Str, Plus, (expr tokens), Plus, Str, … — that the existing
parse_binop / cg_expr BinOp-Plus-string path assembles into nested
el_str_concat calls with zero parser or codegen changes.
Key design choices:
- scan_interp_brace tracks { depth so fn(a, b) inside ${} is safe
- inner expr tokens are wrapped in ( ) so operators like + in ${n+1}
do not associate with the surrounding concat Plus tokens
- \$ escapes to a literal dollar sign; bare $ not before { passes through
- empty ${} emits an empty string segment
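The flattening described above can be sketched in Python (the language of bootstrap.py). This is a minimal illustration, not the lexer's actual code: the function body, the (kind, text) token shape, and the raw "Expr" token standing in for the re-lexed inner expression are all assumptions.

```python
def scan_interp_string(body):
    """Flatten the text between the quotes into (kind, text) tokens.

    Hypothetical sketch: the inner expression is kept as one raw "Expr"
    token; a real lexer would re-lex that text into expression tokens.
    """
    tokens = []
    buf = []                      # literal characters collected so far
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch == "\\" and i + 1 < n and body[i + 1] == "$":
            buf.append("$")       # \$ escapes to a literal dollar sign
            i += 2
        elif ch == "$" and i + 1 < n and body[i + 1] == "{":
            depth, j = 1, i + 2   # track { depth to find the matching }
            while j < n and depth > 0:
                if body[j] == "{":
                    depth += 1
                elif body[j] == "}":
                    depth -= 1
                j += 1
            if depth != 0:
                raise ValueError("unterminated ${ in string literal")
            expr = body[i + 2:j - 1]
            tokens.append(("Str", "".join(buf)))
            buf = []
            tokens.append(("Plus", "+"))
            if expr.strip():
                # Parentheses keep operators inside ${...} from associating
                # with the surrounding concat Plus tokens.
                tokens += [("LParen", "("), ("Expr", expr), ("RParen", ")")]
            else:
                tokens.append(("Str", ""))   # empty ${} -> empty segment
            tokens.append(("Plus", "+"))
            i = j
        else:
            buf.append(ch)        # includes a bare $ not followed by {
            i += 1
    tokens.append(("Str", "".join(buf)))
    return tokens
```

With no `${` in the input the result is a single Str token, which is the "behaves identically to before" case.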
Introduces Go-style channels as El's mid-flight communication primitive,
completing the threading model: threads can now not only spawn/join but
also communicate while running.
Part 1 — seed layer (el_runtime.c / el_runtime.h):
- Add __thread_create/__thread_join/__mutex_new/__mutex_lock/__mutex_unlock
as C seed primitives (dlsym-based thread dispatch, pthread mutex table)
- Add __channel_new/__channel_send/__channel_recv/__channel_try_recv/__channel_close
as MPMC channel seed primitives backed by mutex + condvar + circular buffer
- Bounded channels (cap > 0): circular buffer, sender blocks when full
- Unbounded channels (cap == 0): dynamic array, grows on demand, never blocks
- channel_close wakes all blocked receivers/senders; recv drains then returns ""
Part 2 — El API (runtime/channel.el):
- channel_new/send/recv/try_recv/close — thin wrappers over seed layer
- channel_pipeline — spawn N worker threads reading from in_ch, applying
fn_name, writing to out_ch; workers exit on "" sentinel from close
- channel_drain — collect all messages from a closed channel into [String]
- channel_fan_out — send a [String] list into a channel then close it
Part 3 — codegen.el:
- Register all 10 seed builtins (__thread_*, __mutex_*, and __channel_*) in
builtin_arity so the arity checker validates call sites at compile time
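The channel semantics from Part 1 and the channel_pipeline worker loop from Part 2 can be modelled in a few lines of Python. This is a sketch under assumptions, not the C seed layer: the Channel class and pipeline helper are hypothetical names, send returning False on a closed channel is a guess, and the caller is assumed to join the workers and close out_ch.

```python
import threading
from collections import deque


class Channel:
    """Hypothetical model: mutex + condvar + buffer, cap == 0 is unbounded."""

    def __init__(self, cap=0):
        self.cap = cap
        self.items = deque()
        self.closed = False
        self.cond = threading.Condition()

    def send(self, msg):
        with self.cond:
            # A bounded channel blocks the sender while the buffer is full.
            while self.cap > 0 and len(self.items) >= self.cap and not self.closed:
                self.cond.wait()
            if self.closed:
                return False          # assumption: send on closed is refused
            self.items.append(msg)
            self.cond.notify_all()
            return True

    def recv(self):
        with self.cond:
            while not self.items and not self.closed:
                self.cond.wait()
            if self.items:            # drain buffered messages first
                msg = self.items.popleft()
                self.cond.notify_all()
                return msg
            return ""                 # closed and empty: "" sentinel

    def close(self):
        with self.cond:
            self.closed = True
            self.cond.notify_all()    # wake every blocked sender/receiver


def pipeline(in_ch, out_ch, fn, workers):
    """Start workers that map fn over in_ch into out_ch until the sentinel."""
    def work():
        while True:
            msg = in_ch.recv()
            if msg == "":
                return
            out_ch.send(fn(msg))

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads
```

One consequence of using "" as the close sentinel is that an empty message cannot be distinguished from end-of-stream, so payloads in this model must be non-empty.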
Provides assert_true/false, assert_eq, assert_int_eq, assert_neq,
assert_contains, assert_starts_with, assert_ends_with, and fail.
Test cases are registered by name+fn_name, executed sequentially via
the thread/dlsym dispatch mechanism, with results tracked in state_.
Includes tests/runtime/string_test.el covering all 23 string.el exports.
Add cg_match_stmt() to lower match-as-statement to proper C if/else if/else
chains. Previously, match in statement position fell through to cg_expr() which
emitted a GCC statement-expression, which is fine for expression arms but wrong
for the statement form. cg_stmt() now detects match in its Expr handler, using
the same dispatch pattern as If and For.
Pattern dispatch mirrors cg_match (expression form):
LitStr -> str_eq(subj, EL_STR("..."))
LitInt -> subj == N
LitBool -> subj == 1 / 0
Binding -> else { el_val_t name = subj; body; }
Wildcard -> else { body; }
Subject is evaluated once into a scoped temporary to avoid double evaluation.
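The lowering can be illustrated with a small Python generator that prints the C chain. It is a sketch only: lower_match_stmt, the (kind, value, body) arm tuples, and the temporary name _m0 are hypothetical, and the real cg_match_stmt() lives in codegen.el.

```python
def c_string(text):
    """Quote text as a C string literal (backslashes and quotes only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lower_match_stmt(subject_c, arms, tmp="_m0"):
    """Emit a C if / else if / else chain for (kind, value, body_c) arms."""
    # The subject is stored once in a scoped temporary so it is not
    # evaluated again for every arm.
    lines = ["{", f"    el_val_t {tmp} = {subject_c};"]
    first = True
    for kind, value, body in arms:
        cond = None
        if kind == "LitStr":
            cond = f"str_eq({tmp}, EL_STR({c_string(value)}))"
        elif kind == "LitInt":
            cond = f"{tmp} == {int(value)}"
        elif kind == "LitBool":
            cond = f"{tmp} == {1 if value else 0}"
        elif kind == "Binding":
            body = f"el_val_t {value} = {tmp}; {body}"
        elif kind != "Wildcard":
            raise ValueError(f"unknown pattern kind: {kind}")
        if cond is None:
            # Binding and Wildcard always match, so later arms are unreachable.
            head = "{" if first else "else {"
            lines.append(f"    {head} {body} }}")
            break
        keyword = "if" if first else "else if"
        lines.append(f"    {keyword} ({cond}) {{ {body} }}")
        first = False
    lines.append("}")
    return "\n".join(lines)
```

For a subject `cmd` with a LitStr, a LitInt, and a Wildcard arm, the output is one block containing the temporary, an `if`, an `else if`, and a closing `else`.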
Adds `for i in start..end` (exclusive) and `for i in start..=end`
(inclusive) range loop syntax. Existing `for item in list` iteration
is preserved; the parser branches on DotDot/DotDotEq presence after
the start expression. Lexer adds DotDot and DotDotEq tokens with
longer-match-first priority. Codegen emits a C `for` loop with the
loop variable scoped to the statement; inclusive uses `<=`, exclusive `<`.
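A rough Python sketch of the two pieces involved: longer-match-first lexing of the range operators, and the shape of the emitted C loop. The helper names are hypothetical, and typing the loop variable as el_val_t is an assumption.

```python
def lex_dots(src, i):
    """Return (token_kind, next_index) for a range operator at src[i]."""
    # Try the three-character token first so "..=" is never split
    # into DotDot followed by "=".
    if src.startswith("..=", i):
        return ("DotDotEq", i + 3)
    if src.startswith("..", i):
        return ("DotDot", i + 2)
    return None


def emit_range_for(var, start_c, end_c, inclusive, body_c):
    """Emit a C for loop; the loop variable is scoped to the statement."""
    op = "<=" if inclusive else "<"
    # Note: this sketch re-evaluates end_c on every iteration.
    return (f"for (el_val_t {var} = {start_c}; {var} {op} {end_c}; {var}++) "
            f"{{ {body_c} }}")
```

Under these assumptions, `for i in 0..n` lowers to `for (el_val_t i = 0; i < n; i++) { ... }`, and the `..=` form differs only in the comparison operator.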
Lexer and parser already had Percent token and precedence on the
compiler/string-interp branch. This commit adds the missing is_int_expr
case for Percent so that modulo expressions over Int operands are
correctly typed as Int (enabling arithmetic dispatch rather than
falling through to string concat or untyped paths).
binop_to_c already mapped Percent -> % at HEAD; only is_int_expr
needed the Percent arm.
All string, I/O, math, classification, splitting, joining, counting, padding,
and URL encoding functions from el_runtime.c implemented in El using seed
primitives. No C required; compiles via the normal El pipeline.
- runtime/engram.el: thin El wrappers over all __engram_* and __generate
seed primitives (16 functions), matching the el_seed.c API exactly
- runtime/manifest.el: build manifest documenting module load order and
the cat+compile+cc command for runtime builds
- el-compiler/src/codegen.el: add 77 __-prefix seed primitive entries to
builtin_arity, covering str, fs, http, thread, exec, env, time, uuid,
math, state, html, json, and engram seeds
Migrates fs_read/write/exists/mkdir/write_bytes/list, exec/exec_bg/exec_command/exec_capture,
env/args/exit_program, state_set/get/del/keys, uuid_new/v4, and list helpers get/len from
el_runtime.c into El source as thin wrappers over seed primitives.
Introduces El's first-class threading primitives built on the seed layer's
__thread_create/__thread_join/mutex ops. parallel_map is the key deliverable:
spawns one thread per item, joins in order — replaces bash fan-out for room
dispatch and any other concurrent HTTP workload.
Thin El wrappers over seed JSON primitives (json_get, json_get_raw,
json_parse, json_stringify, json_set, json_array_len, json_array_get,
json_array_get_string) plus typed extractors (json_get_string/int/float/bool)
and pure-El builders (json_build_object, json_build_array,
json_escape_string) that require no seed call.
Thin El wrappers over seed primitives that form the public HTTP API for
El programs. Covers GET/POST/DELETE, header-map variants, binary streaming
to file, form-auth, v1/v2 server dispatch, and http_response envelope
construction. Documents two new seed primitives needed: __http_do_map and
__http_do_map_to_file (ElMap-accepting variants to avoid needing map
iteration in El).
Replace accumulate-by-concatenation loops with native_list_append + str_join.
Eliminates quadratic memory growth when processing large source files.
This is the v2 compiler state — what produced /tmp/elc-v2.
- .gitea/workflows/sdk-release.yaml: build elc from bootstrap, run tests,
publish latest release, dispatch el-sdk-updated to downstream repos
- install.sh: one-command El SDK install from Gitea release
elc-combined.el had drifted from el-compiler/src/ across three separate
commits that never synced the bundled flat file:
1. 13948f5 - fold fn main() body into C int main() + _argc/_argv rename
(codegen.el updated, elc-combined.el not updated)
2. 742bd0b - bare reassignment Assign AST node
(parser.el + codegen.el updated, elc-combined.el not updated)
3. ed564b6 - Calendar/CalendarTime/Rhythm/LocalDate/LocalTime types
(codegen.el updated, elc-combined.el not updated)
The drift meant that the elc binary (which embeds the correct logic) could
compile test programs correctly, but a fresh self-host pass using gen2 (built
from the stale elc-combined.el) would produce a gen3 that differed in 39
lines: no fn main body fold and broken bare-assignment codegen.
Fix: regenerate elc-combined.el as a flat concatenation of the current
lexer.el + parser.el + codegen.el + codegen-js.el + compiler.el source
files. Self-host fixed point verified: gen2 == gen3 byte-identical at
6450 lines.
Also rebuild dist/platform/elc and dist/platform/elc.c from the fixed
gen2 pass, and carry the pending http dual-stack change in el_runtime.c.
All tests pass: time (6/6), calendar (10/10), text (8/8), html_sanitizer (29/29).
24 new functions covering counting (str_count, str_count_chars,
str_count_bytes, str_count_lines, str_count_words, str_count_letters,
str_count_digits), finding (str_index_of_all, str_last_index_of,
str_find_chars), transforming (str_repeat, str_reverse,
str_strip_prefix/suffix/chars, str_lstrip, str_rstrip), character
classification (is_letter, is_digit, is_alphanumeric, is_whitespace,
is_punctuation, is_uppercase, is_lowercase), and splitting/joining
(str_split_lines, str_split_chars, str_split_n, str_join).
Phase 1 is byte-level + ASCII character classes. Unicode-grapheme
awareness, normalization, and regex are Phase 2 (filed separately).
Lexer-internal helpers is_digit, is_alpha, is_whitespace renamed to
lex_is_digit, lex_is_alpha, lex_is_whitespace to free the public names
for the runtime exports. The El compiler's lexer.el and the bundled
elc-combined.el both updated.
Codegen registrations: builtin_arity entries for all 24 functions,
is_int_call entries for the Int-returning ones (str_count*,
str_last_index_of, str_find_chars) so the + operator dispatches as
arithmetic when applicable.
Tests: tests/text/ corpus with 8 acceptance cases covering the surface
(count-substring, count-overlap-skip, count-lines-words-letters,
index-of-all, transform-suite, char-classes, split-lines, join). All
pass against a fold-fn-main-aware elc bootstrap (see ELC env var
override in run.sh).
Self-host fixed point: elc-combined.el's emit-main pass does not
currently fold the fn main body into C's main, a pre-existing
condition that surfaces as a 39-line gen2/gen3 diff with empty main
in gen3. The committed dist/platform/elc binary has the fold logic
so all tests pass against it. Filing the elc-combined fold-fn-main
fix separately. This commit does not introduce new self-host drift.
Phase 1.5 of time-system. Calendar is pluggable: EarthCalendar
(IANA zones, DST, Gregorian) is the default; MarsCalendar,
CycleCalendar(period), NoCycleCalendar handle non-Earth cases.
Rhythm abstracts recurrence from clock units - rhythm_cycle_phase(0.5)
means "midpoint of cycle" whether the cycle is 24 hours on Earth or
30 hours on a station or 300 years on a long-cycle world.
Phase 1 (Instant + Duration) unchanged. EarthCalendar(zone_local())
is the user-facing default, so users who do not need non-Earth
calendars never see the abstraction.
Self-host fixed point holds at 6339 lines.
Snapshot tagged at dist/platform/elc.20260502-1321-self-host.
Phase 2 (scheduling primitives every/after/at) lands next, now with
Calendar-aware grounding instead of Earth-time hardcoded.
Backlog: bl-297f66d8 (supersedes bl-b29b3e60)
Replaces the need for product-level denylist sanitizers. Small
state-machine parser; tag-and-attribute allowlist passed as JSON;
URL scheme validation on href/src attrs (http, https, mailto,
fragment, relative); whole-subtree drop for script/style/iframe/
object/embed/form (plus rarer media containers). No comment-
wrapping (was fragile to comment-injection bypass via a literal
--> inside an attacker-supplied attribute value).
Also picks up the codegen and parser changes for first-class
Instant/Duration types (postfix-literal time values, typed binop
dispatch) that were sitting in tree alongside this work.
Test corpus at tests/html_sanitizer/ covers the live attacker
probes (script, iframe, form, javascript:, about:, data:, img
onerror, onclick) plus structural attacks (comment-injection
bypass, tab-in-scheme bypass, encoded payloads, malformed input,
empty input, plain text). 29 cases, all green.
Self-host fixed point holds at 5720 lines via the canonical
el-compiler/src/compiler.el entry. Snapshot tagged at
dist/platform/elc.20260502-1249-self-host.
Backlog: bl-dc55ae07
Previous commit 6d89728 had a misleading message - the rename
itself never landed (Edit-without-Read failure cascaded silently
in the parent shell). 6d89728 incidentally captured 810 lines of
in-flight work from concurrent runtime agents and shipped it under
the wrong message; the in-flight agents will land their final
verified state on top.
This commit is just the actual rename: str_format(template, data)
becomes str_format(fmt, data), which resolves the C++ keyword conflict.
template is reserved in C++ but not in C, so the old parameter name
blocked this header from ever being included from C++ code. fmt matches
the printf-family convention.
The deeper question of whether string-template substitution is the
right abstraction for our substrate is filed separately as backlog.
1. Parser+codegen: bare reassignment `x = expr` inside an if-body
was compiling to three orphan expressions with no store. Now
emits a real assignment.
2. Runtime json_get: dot-path segments that are all digits now
correctly traverse array indices. `json_get(s, "0.field")` works.
3. Runtime HTTP writer: response bodies starting with
`{"__status__":<int>,...}` now set the HTTP status header to
that value and strip the marker from the served body. Existing
404/401/503 paths in product code now produce real status codes
instead of HTTP 200 with the status hidden in the body.
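Item 2 can be illustrated with a short Python model of the dot-path walk. It is a sketch, not the C runtime's code: returning "" for a missing path and serialising non-string results back to JSON are assumptions.

```python
import json


def json_get(doc, path):
    """Follow a dot-separated path through parsed JSON."""
    node = json.loads(doc)
    for seg in path.split("."):
        if isinstance(node, list) and seg.isascii() and seg.isdigit():
            # An all-digit segment indexes into an array.
            idx = int(seg)
            if idx >= len(node):
                return ""
            node = node[idx]
        elif isinstance(node, dict) and seg in node:
            node = node[seg]
        else:
            return ""                 # assumption: missing path yields ""
    return node if isinstance(node, str) else json.dumps(node)
```

Because the digit check only applies when the current node is a list, an object with a key such as "0" is still reachable by name.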
Self-host fixed point holds: gen2 == gen3 byte-identical.
Snapshot tagged at dist/platform/elc.20260502-1231-self-host.
Backlog: bl-c121edda
scan_string() is the right gate for this: every El source that embeds JS
or CSS does so as a quoted string literal, and the lexer is the single
chokepoint every backend reads. Strip there and the // line comments
and /* */ block comments never reach the parser, codegen, or the served
HTML.
looks_like_code is intentionally narrow:
- contains "<script" or "<style" (the embedded-asset case), or
- contains "function" AND ";" (a JS body without an opening tag)
Plain prose with stray // sequences passes through verbatim.
strip_code_comments tracks JS string state (single, double, backtick)
and never strips inside one. Backslash escapes inside JS strings consume
the next char verbatim. URL guard: when the char before / is ':', emit
the / literally and advance one — preserves https:// inside string
literals. Block-comment scan walks until the matching '*/' pair.
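The rules above can be sketched in Python as follows. This is an illustration of the described behaviour, not the lexer's implementation; keeping the newline that ends a // comment and dropping an unterminated block comment to end of input are assumptions.

```python
def looks_like_code(s):
    """Narrow gate: embedded assets, or a JS body without an opening tag."""
    return "<script" in s or "<style" in s or ("function" in s and ";" in s)


def strip_code_comments(s):
    """Remove // and /* */ comments outside JS string literals."""
    out = []
    quote = None                      # active JS quote char, if any
    i, n = 0, len(s)
    while i < n:
        ch = s[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(s[i + 1])  # escape consumes the next char verbatim
                i += 2
                continue
            if ch == quote:
                quote = None
            i += 1
        elif ch in "'\"`":
            quote = ch
            out.append(ch)
            i += 1
        elif ch == "/" and i > 0 and s[i - 1] == ":":
            out.append(ch)            # URL guard: keep the / after a colon
            i += 1
        elif s.startswith("//", i):
            while i < n and s[i] != "\n":
                i += 1                # drop the comment, keep the newline
        elif s.startswith("/*", i):
            end = s.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

A caller would apply strip_code_comments only when looks_like_code returns true, so plain prose containing // is never touched.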
elc-cli.el is now a one-line `import "el-compiler/src/compiler.el"`
shim. Top-level `let _argv = args()` was clashing with C int main()'s
`char** _argv` parameter once compiler.el's fn main() body got folded
into C main. compiler.el owns the CLI entry point now.
Self-host fixed point reached: gen2 == gen3 byte-identical.
Tagged dist/platform/elc.20260502-1104-self-host alongside dist/platform/elc.
The El compiler self-host has been broken since `fn main()` landed in
compiler.el. Both bootstrap.py and codegen.el skipped emitting an
`el_val_t main()` (correct - it would collide with C's int main),
but neither folded the body anywhere. The C int main() got just
runtime init + return, so any El program that put its work inside
`fn main()` produced a binary that did nothing.
Fix in two places (bootstrap.py and codegen.el, kept symmetric):
1. Capture the body of `fn main()` during the FnDef pass.
2. Emit `int main(int _argc, char** _argv)` so El programs can
declare their own local `argv` / `argc` (compiler.el itself
does this) without colliding.
3. After top-level statements, fold the captured fn main body
into C main alongside them, then return 0.
Self-host fixed point reached: gen 2 and gen 3 of compiler.el's
output are byte-identical (md5 5b4eca2a...). The new elc compiles
products/web/src/main.el natively now - 24 imports resolved, 1,173
lines of C, every imported function (page_open, nav, pricing,
checkout_page, account_page, founding_badge…) emits its forward
decl + body without a concat preprocessor in sight.
Backup of the prior self-hosted binary is at
dist/platform/elc.preselfhost in case we need to fall back.
Added a typed scan function: walks the live nodes once, skips
transparent layers, keeps only entries whose node_type matches the
filter, sorts the survivors by salience, paginates. Header forward
decl in el_runtime.h so callers can find it.
Empty / NULL filter falls through to engram_scan_nodes_json so the
existing GET /api/nodes contract is preserved exactly.
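In outline, the scan amounts to filter, sort, slice. The Python sketch below is a guess at the shape only: the dict fields (node_type, salience, transparent), descending salience order, and the limit/offset parameters are hypothetical, and the untyped scan is passed in as a callable rather than modelled.

```python
def scan_nodes_by_type(nodes, node_type, limit, offset, untyped_scan):
    """Single pass over live nodes: filter by type, sort, paginate."""
    if not node_type:
        # Empty or missing filter: defer to the existing untyped scan so
        # its contract is left untouched.
        return untyped_scan(nodes, limit, offset)
    kept = [
        n for n in nodes
        if not n.get("transparent") and n["node_type"] == node_type
    ]
    # Assumption: highest salience first.
    kept.sort(key=lambda n: n["salience"], reverse=True)
    return kept[offset:offset + limit]
```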
This is what every list-X tool in the MCP wrapper has been wanting:
listProcesses returning only Process nodes rather than every node, without
the wrapper having to fetch + filter client-side.
Per RFC 9110 §9.3.2, HEAD must mirror GET headers + Content-Length
without sending a body. Existing http_worker / http_worker_v2 dropped
HEAD straight to the El handler, which had no idea what to do and
returned the catch-all 404 envelope. Link checkers and SEO bots saw
the 404 and reported the site as broken.
Fix layer is in the runtime, not the El handler:
* http_worker / http_worker_v2 detect HEAD before calling the
handler, dispatch as method="GET" so handler logic is unchanged,
record head_only in a thread-local, then call http_send_response.
* http_send_response reads the thread-local and skips the
final http_send_all of the body. Status line + headers +
Content-Length still go out in full.
Verified locally on engram /health: HEAD returns
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 48
Connection: close
(no body — curl reports size_download=0)
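The two-step fix can be modelled in Python. This is a sketch of the control flow only, not the C runtime: the function names, the fixed 200 status, and the header set are assumptions chosen to mirror the output above.

```python
import threading

_request_state = threading.local()    # per-thread head_only flag


def send_response(status_line, headers, body):
    """Build the bytes written to the socket for one response."""
    payload = body.encode("utf-8")
    lines = [status_line]
    lines += [f"{name}: {value}" for name, value in headers]
    # Content-Length reflects the GET body even when the body is skipped.
    lines += [f"Content-Length: {len(payload)}", "Connection: close", "", ""]
    head = "\r\n".join(lines).encode("ascii")
    if getattr(_request_state, "head_only", False):
        return head                   # HEAD: status line + headers only
    return head + payload


def handle(method, path, handler):
    """Worker-side dispatch: HEAD reaches the handler as GET."""
    _request_state.head_only = (method == "HEAD")
    effective = "GET" if method == "HEAD" else method
    body = handler(effective, path)
    return send_response(
        "HTTP/1.1 200 OK",
        [("Content-Type", "application/json; charset=utf-8")],
        body,
    )
```

Keeping the flag in thread-local state means the handler signature does not change, which is why the El handler needs no edits.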
compiler.el: rename `target` → `tgt` in main(); the lexer reserves
`target` as a keyword, and the let-binding position requires Ident.
The naming convention was already followed elsewhere in the file
(compile_dispatch's parameter is tgt for exactly this reason); main
was an outlier that the existing Rust-genesis-built elc happened to
parse but bootstrap.py refused, blocking self-host.
Both bootstrap.py and compiler.el now inline every imported .el file
into a single source string before lex/parse, depth-first with set
deduplication keyed on absolute path. Two forms supported:
import "path/to/file.el" (quoted relative path)
from <module> import { ... } (bare module → <module>.el)
Strict regex matching prevents false positives like CSS keyframes
("from { opacity: 0 }") embedded in El string literals - the prior
naive str.startswith pulled '{' out as a module name and tried to
load src/{.el.
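A minimal Python sketch of the inlining pass, under stated assumptions: the two regexes are illustrative rather than the ones in bootstrap.py, import paths are resolved relative to the importing file, and the file reader is passed in as a callable so the sketch can run against an in-memory tree.

```python
import os
import re

# Whole-line matches only; both patterns are hypothetical.
IMPORT_RE = re.compile(r'^\s*import\s+"([^"]+)"\s*$')
FROM_RE = re.compile(
    r'^\s*from\s+([A-Za-z_][A-Za-z0-9_]*)\s+import\s*\{[^}]*\}\s*$'
)


def inline_imports(path, read, seen=None):
    """Depth-first inline of imports, deduplicated on absolute path."""
    seen = set() if seen is None else seen
    path = os.path.abspath(path)
    if path in seen:
        return ""                     # already inlined once
    seen.add(path)
    base = os.path.dirname(path)
    out = []
    for line in read(path).splitlines():
        quoted = IMPORT_RE.match(line)
        bare = FROM_RE.match(line)
        if quoted:
            target = os.path.join(base, quoted.group(1))
            out.append(inline_imports(target, read, seen))
        elif bare:
            target = os.path.join(base, bare.group(1) + ".el")
            out.append(inline_imports(target, read, seen))
        else:
            out.append(line)          # e.g. 'from { opacity: 0 }' stays put
    return "\n".join(out)
```

Requiring an identifier followed by `import` is what keeps a CSS keyframes line such as `from { opacity: 0 }` from being treated as a module reference.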
This kills the bash concat preprocessor that web/build-local.sh
needed. A web full build is now just:
python3 bootstrap.py src/main.el > dist/main.c
cc -O2 ... -o dist/neuron-web dist/main.c dist/web_stubs.c \
foundation/el/el-compiler/runtime/el_runtime.c \
-lcurl -lpthread -lssl -lcrypto
Verified end-to-end: bootstrap.py produces 1,151 lines of C from
src/main.el's 24 imports, cc links a 667 KB binary.