el/lang/spec/codegen-js.md
2026-05-05 01:38:51 -05:00


El JavaScript Backend (codegen-js)

Status: Phase 5 complete, with ~90% language coverage. Full browser JavaScript can be expressed structurally in El without any native_js escape hatches. Additions since Phase 4: anonymous function literals (lambda syntax), the try/catch statement, extern fn declarations, direct JS method call syntax on Any-typed values, Promise helpers, Object/Array utilities, and URL import declarations. Proof: examples/browser-auth.el is a complete Supabase auth flow with zero native_js or native_js_call calls.

Authoritative files

| File | Role |
| --- | --- |
| el-compiler/src/codegen-js.el | El → JS code generator (mirrors codegen.el) |
| el-compiler/runtime/el_runtime.js | Browser/Node runtime that compiled programs link against |
| el-compiler/src/compiler.el | Adds compile_js() and --target=js CLI dispatch |
| spec/codegen-js.md | This document |

1. Why a JS backend exists

El compiles to C today. C is the right substrate for the agent runtime, the DHARMA daemon, and Engram. But three first-class consumers of El need to run in a browser, where C is not an option:

  1. el-ui/runtime/ — the activation-based frontend framework written in JS. The long-term plan is to author components and the runtime itself in El and compile them down to JS.
  2. cgi-studio — the web app for cultivating CGIs. Today it is hand-written JS. Once the JS backend is mature, the studio's UI logic can be authored in El and share types/identifier names with the CGI it cultivates.
  3. Marketplace plugin UIs — third parties writing browser-side El that runs untrusted in a sandbox. They need a JS target.

A secondary motivation: El-on-Node. CLI tooling, build scripts, and tests benefit from a tight el → js → node cycle without a cc step.


2. Type representation strategy

The C backend pretends every value is int64_t. That is a deliberate runtime trick to avoid dynamic dispatch in generated C. JavaScript already has tagged dynamic values, so the JS backend is simpler: every El value is a native JS value, and the tag of el_val_t collapses into the JS type system.

| El type | C representation | JS representation |
| --- | --- | --- |
| Int | int64_t (direct) | number (with Number.isSafeInteger caveat; see §6) |
| Float | int64_t bit-cast of double via el_from_float | number (no bit-cast; JS number IS a double) |
| Bool | int64_t, 0 = false, nonzero = true | boolean |
| String | (int64_t)(uintptr_t)cstring | string |
| Void | C void | undefined |
| [T] (List) | el_val_t pointer to refcounted struct | Array<any> |
| Map<K,V> | el_val_t pointer to refcounted struct | plain object {[key]: any} |
| EL_NULL (0) | (el_val_t)0 | null |
| Any | el_val_t | any (no compile-time check) |

Key consequences:

  • + on two strings is JS + (string concat) — no el_str_concat() runtime call needed for the common case. The runtime DOES export el_str_concat for the cases where codegen does not know the types.
  • == on strings is === — not str_eq(). Same disambiguation logic as the C backend (look at left/right kind, fall back to str_eq for identifiers without int annotation).
  • Map access m["foo"] compiles to JS m["foo"] (no el_get_field). For Field access (m.foo) we emit m["foo"] so it works on plain objects regardless of prototype shape.
  • List access arr[i] is JS arr[i]. No bounds checking, as in the C backend, though the failure mode differs: C segfaults on a bad index, while JS silently yields undefined. Codegen could later route indexing through el_list_get for safe access.
  • EL_NULL becomes JS null, not undefined. The runtime checks for === null consistently. This avoids the JS undefined/null fork and matches El's single null value.
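The single-null rule interacts with unchecked indexing: an out-of-range arr[i] produces undefined, which a === null check does not catch. A minimal sketch of a bounds-checked accessor follows. The name safe_list_get is hypothetical and chosen for illustration; it is not claimed to match the runtime's el_list_get.

```javascript
// Hypothetical helper (not the runtime's el_list_get): bounds-checked list
// access that normalizes every miss to null, so `=== null` checks keep working.
function safe_list_get(list, index) {
  if (!Array.isArray(list)) return null;
  if (!Number.isInteger(index) || index < 0 || index >= list.length) return null;
  const value = list[index];
  return value === undefined ? null : value;
}

const xs = [10, 20, 30];
console.log(xs[5] === null);                // false: raw indexing gives undefined
console.log(safe_list_get(xs, 5) === null); // true
console.log(safe_list_get(xs, 1));          // 20
```

Whether the extra checks are worth their cost on hot paths is a design question this sketch does not settle.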

3. Builtin runtime layer (el_runtime.js)

Same function names as el_runtime.c wherever possible, so codegen-js can emit the same call sites. The runtime is a single ES module that exposes every builtin as a named export AND attaches them to a globalThis.__el namespace (so generated code can do either import * as el from './el_runtime.js' or assume globals).

The output generated by codegen-js uses the global-namespace style: every emitted file starts with import './el_runtime.js' (which installs the globals as a side effect), so call sites stay flat: println(x), not el.println(x). This matches the C backend's flat call surface and keeps the generated code grep-compatible across targets.
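One way a runtime could expose the same functions both under globalThis.__el and as flat globals is sketched below. The install_builtins helper and the two stand-in builtin bodies are assumptions made for illustration; the real el_runtime.js may be organized differently.

```javascript
// Sketch only: two stand-in builtins; the real runtime exports many more.
const builtins = {
  str_len(s) { return s.length; },
  int_to_str(n) { return String(n); },
};

// Hypothetical installer: exposes builtins under globalThis.__el and as
// flat globals, so generated call sites can stay unqualified.
function install_builtins(table) {
  globalThis.__el = Object.assign(globalThis.__el ?? {}, table);
  for (const [name, fn] of Object.entries(table)) {
    globalThis[name] = fn;
  }
}

install_builtins(builtins);

console.log(str_len("hello"));                        // 5
console.log(__el.int_to_str(42) === int_to_str(42));  // true
```

Flat globals keep emitted code short, at the cost of possible name collisions with other scripts on the page.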

Implemented (~90 builtins)

| Category | Functions |
| --- | --- |
| I/O | println, print |
| String | el_str_concat, str_concat, str_eq, str_starts_with, str_ends_with, str_len, int_to_str, str_to_int, str_slice, str_contains, str_replace, str_to_upper, str_to_lower, str_trim, str_index_of, str_split, str_char_at, str_char_code, str_lower, str_upper, str_pad_left, str_pad_right |
| Math | el_abs, el_max, el_min, math_sqrt, math_log, math_ln, math_sin, math_cos, math_pi |
| Float | float_to_str, int_to_float, float_to_int, format_float, decimal_round, str_to_float |
| List | el_list_new, el_list_len, el_list_get, el_list_append, el_list_empty, el_list_clone, list_push, list_push_front, list_join, list_range |
| Map | el_map_new, el_get_field, el_map_get, el_map_set |
| HTTP | http_get, http_post, http_post_json, http_get_with_headers, http_post_with_headers (via fetch(), return Promise<string>) |
| FS | fs_read, fs_write, fs_list (Node-only) |
| JSON | json_parse, json_stringify, json_get, json_get_string, json_get_int, json_get_float, json_get_bool, json_get_raw, json_set, json_array_len |
| Time | time_now, time_now_utc, sleep_secs (Node), sleep_ms |
| Bool | bool_to_str |
| Process | exit_program (Node process.exit) |
| Refcount | el_retain, el_release (no-ops) |
| Method shortforms | append, len, get, map_get, map_set |
| Native VM aliases | native_list_get, native_list_len, native_list_append, native_list_empty, native_list_clone, native_string_chars, native_int_to_str |
| args / env / state_* | Process args, environment, in-memory state |
| UUID | uuid_v4, uuid_new |
| DOM bridge | dom_get_element, dom_get_value, dom_set_value, dom_get_text, dom_set_text, dom_set_prop, dom_get_prop, dom_set_style, dom_add_class, dom_remove_class, dom_show, dom_hide, dom_listen, dom_query, dom_query_all, dom_create, dom_append, dom_remove, dom_is_null (browser-only) |
| DOM extended | dom_set_attr, dom_get_attr, dom_remove_attr, dom_set_html, dom_get_html, dom_get_parent, dom_contains_class, dom_get_checked, dom_set_checked (browser-only) |
| Timers | set_timeout(ms, cb), set_interval(ms, cb) -> Int, clear_interval(handle) |
| Local storage | local_storage_get, local_storage_set, local_storage_remove (browser-only) |
| Window | window_location, window_redirect, window_on_load, window_set, window_get |
| Debug | console_log |
| Promise helpers (Phase 5) | promise_then(p, cb), promise_catch(p, cb), promise_resolve(val), promise_reject(msg) |
| Object / Array (Phase 5) | object_assign(t, s), object_keys(obj), object_values(obj), json_deep_clone(obj), array_from(iterable), type_of(val), instanceof_check(val, name) |
| native_js escape hatch | native_js(code): eval; native_js_call(obj, method, args): method call. Use only when no structural alternative exists |
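To give a feel for how thin the Phase 5 helpers can be, here is a sketch of a few of them as one-line wrappers over standard JS. The semantics are inferred from the names and signatures above and are assumptions; in particular, resolving the constructor for instanceof_check through globalThis[name] is a guess at one reasonable behavior.

```javascript
// Assumed semantics, inferred from the names; the actual runtime may differ.
const promise_resolve = (val) => Promise.resolve(val);
const promise_then = (p, cb) => p.then(cb);
const promise_catch = (p, cb) => p.catch(cb);

const object_keys = (obj) => Object.keys(obj);
const type_of = (val) => typeof val;

// Looks the constructor up by name on globalThis; unknown names yield false.
const instanceof_check = (val, name) => {
  const ctor = globalThis[name];
  return typeof ctor === "function" && val instanceof ctor;
};

// The callback runs asynchronously, after the current script finishes.
promise_catch(
  promise_then(promise_resolve(41), (n) => console.log(n + 1)),
  (err) => console.log("failed:", err)
);
console.log(type_of("el"));                         // "string"
console.log(instanceof_check(new Date(), "Date"));  // true
```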

Stubbed (throw at runtime)

Every function in this list compiles successfully but throws Error("not supported in JS target — needs server-side delegation: <name>") when called. This is a runtime error, not a compile error, so it doesn't block compilation of code that has dead-code paths through these functions.

  • All dharma_* (membership in DHARMA network requires the daemon)
  • All engram_* (needs the embedded SQLite + activation engine — could be reimplemented in JS later)
  • All llm_* (CORS + API key handling — must go through a server-side proxy)
  • http_serve (browsers don't host servers; Node could, but that's a separate runtime mode)
  • el_cgi_init (CGI identity is a server-side concept)
  • Crypto: sha256_*, hmac_sha256_*, base64* (deferred — can use crypto.subtle later)
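The throw-at-call-time behavior can be pictured with a small stub factory. This is a sketch under assumptions: make_stub and engram_query are illustrative names, and the error text is shortened rather than copied from the runtime.

```javascript
// Hypothetical stub factory. The message text is shortened for this sketch.
function make_stub(name) {
  return function stub() {
    throw new Error("not supported in JS target: " + name);
  };
}

const engram_query = make_stub("engram_query"); // illustrative builtin name

function lookup(use_engram) {
  // The stub only throws if this branch actually executes.
  return use_engram ? engram_query("x") : "local";
}

console.log(lookup(false)); // "local"
try {
  lookup(true);
} catch (err) {
  console.log(err.message); // "not supported in JS target: engram_query"
}
```

Because the failure is deferred to the call, code with dead paths through these builtins still loads and runs.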

Browser-side specific behavior

When running in a browser:

  • println / print map to console.log (no stdout in browsers)
  • http_get / http_post use fetch() (CORS applies)
  • fs_* throws (browsers have no fs)
  • args() returns []
  • env(k) throws (or could read from a global config object — TBD)

When running in Node:

  • println / print map to console.log and process.stdout.write
  • fs_* use node:fs/promises (sync versions for the simple cases)
  • args() returns process.argv.slice(2)
  • env(k) returns process.env[k] ?? null

The runtime auto-detects via typeof window === 'undefined'.
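A sketch of how that check could drive args() and env() is shown below. The function bodies are assumptions that follow the behavior lists above, not the runtime's code. Note that workers and some other hosts also lack window, so a production check may need more than this one test.

```javascript
// Sketch of environment detection; the helper bodies are assumptions.
const IS_NODE = typeof window === "undefined";

function args() {
  return IS_NODE ? process.argv.slice(2) : [];
}

function env(key) {
  if (!IS_NODE) throw new Error("env is not available in the browser");
  return process.env[key] ?? null;
}

console.log(IS_NODE ? "node" : "browser");
console.log(Array.isArray(args())); // true in both environments
if (IS_NODE) {
  console.log(env("EL_SURELY_UNSET_VARIABLE")); // null when the variable is unset
}
```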


4. Tradeoffs vs the C backend

| Concern | C backend | JS backend |
| --- | --- | --- |
| Static types | El's Int becomes int64_t, real arithmetic | El's Int becomes number; loses precision past 2^53 |
| Linking model | Static link against el_runtime.c + libcurl + libpthread | ES module import of el_runtime.js |
| Dynamic dispatch | dlsym for http_set_handler / llm_register_tool (requires -rdynamic) | JS function value lookup via globalThis[name]; no compiler flag |
| Tool registry | dlsym walks symbol table; tool fns must be top-level C symbols | Tool fns live as exports of the generated module; trivially callable |
| Memory model | Refcounted lists/maps with el_retain/el_release to avoid leaks | JS GC handles all of it; el_retain/el_release are no-ops |
| + overload | Has to dispatch in codegen between el_str_concat and integer + because at C level both are int64_t | JS + is already overloaded: "a" + "b" → "ab", 1 + 2 → 3. Codegen still preserves the existing dispatch for safety, but the runtime fallback is correct |
| Concurrency | pthread-backed http_serve | Single-threaded event loop; http_serve not supported in this target |
| HTTP client | libcurl, blocking, returns body string | fetch() is async; see §5 |
| CGI identity | el_cgi_init runs at start of main() | Not supported; UI code is not a CGI principal |
| DHARMA / LLM | Native, blocking, libcurl-backed | Not supported; all such calls throw and the program is expected to delegate to a server-side El daemon via plain HTTP |
| Compile speed | El → C → cc → binary (cc is the slow step) | El → JS → done. Faster iteration |
| Output size | Static binary ~2MB | Source .js + ~10kb runtime |
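The dynamic-dispatch row is easy to illustrate: where C needs dlsym and -rdynamic, JS only needs a property lookup. The sketch below assumes the tool function has been made reachable on globalThis; tool_echo and call_by_name are hypothetical names.

```javascript
// Hypothetical tool function, as a generated module might define it.
function tool_echo(input) {
  return "echo: " + input;
}
globalThis.tool_echo = tool_echo; // make it reachable by name

// Sketch of name-based dispatch: resolve a function value from its name.
function call_by_name(name, ...params) {
  const fn = globalThis[name];
  if (typeof fn !== "function") {
    throw new Error("no such function: " + name);
  }
  return fn(...params);
}

console.log(call_by_name("tool_echo", "hi")); // "echo: hi"
```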

5. The async problem

fetch() is async. The C backend's http_get(url) is synchronous and returns the body string directly. El source was written assuming sync. Three options:

  1. Pretend it's sync from El's POV; use synchronous XHR (browser) or child_process.execSync('curl ...') (Node). Bad: synchronous XHR is deprecated and blocks the main thread while the request is in flight; execSync is a hack.
  2. Make every http_* builtin in the JS runtime return a Promise, and rewrite codegen-js to insert await everywhere. This requires turning every El function that transitively calls a network builtin into an async fn in JS. Doable, but invasive.
  3. Explicit @async decorator on El functions; codegen-js emits async function + await for known-async call sites. This is the approach implemented.

Decision: option 3, with an explicit opt-in decorator. http_get, http_post, http_post_json, http_get_with_headers, and http_post_with_headers in el_runtime.js return Promise<string>. codegen-js.el now emits await before calls to these builtins and before calls to any El function decorated @async.

How to use async in El (JS target)

Mark a function with @async to declare it as async. Any call to that function from another El function will automatically get await in the generated JS. The caller must therefore also be @async, since JS only allows await inside an async function; functions that call only non-async code need no decorator.

@async
fn fetch_user(id: String) -> String {
    http_get("https://api.example.com/users/" + id)
}

@async
fn main() -> Void {
    let body = fetch_user("42")
    println(body)
}

Compiles to:

async function fetch_user(id) {
    return await http_get("https://api.example.com/users/" + id);
}

async function main() {
    let body = await fetch_user("42");
    println(body);
}

main();

Limitations:

  • @async is a JS-target-only convention. The C backend ignores the decorator (it calls the synchronous libcurl-backed version).
  • Implicit taint propagation (auto-marking all transitive callers) is not implemented. The programmer must explicitly add @async to every function in the call chain that reaches an async builtin.
  • Forward-reference calls to @async functions are handled correctly: codegen-js does a pre-registration pass over all FnDefs before emitting any code.
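The reason every function in the chain needs the decorator can be seen in plain JS. The following is a hand-written illustration, not compiler output: http_get here is a stand-in that resolves immediately without touching the network, and fetch_user_plain shows what a caller gets when nothing awaits the result.

```javascript
// Stand-in for the runtime's Promise-returning http_get; no network is used.
function http_get(url) {
  return Promise.resolve("body of " + url);
}

// A function in the chain that is not async and does not await.
function fetch_user_plain(id) {
  return http_get("https://api.example.com/users/" + id);
}

// The async shape: the awaited value is the body string.
async function fetch_user(id) {
  return await http_get("https://api.example.com/users/" + id);
}

const leaked = fetch_user_plain("42");
console.log(leaked instanceof Promise); // true: a Promise, not a String

fetch_user("42").then((body) => console.log(typeof body)); // "string"
```

A String-typed El value that is really a Promise would concatenate as "[object Promise]", which is why the missing decorator is worth catching early.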

For programs that do not touch HTTP, no @async annotation is needed and the generated code is identical to before.


6. Number precision

JS number is IEEE 754 double — only 53 bits of integer precision. El Int is int64_t and the runtime sometimes uses the full 64 bits (e.g. time_now_utc returns nanoseconds-since-epoch, which exceeds 2^53 in practice).

Decision for this scaffold: accept the precision loss. Document it. UI code does not use 64-bit timestamps. If/when a use case demands it, time_now_utc can return a BigInt and we can introduce a BigInt sub-mode. That's a follow-up.
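A short worked example of the loss, using only standard JS. The nanosecond value is an arbitrary illustrative timestamp, not something produced by time_now_utc.

```javascript
// 2^53 is the first integer whose successor cannot be represented as a double.
const limit = 2 ** 53;
console.log(Number.isSafeInteger(limit - 1)); // true
console.log(limit + 1 === limit);             // true: the +1 is lost

// An illustrative nanoseconds-since-epoch value (chosen for the example).
const ns = 1746427131000000001n;
const as_number = Number(ns);
console.log(BigInt(as_number) === ns);        // false: low bits were rounded away
console.log(Number.isSafeInteger(as_number)); // false
```

At this magnitude adjacent doubles are 256 apart, so a number-typed nanosecond timestamp is only accurate to a few hundred nanoseconds.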


7. Language features — JS target coverage

Fully supported

| Feature | Notes |
| --- | --- |
| cgi {} block | Compiled to a no-op + comment (UI code is not a CGI) |
| service {} block | Compiled to a no-op + comment |
| match expressions | LitInt/LitStr/LitBool/Wildcard/Binding/Variant via IIFE if/else chain |
| type (struct) defs | Skipped; structs are plain JS objects. t["field"] works |
| enum defs | Skipped; enum values are strings or ints |
| ? postfix (nil-prop) | obj?.field emits (obj)?.["field"] ?? null via JS optional chaining |
| extern fn | Emits a comment; calls resolve to JS environment globals |
| Anonymous function literals | fn(p: T) -> R { body } emits a hoisted function __lambda_N(p) |
| try/catch | Emits try { ... } catch (name) { ... } directly |
| URL imports | import "https://..." emits ES module import (or comment in bundle mode) |
| Method call on Any | obj.method(args) emits obj.method(args) for non-El-shortform methods |
| Field access on Any | obj.field emits obj["field"] (bracket notation, works on prototype chains) |
| @async decorator | async function + await at call sites for async builtins and @async fns |
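Two of the emit shapes in this table are sketched below: a match lowered to an IIFE with an if/else chain, and nil-propagating field access. The match code is a hand-written approximation for a string match with two literal arms and a wildcard; the temporary name __m and the exact layout are assumptions, not actual compiler output.

```javascript
// Approximate shape of a match lowered to an IIFE if/else chain.
function classify(status) {
  return (() => {
    const __m = status; // scrutinee evaluated once
    if (__m === "ok") return 1;
    else if (__m === "err") return 2;
    else return 0; // wildcard arm
  })();
}

// Nil-propagating field access: a missing receiver and a missing field
// both collapse to null.
function get_name(obj) {
  return (obj)?.["name"] ?? null;
}

console.log(classify("ok"), classify("other"));                      // 1 0
console.log(get_name(null), get_name({}), get_name({ name: "el" })); // null null el
```

The IIFE lets a match be used in expression position even though JS if/else is a statement.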

Not supported (stub throws or no-op)

| Feature | Status | Notes |
| --- | --- | --- |
| All dharma_* | Stub throws | Requires server-side daemon |
| All engram_* | Stub throws | Could be ported to IndexedDB later |
| All llm_* | Stub throws | Route through server |
| http_serve | Stub throws | Browsers cannot host servers |
| el_cgi_init | No-op | CGI identity is server-side |
| Capability enforcement | Not enforced | Runtime stubs throw; compile-time check is a follow-up |
| VBD role check | Not enforced | Same |
| Float bit-cast | Not needed | JS number is already a double |
| Crypto primitives | Stub throws | Add via crypto.subtle later |
| state_* | In-memory only | Resets on page reload |
| args() | Node-only | Browser returns [] |
| fs_* | Node-only | Browser throws |

7a. Phase 5 constructs — design and emit shapes

extern fn

Declares a function that exists in the JS environment. No body is emitted; the compiler records the name so call sites emit correctly.

extern fn supabase_create_client(url: String, key: String) -> Any

Emits: a comment // extern fn supabase_create_client -- provided by the JS environment.
Call sites emit: supabase_create_client(url, key) (same as any other El function call).

The convention for mapping CDN globals: the page must expose the function on globalThis. For Supabase, the CDN bundle exposes supabase.createClient; a thin adapter assigns globalThis.supabase_create_client = supabase.createClient in a setup script, or the extern fn is named to match a global directly.
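A sketch of such a setup script follows. The supabase object here is a stand-in defined only so the example runs on its own; on a real page it would come from the CDN bundle. Wrapping the call in an arrow function, rather than assigning the method directly, is a precaution in case the library method depends on its this binding.

```javascript
// Stand-in for the global that the CDN bundle would provide.
// On a real page this object comes from the CDN script, not from this file.
globalThis.supabase = {
  createClient(url, key) {
    return { url, key, kind: "fake-client" };
  },
};

// Thin adapter: expose the El-facing name on globalThis.
// The arrow wrapper keeps `this` bound to the supabase object.
globalThis.supabase_create_client = (url, key) =>
  globalThis.supabase.createClient(url, key);

const client = supabase_create_client("https://example.invalid", "anon-key");
console.log(client.kind); // "fake-client"
```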

Anonymous function literals

fn(params) -> RetType { body } is valid in expression position. Emitted as a hoisted function declaration with a generated name.

dom_listen(btn, "click", fn(event: Any) -> Void {
    handle_click(event)
})

Emits:

function __lambda_1(event) {
  handle_click(event);
}
dom_listen(btn, "click", __lambda_1);

The hoisted-declaration strategy is debuggable, has no closure-capture surprises, and does not require a string-buffer mode in codegen. The generated name appears in stack traces.

try/catch

try {
    let result = risky_call()
} catch (err: Any) {
    show_error(err)
}

Emits JS try { ... } catch (err) { ... } directly. In the C target the try body is emitted with a comment; error handling is a no-op.

Method call on Any-typed values

When the method name in a call is not a known El runtime shortform (append, len, get, map_get, map_set), the call emits as a direct JS method invocation:

let client: Any = get_client()
let resp = client.auth.signInWithOtp(opts)

Emits:

let client = get_client();
let resp = client["auth"].signInWithOtp(opts);

Field access uses bracket notation (client["auth"]), which works on both plain El map objects and real JS objects with prototype-inherited properties.
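The two properties this emit shape relies on, prototype lookup through bracket notation and a preserved this binding on the method call, can be checked with stand-in classes. FakeClient and FakeAuth are hypothetical and only mimic the shape of a client object with an auth getter.

```javascript
// Stand-in classes; they only mimic the shape of a client with an `auth` getter.
class FakeAuth {
  constructor() { this.calls = 0; }
  signInWithOtp(opts) {
    this.calls += 1; // relies on `this` being the FakeAuth instance
    return { email: opts["email"], calls: this.calls };
  }
}
class FakeClient {
  constructor() { this._auth = new FakeAuth(); }
  get auth() { return this._auth; } // defined on the prototype, not the instance
}

const client = new FakeClient();
// `auth` is not an own property, yet bracket access still finds it.
console.log(Object.prototype.hasOwnProperty.call(client, "auth")); // false
const resp = client["auth"].signInWithOtp({ email: "a@example.com" });
console.log(resp.calls); // 1
```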

URL imports

import "https://cdn.jsdelivr.net/npm/@supabase/supabase-js@2/dist/umd/supabase.js"

In module mode: import "https://..."; at the top of the generated file.
In bundle/IIFE mode: // external: https://... comment.
El source imports (.el files) are excluded -- they were already inlined by resolve_imports.


8. CLI dispatch — --target=js

The compiler entry point compiler.el adds a compile_js(source: String) -> String alongside the existing compile(). The CLI behavior:

elc <source.el> <output>          # default — emit C
elc --target=c <source.el> <out>  # explicit — emit C
elc --target=js <source.el> <out> # emit JS

elc --target=js source.el         # write JS to stdout (no out path)

The argv parser scans for a --target=<lang> token; remaining positional args are <source> and optional <out>. The dispatch logic stays in El: a compile_dispatch(target, source) -> String switch.
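The scan described above could look roughly like this. It is written in JS for illustration (the real dispatch stays in El), parse_args is a hypothetical name, and the sketch handles only --target; the other flags from §8a are left out.

```javascript
// Sketch of the argument scan; names and error text are illustrative.
function parse_args(argv) {
  let target = "c"; // default target
  const positional = [];
  for (const arg of argv) {
    if (arg.startsWith("--target=")) {
      target = arg.slice("--target=".length);
    } else {
      positional.push(arg);
    }
  }
  if (positional.length < 1 || positional.length > 2) {
    throw new Error("usage: elc [--target=<lang>] <source.el> [<out>]");
  }
  if (target !== "c" && target !== "js") {
    throw new Error("unknown target: " + target);
  }
  // out === null means: write to stdout
  return { target, source: positional[0], out: positional[1] ?? null };
}

console.log(parse_args(["--target=js", "source.el"]));
console.log(parse_args(["source.el", "out.c"]));
```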


8a. Production output — --minify and --obfuscate

Two post-processing flags produce production-ready browser JS in a single compiler invocation, replacing any external post-processing scripts.

Usage

elc --target=js --bundle --minify source.el > output.min.js
elc --target=js --bundle --obfuscate source.el > output.obf.js
elc --target=js --bundle --minify --obfuscate source.el > output.final.js

Both flags require --target=js. Passing either without --target=js prints an error and exits with code 1.

--obfuscate implies --minify — obfuscating unminified code produces no benefit and only increases output size.

Pipeline order

generate JS  ->  (if --bundle, wrap in IIFE)  ->  (if --minify, run terser)  ->  (if --obfuscate, run javascript-obfuscator)  ->  output
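The flag rules and the stage order can be summarized as a small planning function. This is a sketch in JS for illustration; plan_pipeline and the stage labels are hypothetical, and the error text is not the compiler's.

```javascript
// Sketch: resolve flag interactions and return the ordered stage list.
function plan_pipeline({ target, bundle = false, minify = false, obfuscate = false }) {
  if ((minify || obfuscate) && target !== "js") {
    throw new Error("--minify and --obfuscate require --target=js");
  }
  if (obfuscate) minify = true; // --obfuscate implies --minify
  const stages = ["generate"];
  if (bundle) stages.push("bundle-iife");
  if (minify) stages.push("terser");
  if (obfuscate) stages.push("javascript-obfuscator");
  stages.push("output");
  return stages;
}

console.log(plan_pipeline({ target: "js", bundle: true, obfuscate: true }).join(" -> "));
// generate -> bundle-iife -> terser -> javascript-obfuscator -> output
```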

Tool discovery

The compiler looks for each tool in this order:

  1. <src_dir>/node_modules/.bin/<tool> — local install next to source file
  2. <src_dir>/../node_modules/.bin/<tool> — one level up (monorepo layout)
  3. npx --yes <tool> — fall back to npx (uses globally cached package or downloads on first use)

If no path resolves and npx is not on PATH, the compiler prints a clear error and exits non-zero:

el-compiler: error: terser not found. Run 'npm install terser' in your project directory.
el-compiler: error: javascript-obfuscator not found. Run 'npm install javascript-obfuscator' in your project directory.
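The lookup order can be sketched as a pure function. The exists predicate and the has_npx flag are injected here so the order can be shown without touching the file system; resolve_tool is a hypothetical name and the real compiler implements this in El.

```javascript
// Sketch: `exists` and `has_npx` are injected so the lookup order can be
// demonstrated without real file-system access.
function resolve_tool(src_dir, tool, exists, has_npx) {
  const candidates = [
    src_dir + "/node_modules/.bin/" + tool,    // 1. next to the source file
    src_dir + "/../node_modules/.bin/" + tool, // 2. one level up
  ];
  for (const path of candidates) {
    if (exists(path)) return path;
  }
  if (has_npx) return "npx --yes " + tool;     // 3. npx fallback
  throw new Error(
    "el-compiler: error: " + tool + " not found. Run 'npm install " + tool +
    "' in your project directory."
  );
}

const only_parent = (p) => p === "/proj/app/../node_modules/.bin/terser";
console.log(resolve_tool("/proj/app", "terser", only_parent, true));
console.log(resolve_tool("/proj/app", "terser", () => false, true));
```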

Minification (terser)

Command issued internally:

terser <tmpfile> --compress passes=2,drop_console=false,drop_debugger=true \
  --mangle 'reserved=[<reserved>]' --output <tmpfile.min>

Obfuscation (javascript-obfuscator)

Command issued internally (runs after minification):

javascript-obfuscator <input> --output <output> \
  --compact true \
  --simplify true \
  --string-array true \
  --string-array-encoding base64 \
  --string-array-threshold 0.75 \
  --identifier-names-generator hexadecimal \
  --rename-globals false \
  --self-defending false \
  --reserved-names <reserved>

Reserved names

These identifiers are protected from renaming by both tools. They are referenced directly from HTML onclick= attributes and other global-scope callsites:

neuronDemoToggle, neuronDemoSend, neuronDemoReset,
signInWith, signInWithEmail, signUpWithEmail, sendMagicLink,
signOut, resetPassword, sendResetEmail, updatePassword,
showSignIn, showSignUp, hideReset,
setSort, addFamilyMember, removeFamilyMember, copyForPlatform, entHeadcountChange,
NEURON_CFG
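How the same list might be rendered into the <reserved> placeholder of each command is sketched below. The formats are assumptions based on the two tools' documented CLI conventions (a quoted array for terser's mangle option, a comma-separated list for javascript-obfuscator); the helper names are hypothetical. javascript-obfuscator may interpret entries as patterns, so stricter anchoring could be wanted; that is left out here.

```javascript
// A subset of the reserved list, for illustration.
const reserved = ["signInWith", "signOut", "NEURON_CFG"];

// Assumed terser form: quoted names, comma-separated, inside reserved=[...]
function terser_mangle_arg(names) {
  return "reserved=[" + names.map((n) => "'" + n + "'").join(",") + "]";
}

// Assumed javascript-obfuscator form: plain comma-separated list
function obfuscator_reserved_arg(names) {
  return names.join(",");
}

console.log(terser_mangle_arg(reserved));
// reserved=['signInWith','signOut','NEURON_CFG']
console.log(obfuscator_reserved_arg(reserved));
// signInWith,signOut,NEURON_CFG
```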

Temp files

The compiler uses /tmp/elc-<pid>-<timestamp>.js naming for temp files. All temp files are cleaned up on both success and failure paths.

Implementation notes

  • The compiler adds stdout_to_file(path) / stdout_restore() builtins to the C runtime (el_runtime.c) to capture codegen output (which is streamed via println) into a temp file before passing it to the external tools.
  • --minify and --obfuscate error messages are printed after stdout is restored, so they always reach the terminal regardless of output redirection.

9. The path to compiling el-ui/runtime through this backend

This is the real-world test. el-ui/runtime/src/ is currently 5 hand-written .js files. The path to authoring them in El:

  1. Phase 1 — Hello-world. DONE.
  2. Phase 2 — Language coverage. DONE. match, struct/enum field access, ?-propagation, for-over-list, complete operators.
  3. Phase 3 — DOM bridge. DONE. Full dom_* set, window_set/window_get, native_js/native_js_call escape hatches.
  4. Phase 4 — Production output. DONE. --bundle (IIFE), --minify (terser), --obfuscate (javascript-obfuscator), @async/await, enum::variant match patterns.
  5. Phase 5 — Full JS expression coverage. DONE. This is the phase documented in this revision.
    • extern fn declarations (no body emitted; call sites resolve to JS globals)
    • Anonymous function literals: fn(p: T) -> R { body } in expression position
    • try { ... } catch (name: T) { ... } statement
    • Method call on Any-typed values: client.auth.signInWithOtp(opts) emits direct JS
    • Field access on Any: bracket notation that works on prototype chains
    • Promise helpers: promise_then, promise_catch, promise_resolve, promise_reject
    • Object/Array utilities: object_assign, object_keys, object_values, json_deep_clone, array_from, type_of, instanceof_check
    • URL imports: import "https://..." emits ES module import
    • Proof: examples/browser-auth.el -- complete Supabase auth flow with zero native_js or native_js_call
  6. Phase 6 — Port el-ui/runtime/. Translate the 5 JS files to El, compile to JS, swap in. Run el-ui's existing tests. The language is now expressive enough for this.
  7. Phase 7 — Port cgi-studio UI. Larger surface area; same pattern.
  8. Phase 8 — Marketplace plugins. Open the door for third-party UI El.

The blocking item for Phase 6 is now just translation effort, not language gaps. Phase 5 removed the last structural barriers.


10. Test

echo 'fn main() -> Void { println("hello from el-js") }' > /tmp/hello.el
elc --target=js /tmp/hello.el > /tmp/hello.js
node /tmp/hello.js
# → hello from el-js

This should pass after the bootstrap rebuild. See §11.


11. Bootstrap status

Adding the --target=js dispatch to the compiler requires regenerating the shipped elc binary at dist/platform/elc. The rebuild path is:

  1. Existing elc binary compiles updated elc-combined.el (which now includes codegen-js.el and the --target=js dispatch) → elc.c.
  2. cc compiles elc.c → new elc binary.
  3. New elc binary supports --target=js.

All four scaffold files are checked in. The bootstrap rebuild happens as a follow-up step, gated on review of this design doc.