Compare commits

8 Commits

Author SHA1 Message Date
Will Anderson 9d0e1f64d4 fix elb: cp instead of mv for .elh files, preserves headers for downstream modules 2026-05-03 01:19:58 -05:00
Will Anderson e180baf776 fix looks_like_string for empty strings and UTF-8, add cross-module includes in codegen 2026-05-03 00:27:20 -05:00
Will Anderson 3d71db4958 Fix O(n²) string construction in codegen-js, lexer, parser, elb
Replace accumulate-by-concatenation loops with native_list_append + str_join.
Eliminates quadratic memory growth when processing large source files.
This is the v2 compiler state — what produced /tmp/elc-v2.
2026-05-02 22:35:49 -05:00
Will Anderson 2a211992d4 Document bootstrapping path and language architecture 2026-05-02 21:23:22 -05:00
Will Anderson a084feb812 Add separate compilation: extern fn, --emit-header, elb build coordinator 2026-05-02 21:10:44 -05:00
Will Anderson 64e870c207 add El SDK CI/CD pipeline and install script
- .gitea/workflows/sdk-release.yaml: build elc from bootstrap, run tests,
  publish latest release, dispatch el-sdk-updated to downstream repos
- install.sh: one-command El SDK install from Gitea release
2026-05-02 17:45:56 -05:00
Will Anderson 19abc599ec tag self-host snapshot elc.20260502-1916-self-host — gen2==gen3, all tests pass 2026-05-02 14:16:30 -05:00
Will Anderson beddf9acc2 fix: restore self-host fixed point after calendar type additions
elc-combined.el had drifted from el-compiler/src/ across three separate
commits that never synced the bundled flat file:

1. 13948f5 - fold fn main() body into C int main() + _argc/_argv rename
   (codegen.el updated, elc-combined.el not updated)
2. 742bd0b - bare reassignment Assign AST node
   (parser.el + codegen.el updated, elc-combined.el not updated)
3. ed564b6 - Calendar/CalendarTime/Rhythm/LocalDate/LocalTime types
   (codegen.el updated, elc-combined.el not updated)

The drift meant that the elc binary (which embeds the correct logic) could
compile test programs correctly, but a fresh self-host pass using gen2 (built
from the stale elc-combined.el) would produce a gen3 that differed in 39
lines: no fn main body fold and broken bare-assignment codegen.

Fix: regenerate elc-combined.el as a flat concatenation of the current
lexer.el + parser.el + codegen.el + codegen-js.el + compiler.el source
files. Self-host fixed point verified: gen2 == gen3 byte-identical at
6450 lines.

Also rebuild dist/platform/elc and dist/platform/elc.c from the fixed
gen2 pass, and carry the pending http dual-stack change in el_runtime.c.

All tests pass: time (6/6), calendar (10/10), text (8/8), html_sanitizer (29/29).
2026-05-02 14:14:52 -05:00
33 changed files with 23741 additions and 292 deletions
Vendored
BIN
Binary file not shown.
+166
@@ -0,0 +1,166 @@
name: El SDK Release

on:
  push:
    branches:
      - main

jobs:
  build-and-release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Install build dependencies
        run: |
          apt-get update -qq
          apt-get install -y gcc libcurl4-openssl-dev

      # Gen2: compile the bootstrap C source into a working elc binary
      - name: Build elc from bootstrap (gen2)
        run: |
          gcc -O2 \
            -I el-compiler/runtime \
            dist/elc-bootstrap.c \
            el-compiler/runtime/el_runtime.c \
            -lcurl -lpthread \
            -o dist/elc-gen2
          chmod +x dist/elc-gen2
          echo "gen2 elc built"
          dist/elc-gen2 --version || true

      # Gen3: use gen2 to compile the El compiler from its own El source (self-host)
      # The step name is quoted because it contains ": ", which is not allowed in a plain YAML scalar.
      - name: "Self-host: compile El compiler with gen2 (gen3)"
        run: |
          mkdir -p dist/platform
          dist/elc-gen2 el-compiler/src/compiler.el > dist/elc-gen3.c
          gcc -O2 \
            -I el-compiler/runtime \
            dist/elc-gen3.c \
            el-compiler/runtime/el_runtime.c \
            -lcurl -lpthread \
            -o dist/platform/elc
          chmod +x dist/platform/elc
          echo "gen3 (self-hosted) elc built"
          dist/platform/elc --version || true

      # Run all four test suites with gen3 elc
      - name: Run tests — text
        run: |
          ELC="$(pwd)/dist/platform/elc" \
          EL_HOME="$(pwd)" \
          bash tests/text/run.sh

      - name: Run tests — calendar
        run: |
          ELC="$(pwd)/dist/platform/elc" \
          EL_HOME="$(pwd)" \
          bash tests/calendar/run.sh

      - name: Run tests — time
        run: |
          ELC="$(pwd)/dist/platform/elc" \
          EL_HOME="$(pwd)" \
          bash tests/time/run.sh

      - name: Run tests — html_sanitizer
        run: |
          ELC="$(pwd)/dist/platform/elc" \
          EL_HOME="$(pwd)" \
          bash tests/html_sanitizer/run.sh

      # Publish / update the `latest` release with the three SDK assets
      - name: Publish latest release
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
          GITEA_API: https://git.neuralplatform.ai/api/v1
          REPO: neuron-technologies/el
        run: |
          # Delete existing `latest` release if it exists
          EXISTING_ID=$(curl -sf \
            -H "Authorization: token ${GITEA_TOKEN}" \
            "${GITEA_API}/repos/${REPO}/releases/tags/latest" \
            | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['id'])" 2>/dev/null || true)
          if [ -n "${EXISTING_ID}" ]; then
            echo "Deleting existing release id=${EXISTING_ID}"
            curl -sf -X DELETE \
              -H "Authorization: token ${GITEA_TOKEN}" \
              "${GITEA_API}/repos/${REPO}/releases/${EXISTING_ID}"
          fi

          # Delete and re-create the `latest` tag so it points at HEAD
          curl -sf -X DELETE \
            -H "Authorization: token ${GITEA_TOKEN}" \
            "${GITEA_API}/repos/${REPO}/tags/latest" || true

          # Create the release
          RELEASE_ID=$(curl -sf -X POST \
            -H "Authorization: token ${GITEA_TOKEN}" \
            -H "Content-Type: application/json" \
            "${GITEA_API}/repos/${REPO}/releases" \
            -d "{
              \"tag_name\": \"latest\",
              \"name\": \"El SDK (latest)\",
              \"body\": \"Latest El SDK build from commit ${GITHUB_SHA}.\nBuilt $(date -u +%Y-%m-%dT%H:%M:%SZ).\",
              \"draft\": false,
              \"prerelease\": false
            }" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
          echo "Created release id=${RELEASE_ID}"

          # Upload assets
          upload_asset() {
            local filepath="$1"
            local name="$2"
            echo "Uploading ${name}..."
            curl -sf -X POST \
              -H "Authorization: token ${GITEA_TOKEN}" \
              -F "attachment=@${filepath};filename=${name}" \
              "${GITEA_API}/repos/${REPO}/releases/${RELEASE_ID}/assets"
          }
          upload_asset dist/platform/elc elc
          upload_asset el-compiler/runtime/el_runtime.c el_runtime.c
          upload_asset el-compiler/runtime/el_runtime.h el_runtime.h
          echo "Release published successfully"

      # Dispatch el-sdk-updated event to downstream repos
      - name: Dispatch to foundation/engram
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
          GITEA_API: https://git.neuralplatform.ai/api/v1
        run: |
          curl -sf -X POST \
            -H "Authorization: token ${GITEA_TOKEN}" \
            -H "Content-Type: application/json" \
            "${GITEA_API}/repos/neuron-technologies/engram/dispatches" \
            -d "{
              \"type\": \"el-sdk-updated\",
              \"inputs\": {
                \"el_version\": \"latest\",
                \"commit\": \"${GITHUB_SHA}\"
              }
            }"
          echo "Dispatched el-sdk-updated to foundation/engram"

      - name: Dispatch to neuron-technologies/forge
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
          GITEA_API: https://git.neuralplatform.ai/api/v1
        run: |
          curl -sf -X POST \
            -H "Authorization: token ${GITEA_TOKEN}" \
            -H "Content-Type: application/json" \
            "${GITEA_API}/repos/neuron-technologies/forge/dispatches" \
            -d "{
              \"type\": \"el-sdk-updated\",
              \"inputs\": {
                \"el_version\": \"latest\",
                \"commit\": \"${GITHUB_SHA}\"
              }
            }"
          echo "Dispatched el-sdk-updated to neuron-technologies/forge"
+979
@@ -0,0 +1,979 @@
# El Language Bootstrap Guide
This document is the authoritative guide for reconstructing the El compiler toolchain from scratch. If the bootstrap binary at `dist/platform/elc` is ever lost, this document is the path back.
---
## 1. The Bootstrap Chain (Current State)
### The Trust Chain
El is a self-hosting language. The compiler is written in El. This creates a circular dependency: you need an El compiler to compile the El compiler. The chain is resolved by a seed binary:
```
dist/platform/elc            (Mach-O arm64 native binary)
    |  compiles elc-cli.el
    v
new self-hosted elc binary
    |  compiles itself again (identity check)
    v
stable self-hosted compiler
The binary at `dist/platform/elc` is a **Mach-O 64-bit arm64 executable**. The `elc.preselfhost` and `elc.legacy` files in the same directory are older snapshots kept as fallback checkpoints.
The key property: every binary in `dist/platform/` was produced by compiling the El source in `el-compiler/src/` using a previous version of that same binary. The chain is auditable: the source is the ground truth, not the binary.
### The Self-Hosting Pipeline
```
elc-cli.el
  imports → el-compiler/src/compiler.el
    imports → el-compiler/src/lexer.el
    imports → el-compiler/src/parser.el
    imports → el-compiler/src/codegen.el
    imports → el-compiler/src/codegen-js.el
Import resolution is textual. `compiler.el` recursively inlines all imported `.el` files before lex/parse. The result is one large unified source string that the compiler then processes in a single pass.
`elc-combined.el` in the repo root is a pre-merged single-file edition used during early bootstrap iterations.
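The textual import resolution described above can be modelled in a few lines. This is a minimal Python sketch, not the code in `compiler.el`: the `import "file.el"` line shape is taken from the `Import` statement node, while the in-memory `sources` dict, the `inline_imports` name, and the rule that a file is inlined at most once are assumptions made for illustration.

```python
import re

# Assumed import form: a whole line of the shape  import "file.el"
IMPORT_RE = re.compile(r'^\s*import\s+"([^"]+)"\s*$')

def inline_imports(path, sources, seen=None):
    """Return the text of `path` with each import line replaced by the imported text."""
    if seen is None:
        seen = set()
    if path in seen:
        # Assumption: a file that was already inlined contributes nothing the second time.
        return ""
    seen.add(path)
    out = []
    for line in sources[path].split("\n"):
        m = IMPORT_RE.match(line)
        if m:
            out.append(inline_imports(m.group(1), sources, seen))
        else:
            out.append(line)
    return "\n".join(out)

# Hypothetical three-file program held in memory instead of on disk.
sources = {
    "main.el": 'import "lexer.el"\nimport "parser.el"\nfn main() { }',
    "lexer.el": "fn lex() { }",
    "parser.el": 'import "lexer.el"\nfn parse() { }',
}
unified = inline_imports("main.el", sources)
```

The result is a single source string with dependencies placed before their users, which is the shape a single-pass lexer and parser needs.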
### What the Bootstrap Binary Actually Is
The `dist/platform/elc` binary is a compiled El program that was produced by running an earlier version of itself on `elc-cli.el`. It is not a Rust binary. The `elc.legacy` and `elc.preselfhost` checkpoints suggest the chain has been continuously self-hosting and re-stamped. The original genesis compiler (referenced in the language spec as a "Rust genesis compiler") was used to produce the first self-hosted binary; that Rust binary is not present in this repo.
To rebuild the current binary from source using the current binary:
```bash
cd /path/to/el
./dist/platform/elc elc-cli.el elc-new.c
cc -std=c11 -I el-compiler/runtime \
-o dist/platform/elc-new \
elc-new.c el-compiler/runtime/el_runtime.c \
-lcurl -lpthread
```
Verify self-hosting by using `elc-new` to recompile itself and diffing the outputs.
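The identity check amounts to comparing two generated C files and, if they differ, showing where. A minimal Python sketch of that comparison follows; the function name and the `gen2.c` / `gen3.c` labels are illustrative assumptions, and in practice a plain `diff` on the two files does the same job.

```python
import difflib

def fixed_point_report(gen_a: str, gen_b: str):
    """Return (is_identical, diff_lines) for two generated C sources."""
    if gen_a == gen_b:
        return True, []
    diff = list(difflib.unified_diff(
        gen_a.splitlines(), gen_b.splitlines(),
        fromfile="gen2.c", tofile="gen3.c", lineterm=""))
    return False, diff

# Identical outputs mean the compiler reproduces itself.
same, _ = fixed_point_report("int x;\nint y;", "int x;\nint y;")
# Any drift shows up as a non-empty unified diff.
drifted, report = fixed_point_report("int x;\nint y;", "int x;\nint z;")
```

A non-empty report is the signal that the source tree and the binary that compiled it have drifted apart.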
---
## 2. The Language
### 2.1 Lexical Structure
El source is UTF-8. File extension `.el`. Comments are single-line only: `//` to end of line.
**Token representation:** every token is a map `{ "kind": String, "value": String }`.
**Keywords** — from `keyword_kind()` in `lexer.el`:
| Keyword | Token Kind | Notes |
|---------|-----------|-------|
| `let` | `Let` | variable binding |
| `fn` | `Fn` | function definition |
| `type` | `Type` | struct definition |
| `enum` | `Enum` | enum definition |
| `match` | `Match` | pattern match |
| `return` | `Return` | function return |
| `if` | `If` | conditional |
| `else` | `Else` | |
| `for` | `For` | iteration |
| `in` | `In` | used in `for x in list` |
| `while` | `While` | loop |
| `import` | `Import` | module import |
| `from` | `From` | `from mod import { Name }` |
| `as` | `As` | (reserved, no parse form) |
| `with` | `With` | (reserved) |
| `sealed` | `Sealed` | (reserved) |
| `activate` | `Activate` | (reserved) |
| `where` | `Where` | (reserved) |
| `test` | `Test` | (reserved) |
| `seed` | `Seed` | (reserved) |
| `assert` | `Assert` | (reserved) |
| `protocol` | `Protocol` | (reserved) |
| `impl` | `Impl` | (reserved) |
| `retry` | `Retry` | reserved / soft keyword in expr position |
| `times` | `Times` | reserved / soft keyword |
| `fallback` | `Fallback` | reserved / soft keyword |
| `reason` | `Reason` | reserved / soft keyword |
| `parallel` | `Parallel` | reserved / soft keyword |
| `trace` | `Trace` | reserved / soft keyword |
| `requires` | `Requires` | reserved / soft keyword |
| `deploy` | `Deploy` | reserved / soft keyword |
| `to` | `To` | reserved / soft keyword |
| `via` | `Via` | reserved / soft keyword |
| `target` | `Target` | **RESERVED — cannot use as identifier** |
| `true` | `Bool` | literal value `true` |
| `false` | `Bool` | literal value `false` |
| `cgi` | `Cgi` | CGI identity block |
| `service` | `Service` | service declaration block |
| `manager` | `Manager` | VBD role decorator / soft keyword |
| `engine` | `Engine` | VBD role decorator / soft keyword |
| `accessor` | `Accessor` | VBD role decorator / soft keyword |
| `vessel` | `Vessel` | soft keyword |
| `extern` | `Extern` | `extern fn` forward declaration |
**Soft keywords** (`to`, `via`, `deploy`, `reason`, `times`, `fallback`, `retry`, `parallel`, `trace`, `requires`, `where`, `as`, `with`, `manager`, `engine`, `accessor`, `vessel`): these have dedicated token kinds but the parser re-interprets them as `Ident` nodes when they appear in expression position (e.g., as parameter names or local variable names). `target` is the exception: it also has a dedicated token kind, but per the keyword table and section 2.4 it cannot be used as an identifier.
**All token kinds:**
| Kind | Pattern |
|------|---------|
| `Int` | `[0-9]+` |
| `Float` | `[0-9]+ '.' [0-9]+` |
| `Str` | `"…"` with `\"`, `\n`, `\t`, `\r`, `\\` escapes |
| `Bool` | `true` or `false` |
| `Ident` | `[a-zA-Z_][a-zA-Z0-9_]*` (not a keyword) |
| keyword tokens | one per keyword above |
| `Eq` | `=` |
| `EqEq` | `==` |
| `NotEq` | `!=` |
| `Not` | `!` |
| `Lt` / `LtEq` / `Gt` / `GtEq` | `<` `<=` `>` `>=` |
| `And` | `&&` (single `&` is consumed and discarded) |
| `Or` | `\|\|` |
| `Pipe` | `\|` |
| `PipeOp` | `\|>` |
| `Plus` / `Minus` / `Star` / `Slash` | `+` `-` `*` `/` |
| `Percent` | `%` |
| `Arrow` | `->` |
| `FatArrow` | `=>` |
| `Colon` / `ColonColon` | `:` `::` |
| `LParen` / `RParen` | `(` `)` |
| `LBrace` / `RBrace` | `{` `}` |
| `LBracket` / `RBracket` | `[` `]` |
| `Comma` / `Dot` / `Semicolon` | `,` `.` `;` |
| `At` | `@` |
| `QuestionMark` | `?` |
| `Eof` | end-of-input sentinel |
**String comment stripping:** the lexer contains a special heuristic for string literals that embed JavaScript or CSS (`looks_like_code`). If a string contains `<script`, `<style`, or `function` + `;`, the lexer strips `//` and `/* */` comments from the string value before producing the `Str` token. This is a compile-time content sanitization pass.
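A minimal Python sketch of this heuristic, assuming the three triggers are plain substring tests and that stripping is a naive regex pass. Only the name `looks_like_code` comes from the text; `strip_embedded_comments` and both regular expressions are hypothetical, and the real lexer may treat edge cases (for example `//` inside a URL) differently.

```python
import re

def looks_like_code(s: str) -> bool:
    """True when the string appears to embed JavaScript or CSS."""
    return "<script" in s or "<style" in s or ("function" in s and ";" in s)

BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)  # /* ... */ across lines
LINE_COMMENT = re.compile(r"//[^\n]*")               # // to end of line

def strip_embedded_comments(s: str) -> str:
    """Strip comments only from strings that look like embedded code."""
    if not looks_like_code(s):
        return s
    s = BLOCK_COMMENT.sub("", s)
    return LINE_COMMENT.sub("", s)

plain = strip_embedded_comments("see // this")
script = strip_embedded_comments("<script>var a = 1; // note\nvar b; /* x */</script>")
```

Ordinary strings pass through unchanged, so the cost of the heuristic is limited to strings that trip one of the three triggers.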
### 2.2 AST Node Types
Every AST node is a `Map<String, Any>`. The `"expr"` or `"stmt"` key names the node type.
**Expression nodes:**
| `expr` value | Fields | Meaning |
|-------------|--------|---------|
| `Int` | `value: String` | integer literal |
| `Float` | `value: String` | float literal |
| `Str` | `value: String` | string literal |
| `Bool` | `value: String` | `"true"` or `"false"` |
| `Nil` | — | null / missing |
| `Ident` | `name: String` | identifier reference |
| `BinOp` | `op: String`, `left`, `right` | binary operation |
| `Not` | `inner` | unary `!` |
| `Neg` | `inner` | unary `-` |
| `Call` | `func`, `args: [expr]` | function call |
| `Field` | `object`, `field: String` | `obj.field` |
| `Index` | `object`, `index` | `obj[idx]` |
| `Array` | `elems: [expr]` | `[e1, e2, …]` |
| `Map` | `pairs: [{ key: String, value: expr }]` | `{ "k": v, … }` |
| `If` | `cond`, `then: [stmt]`, `else: [stmt]`, `has_else: Bool` | conditional expression |
| `For` | `item: String`, `list`, `body: [stmt]` | for-in expression |
| `Match` | `subject`, `arms: [{ pattern, body }]` | pattern match |
| `DurationLit` | `count: String`, `unit: String` | `30.seconds`, `1.hour` |
| `Try` | `inner` | postfix `?` (no-op passthrough today) |
**Binary operators** (`op` field values): `Plus`, `Minus`, `Star`, `Slash`, `EqEq`, `NotEq`, `Lt`, `Gt`, `LtEq`, `GtEq`, `And`, `Or`.
**Operator precedence** (higher = tighter binding):
| Level | Operators |
|-------|-----------|
| 6 | `Star`, `Slash` |
| 5 | `Plus`, `Minus` |
| 4 | `Lt`, `Gt`, `LtEq`, `GtEq` |
| 3 | `EqEq`, `NotEq` |
| 2 | `And` |
| 1 | `Or` |
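One common way to implement a table like this is precedence climbing. The Python sketch below uses the levels from the table and the `{ "kind", "value" }` token shape from section 2.1; whether `parser.el` uses this algorithm, and the left-associativity chosen here, are assumptions.

```python
PRECEDENCE = {
    "Star": 6, "Slash": 6,
    "Plus": 5, "Minus": 5,
    "Lt": 4, "Gt": 4, "LtEq": 4, "GtEq": 4,
    "EqEq": 3, "NotEq": 3,
    "And": 2,
    "Or": 1,
}

def parse_atom(tokens, pos):
    """Parse a single Int or Ident token into an expression node."""
    t = tokens[pos]
    if t["kind"] == "Int":
        return {"expr": "Int", "value": t["value"]}, pos + 1
    if t["kind"] == "Ident":
        return {"expr": "Ident", "name": t["value"]}, pos + 1
    raise SyntaxError("unexpected token " + t["kind"])

def parse_expr(tokens, pos=0, min_level=1):
    """Parse tokens[pos:] and return (ast, next_pos)."""
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens):
        kind = tokens[pos]["kind"]
        level = PRECEDENCE.get(kind)
        if level is None or level < min_level:
            break
        # Requiring level + 1 on the right makes equal-level operators left-associative.
        right, pos = parse_expr(tokens, pos + 1, level + 1)
        left = {"expr": "BinOp", "op": kind, "left": left, "right": right}
    return left, pos

def tok(kind, value=""):
    return {"kind": kind, "value": value}

# 1 + 2 * 3 < 10  parses as  ((1 + (2 * 3)) < 10)
ast, _ = parse_expr([tok("Int", "1"), tok("Plus"), tok("Int", "2"), tok("Star"),
                     tok("Int", "3"), tok("Lt"), tok("Int", "10")])
# 8 - 3 - 2  parses as  ((8 - 3) - 2)
chain, _ = parse_expr([tok("Int", "8"), tok("Minus"), tok("Int", "3"),
                       tok("Minus"), tok("Int", "2")])
```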
**Pattern nodes** (used inside `Match` arms):
| `pattern` value | Fields | Meaning |
|----------------|--------|---------|
| `Wildcard` | — | `_` — always matches |
| `Binding` | `name: String` | binds subject to name |
| `LitInt` | `value: String` | integer literal pattern |
| `LitStr` | `value: String` | string literal pattern |
| `LitBool` | `value: String` | boolean literal pattern |
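To make the five pattern kinds concrete, here is a small Python model of arm selection. The node shapes follow the table; trying arms in order with the first match winning, and the helper names `match_pattern` and `select_arm`, are assumptions rather than documented behaviour.

```python
def match_pattern(pattern, subject):
    """Return a dict of new bindings if `pattern` matches `subject`, else None."""
    kind = pattern["pattern"]
    if kind == "Wildcard":
        return {}
    if kind == "Binding":
        return {pattern["name"]: subject}
    if kind == "LitInt":
        is_int = isinstance(subject, int) and not isinstance(subject, bool)
        return {} if is_int and subject == int(pattern["value"]) else None
    if kind == "LitStr":
        return {} if isinstance(subject, str) and subject == pattern["value"] else None
    if kind == "LitBool":
        want = pattern["value"] == "true"
        return {} if isinstance(subject, bool) and subject == want else None
    raise ValueError("unknown pattern kind: " + kind)

def select_arm(arms, subject):
    """Return (arm_index, bindings) for the first matching arm, or None."""
    for index, arm in enumerate(arms):
        bindings = match_pattern(arm["pattern"], subject)
        if bindings is not None:
            return index, bindings
    return None

arms = [
    {"pattern": {"pattern": "LitInt", "value": "0"}, "body": []},
    {"pattern": {"pattern": "LitStr", "value": "zero"}, "body": []},
    {"pattern": {"pattern": "Binding", "name": "other"}, "body": []},
]
```

Note that literal values are stored as strings in the AST, so the sketch converts them before comparing.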
**Statement nodes:**
| `stmt` value | Fields | Meaning |
|-------------|--------|---------|
| `Let` | `name: String`, `value: expr`, `type: String` | variable binding |
| `Assign` | `name: String`, `value: expr` | bare reassignment `name = expr` |
| `Return` | `value: expr` | return statement |
| `While` | `cond: expr`, `body: [stmt]` | while loop |
| `For` | `item: String`, `list: expr`, `body: [stmt]` | for-in loop |
| `FnDef` | `name: String`, `params: [param]`, `body: [stmt]`, `ret_type: String`, `decorator?: String` | function definition |
| `ExternFn` | `name: String`, `params: [param]`, `ret_type: String` | forward declaration |
| `TypeDef` | `name: String`, `fields: [{ name: String }]` | struct type definition |
| `EnumDef` | `name: String`, `variants: [{ name: String }]` | enum definition |
| `Import` | `path: String` | `import "file.el"` or `from mod import { … }` |
| `CgiBlock` | `name`, `dharma_id`, `principal`, `network`, `engram`, `has_*: Bool` | CGI identity declaration |
| `ServiceBlock` | `name`, `sponsor`, `domain` | service declaration |
| `Expr` | `value: expr` | bare expression statement |
**Param nodes:** `{ "name": String, "type": String }` where `type` is the leading identifier of the type annotation (e.g., `"Int"`, `"String"`, `"Map"`) or `""` if unannotated.
### 2.3 The Type System
Type annotations are parsed and stored but not type-checked at compile time. They serve as documentation and as hints to the codegen for arithmetic dispatch.
**Built-in types:**
| Type | C representation | Notes |
|------|-----------------|-------|
| `String` | `const char*` cast to `el_val_t` | via `EL_STR()` macro |
| `Int` | `int64_t` | direct |
| `Bool` | `int64_t` | `0` = false, nonzero = true |
| `Float` | `int64_t` | bit-cast double via `el_from_float()` |
| `Void` | `void` | functions returning nothing |
| `Any` | `void*` cast to `el_val_t` | generic containers |
| `[T]` | `el_val_t` | pointer to ElList struct |
| `Map<K,V>` | `el_val_t` | pointer to ElMap struct |
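The `Float` row is the least obvious one: the value keeps the 64 bits of an IEEE-754 double but is carried in an `int64_t`. The Python sketch below illustrates that reinterpretation with `struct`. It assumes `el_from_float()` is a pure bit-cast, as the table's wording suggests; `el_to_float` is a hypothetical name for the inverse.

```python
import struct

def el_from_float(f: float) -> int:
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", f))[0]

def el_to_float(v: int) -> float:
    """Inverse bit-cast: signed 64-bit integer back to a double (hypothetical name)."""
    return struct.unpack("<d", struct.pack("<q", v))[0]

one = el_from_float(1.0)            # 0x3FF0000000000000
round_trip = el_to_float(el_from_float(1.5))
```

Because the integer holds raw bits, ordinary integer arithmetic on it is meaningless, which is why the codegen needs type hints to pick float-aware operations.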
**Temporal types** (first-class in codegen):
| Type | Representation | Notes |
|------|---------------|-------|
| `Instant` | nanoseconds since Unix epoch as `int64_t` | `now()` returns this |
| `Duration` | signed nanoseconds as `int64_t` | `30.seconds` = `30 * 1000000000` |
| `Calendar` | pointer to heap-allocated struct | `earth_calendar(zone)` |
| `CalendarTime` | pointer to heap-allocated struct | `now_in(cal)` |
| `LocalDate` | pointer to heap-allocated struct | `local_date(y, m, d)` |
| `LocalTime` | nanoseconds since midnight, direct `int64_t` | `local_time(h, m, s, ns)` |
| `Zone` | pointer to heap-allocated struct | `zone("America/New_York")` |
| `Rhythm` | pointer to heap-allocated struct | recurrence pattern |
The codegen tracks type-annotated variable names in per-function process state (`__int_names`, `__instant_names`, `__duration_names`, etc.) to dispatch arithmetic and comparisons through the correct runtime wrappers. Type-mismatched operations (e.g., `Instant + Instant`) are emitted as `#error` directives.
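A rough Python model of this dispatch, for identifier operands only. The runtime wrapper names come from the runtime API tables (`el_duration_add` and `el_duration_sub` are expanded from the `el_duration_add/sub/scale/div` row); the `emit_binop` function, the exact set of accepted combinations, and the `#error` message text are assumptions.

```python
WRAPPERS = {
    ("Plus", "Instant", "Duration"): "el_instant_add_dur",
    ("Minus", "Instant", "Duration"): "el_instant_sub_dur",
    ("Minus", "Instant", "Instant"): "el_instant_diff",
    ("Plus", "Duration", "Duration"): "el_duration_add",
    ("Minus", "Duration", "Duration"): "el_duration_sub",
}
C_OPS = {"Plus": "+", "Minus": "-"}

def emit_binop(op, left, right, instant_names, duration_names):
    """Emit C for `left op right`, where both operands are variable names."""
    def kind(name):
        if name in instant_names:
            return "Instant"
        if name in duration_names:
            return "Duration"
        return "Other"
    lk, rk = kind(left), kind(right)
    wrapper = WRAPPERS.get((op, lk, rk))
    if wrapper:
        return f"{wrapper}({left}, {right})"
    if lk != "Other" or rk != "Other":
        # Assumed policy: any other temporal combination is a compile-time error.
        return f'#error "unsupported {op} on {lk} and {rk}"'
    return f"({left} {C_OPS[op]} {right})"

instants, durations = {"start", "end"}, {"timeout"}
```

Emitting `#error` pushes the diagnostic to the C compiler, so a type mismatch fails the build rather than producing a wrong nanosecond value at run time.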
**Duration postfix literals:** `30.seconds`, `1.hour`, `500.millis`, `30.nanos` are parsed as `DurationLit` AST nodes and compiled to `el_duration_from_nanos(count * multiplier)`. The multipliers:
| Unit | Nanoseconds |
|------|------------|
| `nano` / `nanos` | 1 |
| `milli` / `millis` / `millisecond` / `milliseconds` | 1,000,000 |
| `second` / `seconds` | 1,000,000,000 |
| `minute` / `minutes` | 60,000,000,000 |
| `hour` / `hours` | 3,600,000,000,000 |
| `day` / `days` | 86,400,000,000,000 |
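The table translates directly into a lookup. This Python sketch computes the nanosecond count for a `DurationLit` node; the node shape follows section 2.2, while the function name is illustrative.

```python
NANOS_PER_UNIT = {
    "nano": 1, "nanos": 1,
    "milli": 1_000_000, "millis": 1_000_000,
    "millisecond": 1_000_000, "milliseconds": 1_000_000,
    "second": 1_000_000_000, "seconds": 1_000_000_000,
    "minute": 60_000_000_000, "minutes": 60_000_000_000,
    "hour": 3_600_000_000_000, "hours": 3_600_000_000_000,
    "day": 86_400_000_000_000, "days": 86_400_000_000_000,
}

def duration_lit_to_nanos(node):
    """Return count * multiplier for a DurationLit node (count is stored as a string)."""
    return int(node["count"]) * NANOS_PER_UNIT[node["unit"]]

thirty_seconds = duration_lit_to_nanos(
    {"expr": "DurationLit", "count": "30", "unit": "seconds"})
```

Even one day is about 8.6e13 nanoseconds, comfortably inside the signed 64-bit range of roughly 9.2e18.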
### 2.4 Key Language Semantics
**Implicit return.** The final expression in a function body becomes the return value if it is not a control-flow construct (`If`, `For`). The codegen's `transform_implicit_return` rewrites the last `Expr` statement into a `Return` statement before emitting.
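A minimal Python sketch of the rewrite, operating on statement nodes shaped as in section 2.2. The function name matches the one mentioned above, but the body is an illustration of the described rule, not the source of `codegen.el`.

```python
CONTROL_FLOW = {"If", "For"}

def transform_implicit_return(body):
    """Rewrite a trailing Expr statement into a Return, unless it is control flow."""
    if not body:
        return body
    last = body[-1]
    if last.get("stmt") != "Expr":
        return body
    if last["value"].get("expr") in CONTROL_FLOW:
        return body
    return body[:-1] + [{"stmt": "Return", "value": last["value"]}]

# fn f() { let x = 1  x }  becomes  fn f() { let x = 1  return x }
body = [
    {"stmt": "Let", "name": "x", "value": {"expr": "Int", "value": "1"}, "type": ""},
    {"stmt": "Expr", "value": {"expr": "Ident", "name": "x"}},
]
rewritten = transform_implicit_return(body)
```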
**Let-rebinding, not mutation.** El uses `let` for both initial binding and rebinding:
```el
let count = 0
let count = count + 1 // NOT mutation: creates a new binding in the same scope
```
The codegen tracks declared names per C scope. When `count` is already in `declared`, it emits `count = count + 1;` (plain assignment). When it is new, it emits `el_val_t count = 0;`. This means **El does not have mutable variables in the traditional sense** — every `let` is a potential redeclaration. The practical effect is that shadowing and in-place update use identical syntax.
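The declare-or-assign decision can be sketched in Python as follows. The emitted C strings match the two forms quoted above; the `emit_let` name and the use of a Python `set` for the per-scope `declared` collection are assumptions.

```python
def emit_let(name, value_c, declared):
    """Emit a C declaration the first time a name is bound, a plain assignment afterwards."""
    if name in declared:
        return f"{name} = {value_c};"
    declared.add(name)
    return f"el_val_t {name} = {value_c};"

declared = set()                                      # one set per C scope
first = emit_let("count", "0", declared)              # declaration
second = emit_let("count", "count + 1", declared)     # reassignment
```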
**Bare reassignment.** The parser also handles `name = expr` (without `let`) when an `Ident` is immediately followed by `Eq`. This emits a plain C assignment.
**`target` is reserved.** The word `target` is lexed as the `Target` token kind — it cannot be used as a variable or parameter name. Use `tgt` or another name instead. This is a live gotcha in `compiler.el` itself, which uses `tgt` for exactly this reason.
**`__no_block_expr` guard.** The parser uses process state key `__no_block_expr` to suppress Map-literal parsing while it parses the head expression of `if`, `while`, `for`, and `match` (the condition, the iterated list, or the match subject). This prevents the `{` that opens the following block from being parsed as the start of a Map literal.
**Arena memory model.** The runtime includes an arena allocator that is activated in server/long-running contexts. In CLI mode (`elc`, `elb`) the arena is inactive. Memory is managed via ARC (reference counting): `el_retain()` and `el_release()` on Lists and Maps. Strings and ints are not refcounted — the retain/release functions are safe no-ops on non-tagged values.
---
## 3. The Runtime API
All runtime functions are declared in `el-compiler/runtime/el_runtime.h`. Every compiled El program links against `el-compiler/runtime/el_runtime.c`.
All values are `el_val_t` (`int64_t`). Strings are pointers cast through `int64_t` using `EL_STR(s)` / `EL_CSTR(v)` macros.
Canonical compile command:
```bash
cc -std=c11 -I el-compiler/runtime \
-o <out> <prog>.c el-compiler/runtime/el_runtime.c \
-lcurl -lpthread
```
### I/O
| Function | Signature | Description |
|----------|-----------|-------------|
| `println` | `(s) -> Void` | print string + newline to stdout |
| `print` | `(s) -> Void` | print string without newline |
| `readline` | `() -> String` | read one line from stdin |
### String Operations
| Function | Signature | Description |
|----------|-----------|-------------|
| `el_str_concat` | `(a, b) -> String` | concatenate two strings |
| `str_concat` | `(a, b) -> String` | alias for `el_str_concat` |
| `str_eq` | `(a, b) -> Bool` | string equality comparison |
| `str_starts_with` | `(s, prefix) -> Bool` | prefix test |
| `str_ends_with` | `(s, suffix) -> Bool` | suffix test |
| `str_contains` | `(s, sub) -> Bool` | substring test |
| `str_len` | `(s) -> Int` | byte length |
| `str_slice` | `(s, start, end) -> String` | substring (byte offsets) |
| `str_replace` | `(s, from, to) -> String` | replace all occurrences |
| `str_to_upper` / `str_upper` | `(s) -> String` | uppercase |
| `str_to_lower` / `str_lower` | `(s) -> String` | lowercase |
| `str_trim` | `(s) -> String` | strip leading/trailing whitespace |
| `str_lstrip` / `str_rstrip` | `(s) -> String` | one-sided strip |
| `str_index_of` | `(s, sub) -> Int` | position of substring; `-1` if absent |
| `str_last_index_of` | `(s, sub) -> Int` | last position |
| `str_index_of_all` | `(s, sub) -> [Int]` | all byte offsets (non-overlapping) |
| `str_find_chars` | `(s, any_of) -> Int` | first index of any char in set |
| `str_split` | `(s, sep) -> [String]` | split on separator |
| `str_split_lines` | `(s) -> [String]` | split on newlines |
| `str_split_chars` | `(s) -> [String]` | split into individual characters |
| `str_split_n` | `(s, sep, n) -> [String]` | split at most `n` times |
| `str_join` | `(list, sep) -> String` | join list with separator |
| `str_char_at` | `(s, i) -> String` | character at byte index |
| `str_char_code` | `(s, i) -> Int` | Unicode code point at index |
| `str_pad_left` | `(s, width, pad) -> String` | left-pad to width |
| `str_pad_right` | `(s, width, pad) -> String` | right-pad to width |
| `str_format` | `(fmt, data) -> String` | `{key}` interpolation |
| `str_repeat` | `(s, n) -> String` | repeat string n times |
| `str_reverse` | `(s) -> String` | reverse by codepoint |
| `str_strip_prefix` | `(s, prefix) -> String` | remove prefix if present |
| `str_strip_suffix` | `(s, suffix) -> String` | remove suffix if present |
| `str_strip_chars` | `(s, chars) -> String` | strip characters from both ends |
| `str_count` | `(s, sub) -> Int` | count non-overlapping occurrences |
| `str_count_chars` | `(s) -> Int` | codepoint count |
| `str_count_bytes` | `(s) -> Int` | alias for `str_len` |
| `str_count_lines` | `(s) -> Int` | line count |
| `str_count_words` | `(s) -> Int` | word count |
| `str_count_letters` | `(s) -> Int` | ASCII letter count |
| `str_count_digits` | `(s) -> Int` | ASCII digit count |
| `is_letter` / `is_digit` / `is_alphanumeric` | `(s) -> Bool` | ASCII char classification |
| `is_whitespace` / `is_punctuation` | `(s) -> Bool` | |
| `is_uppercase` / `is_lowercase` | `(s) -> Bool` | |
| `int_to_str` | `(n) -> String` | format integer |
| `str_to_int` | `(s) -> Int` | parse integer |
| `str_to_float` | `(s) -> Float` | parse float |
| `parse_int` | `(s, default) -> Int` | parse with fallback |
| `bool_to_str` | `(b) -> String` | format bool |
### Integer/Float Math
| Function | Description |
|----------|-------------|
| `el_abs(n)` | absolute value |
| `el_max(a, b)` | maximum |
| `el_min(a, b)` | minimum |
| `float_to_str(f)` | format float as string |
| `int_to_float(n)` | widen Int to Float |
| `float_to_int(f)` | truncate Float to Int |
| `format_float(f, decimals)` | format with N decimal places |
| `decimal_round(f, decimals)` | round to N decimals |
| `math_sqrt(f)` | square root |
| `math_log(f)` / `math_ln(f)` | logarithms |
| `math_sin(f)` / `math_cos(f)` / `math_pi()` | trigonometry |
### List Operations
| Function | Description |
|----------|-------------|
| `el_list_empty()` | create empty list |
| `el_list_new(count, …)` | create list from N values (varargs) |
| `el_list_len(list)` | length |
| `el_list_get(list, i)` | element at index; `0` on out-of-bounds |
| `el_list_append(list, e)` | append; returns updated list |
| `el_list_clone(list)` | shallow copy |
| `list_push(list, e)` | alias for `el_list_append` |
| `list_push_front(list, e)` | prepend |
| `list_join(list, sep)` | join to string |
| `list_range(start, end)` | integer range `[start, end)` |
| `native_list_empty()` | alias for `el_list_empty` (used in compiler source) |
| `native_list_append(l, v)` | alias for `el_list_append` |
| `native_list_get(l, idx)` | alias for `el_list_get` |
| `native_list_len(l)` | alias for `el_list_len` |
| `native_list_clone(l)` | alias for `el_list_clone` |
| `append(l, e)` | method-call alias: `list.append(e)` |
| `len(l)` | method-call alias: `list.len()` |
| `get(l, i)` | method-call alias: `list.get(i)` |
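The commit message for 3d71db4958 describes replacing accumulate-by-concatenation loops with `native_list_append` plus `str_join`. The Python sketch below shows the two shapes side by side; the list `append` and `"".join` calls stand in for the El helpers, and the function names are illustrative.

```python
def build_by_concat(parts):
    """Accumulate by concatenation: each step may copy everything built so far."""
    out = ""
    for p in parts:
        out = out + p
    return out

def build_by_join(parts):
    """Collect pieces in a list, then join once at the end."""
    pieces = []
    for p in parts:
        pieces.append(p)        # plays the role of native_list_append
    return "".join(pieces)      # plays the role of str_join(list, "")

parts = ["int ", "main", "(", ")", " { }"]
```

Both produce the same string; the second form does a single final copy instead of one copy per iteration, which is what removes the quadratic growth on large inputs.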
### Map Operations
| Function | Description |
|----------|-------------|
| `el_map_new(count, …)` | create map from key/value pairs (varargs) |
| `el_map_get(map, key)` | get value by key |
| `el_map_set(map, key, value)` | set key; returns map |
| `el_get_field(map, key)` | alias; emitted for `.field` access |
| `map_get(map, key)` | method-call alias |
| `map_set(map, key, value)` | method-call alias |
### ARC (Reference Counting)
| Function | Description |
|----------|-------------|
| `el_retain(v)` | increment refcount; no-op for non-heap values |
| `el_release(v)` | decrement refcount; free when zero |
### In-Process State
| Function | Description |
|----------|-------------|
| `state_set(key, value)` | store in process-global key/value table |
| `state_get(key)` | retrieve; `""` if absent |
| `state_del(key)` | delete key |
| `state_keys()` | all keys as `[String]` |
### Filesystem
| Function | Description |
|----------|-------------|
| `fs_read(path)` | read file to string; `""` on error |
| `fs_write(path, content)` | write string; returns `1` on success |
| `fs_write_bytes(path, bytes, length)` | write raw bytes of known length |
| `fs_list(path)` | list directory entries |
| `fs_exists(path)` | check if path exists |
| `fs_mkdir(path)` | mkdir -p |
### HTTP Client
| Function | Description |
|----------|-------------|
| `http_get(url)` | GET; returns body string |
| `http_post(url, body)` | POST; returns body string |
| `http_post_json(url, json_body)` | POST with Content-Type: application/json |
| `http_get_with_headers(url, headers_map)` | GET with custom headers |
| `http_post_with_headers(url, body, headers_map)` | POST with custom headers |
| `http_post_form_auth(url, form_body, auth_header)` | POST with auth |
| `http_delete(url)` | DELETE |
| `http_get_to_file(url, headers_map, output_path)` | stream response to file |
| `http_post_to_file(url, body, headers_map, output_path)` | stream POST response to file |
| `http_response(status, headers_json, body)` | build response envelope |
| `url_encode(s)` | RFC 3986 percent-encoding |
| `url_decode(s)` | URL decode |
| `el_html_sanitize(html, allowlist_json)` | allowlist HTML sanitizer |
### HTTP Server
| Function | Description |
|----------|-------------|
| `http_serve(port, handler)` | start server; handler: `(method, path, body) -> String` |
| `http_serve_v2(port, handler)` | start server; handler: `(method, path, headers_map, body) -> String` |
| `http_set_handler(name)` | set handler by symbol name |
| `http_set_handler_v2(name)` | v2 variant |
### JSON
| Function | Description |
|----------|-------------|
| `json_get(json, key)` | substring lookup of `"key": value` |
| `json_parse(s)` | parse JSON string to List/Map |
| `json_stringify(v)` | serialize Any to JSON string |
| `json_get_string(j, key)` | typed extract: String |
| `json_get_int(j, key)` | typed extract: Int |
| `json_get_float(j, key)` | typed extract: Float |
| `json_get_bool(j, key)` | typed extract: Bool |
| `json_get_raw(j, key)` | extract nested object/array as JSON string |
| `json_set(j, key, value)` | update field, return new JSON string |
| `json_array_len(j)` | length of JSON array string |
| `json_array_get(j, index)` | element at index |
| `json_array_get_string(j, index)` | string element at index |
### Time (Epoch-Based)
| Function | Description |
|----------|-------------|
| `time_now()` | Unix epoch milliseconds |
| `time_now_utc()` | same, explicit UTC |
| `time_format(ts, fmt)` | format timestamp |
| `time_to_parts(ts)` | decompose to Map of fields |
| `time_from_parts(secs, ns, tz)` | construct timestamp |
| `time_add(ts, n, unit)` | add duration |
| `time_diff(ts1, ts2, unit)` | difference |
| `unix_timestamp()` | Unix seconds as Int |
| `sleep_secs(secs)` | sleep N seconds |
| `sleep_ms(ms)` | sleep N milliseconds |
### Time (First-Class Instant/Duration)
| Function | Description |
|----------|-------------|
| `now()` / `el_now_instant()` | current time as Instant (nanoseconds) |
| `unix_seconds(n)` | construct Instant from Unix seconds |
| `unix_millis(n)` | construct Instant from Unix milliseconds |
| `instant_from_iso8601(s)` | parse ISO 8601 string |
| `instant_to_unix_seconds(i)` | extract Unix seconds |
| `instant_to_unix_millis(i)` | extract Unix milliseconds |
| `instant_to_iso8601(i)` | format as ISO 8601 |
| `el_duration_from_nanos(ns)` | construct Duration from nanoseconds |
| `duration_seconds(n)` | Duration from seconds |
| `duration_millis(n)` | Duration from milliseconds |
| `duration_nanos(n)` | Duration from nanoseconds |
| `duration_to_seconds(d)` | extract seconds |
| `duration_to_millis(d)` | extract milliseconds |
| `duration_to_nanos(d)` | extract nanoseconds |
| `el_instant_add_dur(inst, dur)` | Instant + Duration |
| `el_instant_sub_dur(inst, dur)` | Instant - Duration |
| `el_instant_diff(a, b)` | Instant - Instant = Duration |
| `el_duration_add/sub/scale/div` | Duration arithmetic |
| `el_instant_lt/le/gt/ge/eq/ne` | Instant comparison |
| `el_duration_lt/le/gt/ge/eq/ne` | Duration comparison |
| `el_sleep_duration(dur)` | sleep for a Duration |
| `ttl_cache_set(key, value)` | store with TTL |
| `ttl_cache_get(key, max_age)` | retrieve if within max_age |
| `ttl_cache_age(key)` | age of cached value as Duration |
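Since section 2.3 describes `Instant` and `Duration` as plain nanosecond counts, several of these functions reduce to integer arithmetic. The Python model below is a sketch under that assumption; the function names mirror the table without the `el_` prefix, and the rounding of negative instants is left unspecified because the source does not state it.

```python
NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_MILLI = 1_000_000

def unix_seconds(n):
    """Instant from Unix seconds."""
    return n * NANOS_PER_SECOND

def unix_millis(n):
    """Instant from Unix milliseconds."""
    return n * NANOS_PER_MILLI

def instant_to_unix_millis(i):
    """Unix milliseconds of a non-negative Instant."""
    return i // NANOS_PER_MILLI

def instant_add_dur(inst, dur):
    """Instant + Duration = Instant."""
    return inst + dur

def instant_diff(a, b):
    """Instant - Instant = Duration."""
    return a - b

start = unix_seconds(30)
end = instant_add_dur(start, 60 * NANOS_PER_SECOND)
elapsed = instant_diff(end, start)
```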
### Calendar System
| Function | Description |
|----------|-------------|
| `zone(id)` | IANA zone or fixed offset |
| `zone_utc()` / `zone_local()` | UTC and local zone |
| `zone_offset(hours, minutes)` | fixed offset zone |
| `earth_calendar(z)` | Gregorian calendar in zone |
| `earth_calendar_default()` | system default |
| `mars_calendar()` / `cycle_calendar(period)` | non-Earth calendars |
| `no_cycle_calendar()` / `relative_calendar(epoch)` | abstract calendars |
| `now_in(cal)` | current time as CalendarTime |
| `in_calendar(inst, cal)` | project Instant into Calendar |
| `cal_format(ct, pattern)` | format CalendarTime |
| `cal_to_instant(ct)` | extract underlying Instant |
| `cal_cycle_phase(ct)` / `cal_in(ct, cal)` | calendar ops |
| `local_date(y, m, d)` | construct LocalDate |
| `local_time(h, m, s, ns)` | construct LocalTime |
| `local_datetime(date, time)` | construct LocalDateTime |
| `zoned(date, time, cal)` | zoned datetime |
| `local_date_year/month/day` | LocalDate accessors |
| `local_time_hour/minute/second/nanos` | LocalTime accessors |
| `el_local_date_add_dur` / `el_local_time_add_dur` | date/time arithmetic |
| `el_local_date_lt` / `el_local_date_eq` | date comparison |
| `rhythm_*` | recurrence patterns (cycle_start, weekday, weekly_at, next_after, matches, …) |
### Process / Execution
| Function | Description |
|----------|-------------|
| `args()` | command-line arguments as `[String]` (excludes argv[0]) |
| `env(key)` | read environment variable; `""` if unset |
| `exit(code)` | exit process with code |
| `exit_program(code)` | alias for `exit` |
| `getpid_now()` | current process ID |
| `exec_command(cmd)` | run shell command; return exit code |
| `exec_capture(cmd)` | run shell command; capture and return stdout |
| `uuid_new()` / `uuid_v4()` | generate UUID v4 |
| `native_int_to_str(n)` | format integer (alias, used in compiler source) |
| `native_string_chars(s)` | split string into `[String]` of single characters |
### Crypto
| Function | Description |
|----------|-------------|
| `sha256_hex(input)` | SHA-256, hex output |
| `sha256_bytes(input)` | SHA-256, raw bytes |
| `hmac_sha256_hex(key, msg)` | HMAC-SHA-256, hex |
| `hmac_sha256_bytes(key, msg)` | HMAC-SHA-256, raw bytes |
| `base64_encode(input)` / `base64_decode(input)` | standard base64 |
| `base64url_encode(input)` / `base64url_decode(input)` | URL-safe base64 |
| `sha3_256_hex(input)` | SHA3-256 (Keccak) |
| `pq_keygen_signature()` | Dilithium-3 key pair |
| `pq_sign(sk_hex, msg)` / `pq_verify(pk_hex, msg, sig_hex)` | PQ signatures |
| `pq_kem_keygen()` / `pq_kem_encaps(pk)` / `pq_kem_decaps(sk, ct)` | Kyber-768 KEM |
| `pq_hybrid_keygen()` / `pq_hybrid_handshake(remote_pub)` | X25519 + Kyber hybrid |
| `aead_encrypt(key_hex, plaintext)` | AES-256-GCM encrypt |
| `aead_decrypt(key_hex, nonce_hex, ct_hex)` | AES-256-GCM decrypt |
### DHARMA Network (CGI programs only)
| Function | Description |
|----------|-------------|
| `el_cgi_init(name, dharma_id, principal, network, engram)` | initialize CGI identity (called by generated `main()`) |
| `dharma_connect(cgi_id)` | open channel to peer |
| `dharma_send(channel, content)` | send message; blocks for response |
| `dharma_activate(query)` | spreading activation across DHARMA network |
| `dharma_emit(event_type, payload)` | emit network event (@manager only) |
| `dharma_field(event_type)` | wait for event (@manager only) |
| `dharma_strengthen(cgi_id, weight)` | Hebbian potentiation |
| `dharma_relationship(cgi_id)` | current relationship weight |
| `dharma_peers()` | all connected peers sorted by weight |
### Engram Knowledge Graph
| Function | Description |
|----------|-------------|
| `engram_node(content, type, salience)` | create node; returns ID |
| `engram_node_full(content, type, label, salience, importance, confidence, tier, tags)` | full node creation |
| `engram_node_layered(…, layer_id)` | create node in specific layer |
| `engram_get_node(id)` | retrieve node by ID |
| `engram_strengthen(node_id)` | Hebbian potentiation |
| `engram_forget(node_id)` | delete node and edges |
| `engram_node_count()` | total node count |
| `engram_edge_count()` | total edge count |
| `engram_search(query, limit)` | full-text search |
| `engram_scan_nodes(limit, offset)` | paginated node scan |
| `engram_connect(from, to, weight, relation)` | create directed edge |
| `engram_edge_between(from, to)` | get edge |
| `engram_neighbors(node_id)` | BFS neighbors |
| `engram_neighbors_filtered(node_id, max_depth, direction)` | filtered BFS |
| `engram_activate(query, depth)` | spreading activation |
| `engram_save(path)` / `engram_load(path)` | snapshot to/from disk |
| `engram_add_layer(name, priority, suppressible, transparent, injectable)` | add consciousness layer |
| `engram_remove_layer(layer_id)` / `engram_list_layers()` | layer management |
| `engram_*_json` variants | JSON-string versions of search/scan/activate |
| `engram_compile_layered_json(intent, depth)` | prompt-ready context block |
### LLM (Anthropic API)
| Function | Description |
|----------|-------------|
| `llm_call(model, prompt)` | single-turn call |
| `llm_call_system(model, system, user)` | call with system prompt |
| `llm_call_agentic(model, system, user, tools)` | agentic call with tools (CGI only) |
| `llm_vision(model, system, prompt, image)` | vision call |
| `llm_models()` | list available models |
| `llm_register_tool(name, handler_fn_name)` | register tool handler (CGI only) |
### Observability
| Function | Description |
|----------|-------------|
| `emit_log(level, msg, fields_json)` | emit OTLP log |
| `emit_metric(name, value, tags_json)` | emit OTLP metric |
| `trace_span_start(name)` | start trace span |
| `trace_span_end(span_handle)` | end trace span |
| `emit_event(name, duration_ms)` | emit event |
---
## 4. How to Re-Bootstrap from Zero
This section assumes the bootstrap binary is gone. Everything else (source files, runtime) is intact.
### What You Need to Implement
A minimal El compiler has three parts: lexer, parser, codegen. Each can be written in any language. The goal is to compile `elc-cli.el` into a working `elc` binary, after which El is self-hosting again.
### Step 1: Write a Minimal Lexer
The lexer must produce a list of `{ "kind": String, "value": String }` maps (or equivalent structures). Required token kinds: `Int`, `Float`, `Str`, `Bool`, `Ident`, `Eof`, and all keywords and operators listed in section 2.1.
The minimal subset needed to compile the compiler itself:
- Keywords: `let`, `fn`, `return`, `if`, `else`, `while`, `for`, `in`, `import`, `from`, `true`, `false`, `extern`
- Literals: `Int`, `Str`, `Bool`, `Ident`
- Operators: `=`, `==`, `!=`, `!`, `<`, `>`, `<=`, `>=`, `&&`, `||`, `+`, `-`, `*`, `/`, `->`, `=>`, `:`, `,`, `.`, `(`, `)`, `{`, `}`, `[`, `]`, `@`, `?`
- Special: `Eof`
The lexer in `lexer.el` walks a char array using `native_list_get` to avoid O(n²) string slicing. A Python implementation can use a simple index into a string. Escapes to handle: `\"`, `\n`, `\t`, `\r`, `\\`.
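As a rough illustration of the index-into-a-string approach, here is a minimal Python sketch. The token shape `{"kind", "value"}` comes from the text above; the names `lex`, `KEYWORDS`, `TWO_CHAR_OPS`, `ONE_CHAR_OPS` and `ESCAPES` are hypothetical, and using the keyword or operator text itself as the token kind is an assumption (the real `lexer.el` may use different kind names). Float literals and comment stripping are left out.

```python
# Minimal lexer sketch. Names and token kinds for keywords/operators are
# illustrative assumptions, not the kinds used by lexer.el.
KEYWORDS = {"let", "fn", "return", "if", "else", "while", "for", "in",
            "import", "from", "extern"}
TWO_CHAR_OPS = {"==", "!=", "<=", ">=", "&&", "||", "->", "=>"}
ONE_CHAR_OPS = set("=!<>+-*/:,.(){}[]@?")
ESCAPES = {'"': '"', "n": "\n", "t": "\t", "r": "\r", "\\": "\\"}

def lex(source):
    tokens = []
    i = 0
    n = len(source)
    while i < n:
        c = source[i]
        if c.isspace():
            i += 1
        elif c.isdigit():
            start = i
            while i < n and source[i].isdigit():
                i += 1
            tokens.append({"kind": "Int", "value": source[start:i]})
        elif c.isalpha() or c == "_":
            start = i
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            word = source[start:i]
            if word in ("true", "false"):
                kind = "Bool"
            elif word in KEYWORDS:
                kind = word  # assumption: keyword text doubles as the kind
            else:
                kind = "Ident"
            tokens.append({"kind": kind, "value": word})
        elif c == '"':
            i += 1  # skip opening quote
            chars = []
            while i < n and source[i] != '"':
                if source[i] == "\\" and i + 1 < n:
                    nxt = source[i + 1]
                    chars.append(ESCAPES.get(nxt, nxt))
                    i += 2
                else:
                    chars.append(source[i])
                    i += 1
            i += 1  # skip closing quote
            tokens.append({"kind": "Str", "value": "".join(chars)})
        elif source[i:i + 2] in TWO_CHAR_OPS:
            tokens.append({"kind": source[i:i + 2], "value": source[i:i + 2]})
            i += 2
        elif c in ONE_CHAR_OPS:
            tokens.append({"kind": c, "value": c})
            i += 1
        else:
            raise SyntaxError(f"unexpected character {c!r} at offset {i}")
    tokens.append({"kind": "Eof", "value": ""})
    return tokens
```

Two-character operators are tested before single-character ones so that `==` is not split into two `=` tokens.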
### Step 2: Write a Minimal Parser
The parser is a standard recursive descent parser. It produces AST maps as described in section 2.2.
The minimal statement forms needed to compile the compiler:
- `let name [: Type] = expr`
- `fn name(params) [-> Type] { body }`
- `extern fn name(params) [-> Type]`
- `return expr`
- `while cond { body }`
- `for item in list { body }`
- `if cond { body } [else [if] { body }]`
- `import "path"`
- `from module import { … }`
- `@decorator stmt`
- `name = expr` (bare assignment)
- bare expression statement
The minimal expression forms:
- Integer, float, string, bool literals
- Identifier
- Binary operations with the precedence table from section 2.2
- Unary `!` and `-`
- Function call: `f(a, b, …)`
- Method call: `obj.method(args)` (parsed as Call with Field func)
- Field access: `obj.field`
- Index access: `obj[i]`
- Array literal: `[e1, e2, …]`
- Map literal: `{ "key": value, … }`
- `if` as expression
- `match` expression
- Postfix `?` (can be a no-op)
- Duration literal: `N.unit`
The `__no_block_expr` guard (section 2.4) is important: without it, `if a || b { ... }` will incorrectly parse `{` as a Map literal.
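The mechanics of the guard can be sketched in a few lines of Python. This is a toy recursive descent parser over plain token strings, not the structure of `parser.el`: the class name, method names, and the `no_block_expr` attribute are hypothetical (the real parser stores the flag through `state_set`), and only `||`, identifiers, and the empty map are handled.

```python
# Sketch of the block-expression guard. All names are illustrative.
class Parser:
    def __init__(self, tokens):
        self.tokens = tokens          # list of token strings ending in "Eof"
        self.pos = 0
        self.no_block_expr = False

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        return self.advance()

    def parse_primary(self):
        if self.peek() == "{":
            if self.no_block_expr:
                return {"expr": "Nil"}    # leave "{" for the caller
            return self.parse_map()
        return {"expr": "Ident", "name": self.advance()}

    def parse_map(self):
        self.expect("{")
        self.expect("}")                  # only the empty map, for brevity
        return {"expr": "Map", "entries": []}

    def parse_expr(self):
        left = self.parse_primary()
        while self.peek() == "||":
            self.advance()
            right = self.parse_primary()
            left = {"expr": "BinOp", "op": "||", "left": left, "right": right}
        return left

    def parse_if(self):
        self.expect("if")
        saved = self.no_block_expr
        self.no_block_expr = True         # condition must not swallow "{"
        cond = self.parse_expr()
        self.no_block_expr = saved        # map literals are legal again
        self.expect("{")
        body = []
        while self.peek() != "}":
            body.append(self.parse_expr())
        self.expect("}")
        return {"stmt": "If", "cond": cond, "body": body}
```

Saving and restoring the flag, rather than clearing it unconditionally, keeps the guard correct when conditions nest.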
### Step 3: Write a Minimal Codegen
The codegen emits C11 source. Required output structure:
```c
#include <stdint.h>
#include <stdlib.h>
#include "el_runtime.h"
// Forward declarations for all non-main functions
el_val_t fn_name(el_val_t p1, el_val_t p2);
...
// File-scope let bindings (if any)
el_val_t GLOBAL_NAME;
// Function bodies
el_val_t fn_name(el_val_t p1, el_val_t p2) {
...
return 0;
}
// Entry point
int main(int _argc, char** _argv) {
el_runtime_init_args(_argc, _argv);
...
return 0;
}
```
Critical codegen rules:
1. **All values are `el_val_t`**. Every parameter, local variable, and return type is `el_val_t` unless the function has `ret_type == "Void"` (use `void`).
2. **Let-rebinding**: track declared names per C scope. Emit `el_val_t name = val;` on first occurrence; emit `name = val;` on subsequent occurrences of the same name in the same scope.
3. **`+` dispatch**: if either operand is a string literal → `el_str_concat(a, b)`. If both are provably integers → `(a + b)`. Default fallback → `el_str_concat`.
4. **`==` dispatch**: if both operands are integer literals or provably Int → `(a == b)`. Otherwise (a string operand, or an identifier not known to be Int) → `str_eq(a, b)`.
5. **String literals**: wrap in `EL_STR("…")` and escape: `\"` → `\\\"`, `\n` → `\\n`, `\t` → `\\t`, `\\` → `\\\\`.
6. **Map literals**: `el_map_new(N, "k1", v1, "k2", v2, …)`. Empty map: `el_map_new(0)`.
7. **Array literals**: `el_list_new(N, e1, e2, …)`. Empty: `el_list_empty()`.
8. **Index access**: string-literal index → `el_get_field(obj, EL_STR("key"))`. Integer index → `el_list_get(obj, idx)`.
9. **Field access** `obj.field` → `el_get_field(obj, EL_STR("field"))`.
10. **Method call** `obj.method(args)` → `method(obj, args)`.
11. **`for item in list`** → emit:
```c
{ el_val_t _el_lst = <list>; el_val_t _el_len = el_list_len(_el_lst);
for (el_val_t _el_i = 0; _el_i < _el_len; _el_i++) {
el_val_t item = el_list_get(_el_lst, _el_i);
<body>
}
}
```
12. **`match`** → GCC/Clang statement expression with `goto`:
```c
({ el_val_t _s = <subject>; el_val_t _r = 0;
if (_s == 42) { _r = <arm_body>; goto _done; }
if (str_eq(_s, EL_STR("str"))) { _r = <arm_body>; goto _done; }
{ _r = <wildcard_body>; goto _done; }
_done:; _r; })
```
13. **`if` as expression** → similarly wrapped in a GCC/Clang statement expression.
14. **Implicit return**: if the last statement in a function body is a bare `Expr` (not `If` or `For`), emit it as `return <expr>;` instead of `<expr>;`.
15. **Float literals**: emit as `el_from_float(<value>)`.
16. **Bool literals**: `true` → `1`, `false` → `0`.
17. **`fn main()`**: do not emit as a regular `el_val_t` function. Instead, fold its body into C's `int main()` after any top-level statements.
18. **`extern fn`**: emit only a forward declaration (no body).
19. **Forward declarations**: scan for all `FnDef` nodes before emitting bodies. This enables mutual recursion.
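Two of the rules above (string-literal escaping and forward declarations) are mechanical enough to sketch directly. The Python below is one possible shape for a minimal codegen, not the logic of `codegen.el`. The AST field names `stmt`, `name`, and `params` are assumptions; `ret_type` and the `"Void"` check follow rule 1. Escaping `\r` is an addition not listed in rule 5, included because the lexer accepts that escape.

```python
def c_string_literal(value):
    """Wrap an El string value as EL_STR("...") with C escapes applied."""
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\t":
            out.append("\\t")
        elif ch == "\r":
            out.append("\\r")   # assumption: not listed in rule 5
        else:
            out.append(ch)
    return 'EL_STR("' + "".join(out) + '")'

def forward_declarations(stmts):
    """Emit one C prototype per FnDef so bodies can appear in any order."""
    lines = []
    for stmt in stmts:
        # fn main() is folded into C's int main(), so it gets no prototype.
        if stmt.get("stmt") != "FnDef" or stmt["name"] == "main":
            continue
        ret = "void" if stmt.get("ret_type") == "Void" else "el_val_t"
        params = ", ".join(f"el_val_t {p}" for p in stmt["params"]) or "void"
        lines.append(f"{ret} {stmt['name']}({params});")
    return lines
```

Processing the string one character at a time avoids the double-escaping that chained `replace` calls can introduce when the backslash rule runs after the others.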
### Step 4: Compile the El Compiler
Using your minimal implementation, compile `elc-cli.el` (which imports the entire compiler chain):
```bash
# Your minimal compiler
python3 minimal_elc.py elc-cli.el > elc-new.c
# Build with the runtime
# Libraries go after the sources so linkers that resolve symbols in order can find them
cc -std=c11 -I el-compiler/runtime \
   -o elc-new elc-new.c el-compiler/runtime/el_runtime.c -lcurl -lpthread
```
### Step 5: Verify Self-Hosting
```bash
# Compile elc-cli.el with the new compiler
./elc-new elc-cli.el elc-v2.c
cc -std=c11 -I el-compiler/runtime \
   -o elc-v2 elc-v2.c el-compiler/runtime/el_runtime.c -lcurl -lpthread
# Compile again with the second-generation compiler
./elc-v2 elc-cli.el elc-v3.c
# The outputs should be identical
diff elc-v2.c elc-v3.c
```
A clean diff confirms you have a stable fixed point: the compiler reproduces itself exactly.
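If the check needs to run from a script rather than by eye, the same comparison can be sketched in Python. The helper names `sha256_of` and `is_fixed_point` are hypothetical; the file paths would be the `elc-v2.c` and `elc-v3.c` produced above.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def is_fixed_point(gen2_c, gen3_c):
    """True when two generations of compiler output are byte-identical."""
    return sha256_of(gen2_c) == sha256_of(gen3_c)
```

Recording the digest alongside a snapshot tag also gives a cheap way to detect later drift between generations.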
### Step 6: Replace the Bootstrap Binary
```bash
cp elc-v2 dist/platform/elc
```
You are bootstrapped.
### Minimal El Subset for the Compiler Itself
The El compiler source (`lexer.el`, `parser.el`, `codegen.el`, `compiler.el`) uses:
- `fn`, `let`, `while`, `if`/`else`, `return`, `for`/`in`, `import`
- `extern fn` (for `.elh` headers)
- `String`, `Int`, `Bool`, `Void`, `Any`, `Map<String, Any>`, `[String]`, `[Map<String, Any>]`
- Map literals `{ "key": val }`
- Array literals `[...]` (and `native_list_empty()`)
- List operations: `native_list_empty()`, `native_list_append()`, `native_list_get()`, `native_list_len()`, `native_list_clone()`
- String operations: `str_join()`, `str_eq()`, `str_contains()`, `str_starts_with()`, `str_slice()`, `str_trim()`, `str_split()`, `str_index_of()`, `str_len()`, `str_to_int()`, `native_string_chars()`, `native_int_to_str()`
- `state_get()`, `state_set()`
- `println()`, `fs_read()`, `fs_write()`, `exit()`
- `el_release()` (ARC cleanup)
The compiler does not use: HTTP, engram, dharma, LLM, crypto, UUID, float arithmetic.
---
## 5. The Long-Term Solution: elvm
### Why a VM Makes Bootstrapping More Auditable
The current bootstrap chain relies on trusting a binary whose source we cannot fully audit by inspection alone. This is the classic "trusting trust" problem (Ken Thompson, 1984). A virtual machine breaks the chain:
- `elc` targets `elvm` bytecode (instead of C)
- `elvm` is a minimal interpreter hand-written in ~500 lines of C
- The hand-written C is small enough to audit completely
- Anyone can compile `elvm.c` with any C compiler
- From there: `elvm` interprets `elc.elvm` → `elc` compiles El → `cc` builds native binaries
The benefit: the trusted base shrinks from "a Mach-O binary" to "500 lines of straightforward C code that anyone can read in an afternoon."
### The elvm Design
A minimal elvm needs:
- A stack or register machine (stack is simpler)
- Instructions: push, pop, add, sub, mul, div, cmp, jump, call, return, load, store
- A string table (El strings are mostly literals)
- A heap for ElList and ElMap
- An FFI table mapping El runtime builtins to C functions
The El compiler would gain a `--target=elvm` flag in `compile_dispatch()`. Codegen would emit bytecode instead of C text. The runtime interface stays the same — builtins map to FFI slots by name.
This is the planned path. It does not exist yet.
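Since elvm does not exist, the following Python sketch is purely hypothetical. It only illustrates how small a stack machine with the listed instructions, a string table, and a name-keyed FFI table can be. The `(op, arg)` tuple encoding, the `jump_if_zero` and `push_str` opcodes, and `cmp` meaning "less than" are all assumptions made for the sketch.

```python
def run(program, strings=(), ffi=None):
    """Execute a list of (op, arg) tuples on a tiny stack machine."""
    ffi = ffi or {}
    stack, frames, slots = [], [], {}
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "push_str":
            stack.append(strings[arg])        # index into the string table
        elif op == "pop":
            stack.pop()
        elif op in ("add", "sub", "mul", "div", "cmp"):
            b, a = stack.pop(), stack.pop()
            if op == "add":
                stack.append(a + b)
            elif op == "sub":
                stack.append(a - b)
            elif op == "mul":
                stack.append(a * b)
            elif op == "div":
                stack.append(a // b)          # floor division in this sketch
            else:
                stack.append(1 if a < b else 0)
        elif op == "load":
            stack.append(slots[arg])
        elif op == "store":
            slots[arg] = stack.pop()
        elif op == "jump":
            pc = arg
        elif op == "jump_if_zero":
            if stack.pop() == 0:
                pc = arg
        elif op == "call":
            frames.append(pc)                 # return address
            pc = arg
        elif op == "ffi":
            name, argc = arg
            args = [stack.pop() for _ in range(argc)][::-1]
            stack.append(ffi[name](*args))    # builtin looked up by name
        elif op == "return":
            if not frames:
                break                         # return from top level halts
            pc = frames.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack
```

A real elvm would also need heap objects for lists and maps; here Python values stand in for them.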
---
## 6. Compiler Source Map
| File | Role | Lines |
|------|------|-------|
| `elc-cli.el` | Entry point; imports compiler.el | 7 |
| `el-compiler/src/compiler.el` | Pipeline wiring: lex → parse → codegen. Import resolution, `--emit-header`, `fn main()`. Defines `compile()`, `compile_js()`, `compile_dispatch()`, `resolve_imports()` | 298 |
| `el-compiler/src/lexer.el` | Tokenizer. `lex(source)` → token list. Char helpers, keyword lookup, scan_digits, scan_ident, scan_string, strip_code_comments | 747 |
| `el-compiler/src/parser.el` | Recursive descent parser. `parse(tokens)` → AST. All statement and expression forms | 1071 |
| `el-compiler/src/codegen.el` | C code emitter. `codegen(stmts, source)` → (streams to stdout). Expression codegen, statement codegen, function codegen, type tracking, capability enforcement, temporal type dispatch | 2721 |
| `el-compiler/src/codegen-js.el` | JavaScript backend. `codegen_js(stmts, source)` → JS source | ~500 |
| `el-compiler/runtime/el_runtime.h` | Full runtime API declaration | 755 |
| `el-compiler/runtime/el_runtime.c` | Full runtime implementation | large |
| `el-compiler/runtime/el_runtime.js` | JS runtime | — |
| `elb.el` | Build coordinator. Reads `manifest.el`, walks import graph, compiles modules, links binary. Follows a .NET-style incremental build model | 367 |
| `elc-combined.el` | Pre-merged single-file bootstrap edition (for early bootstrap iterations) | large |
| `spec/language.md` | Language specification v1.2.0 | — |
| `dist/platform/elc` | Current bootstrap binary (Mach-O arm64) | — |
---
## 7. Key Decisions and Gotchas
### `target` is a Reserved Keyword
`target` is lexed as the `Target` token kind. It cannot be used as a variable or parameter name anywhere in El source. If you write `fn compile(target: String)`, the parameter name will be tokenized as `Target`, which the parser does not recognize as an `Ident` in parameter position.
**Workaround:** use `tgt`, `dest`, `backend`, or any other name. The compiler source uses `tgt` specifically for this reason. This comes up whenever writing code that handles compilation targets.
### `let x = x + 1` is Let-Rebinding, Not Mutation
El has no mutable variables. `let count = count + 1` re-introduces `count` into the current scope, shadowing the previous binding. At the C level, the codegen tracks declared names and emits plain assignment for subsequent bindings of the same name:
- First `let count = 0` → `el_val_t count = 0;`
- Second `let count = count + 1` → `count = count + 1;`
This means you cannot have two different values named `count` in the same C scope — the second binding overwrites the first. This is by design. Scoped shadowing works correctly because each block (if body, while body, for body) gets its own copy of the `declared` list.
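The declared-name tracking described here can be sketched in Python as follows. The function names `emit_let` and `emit_block` are hypothetical; only the `declared` list and the two C output forms come from the text.

```python
def emit_let(name, value_c, declared):
    """Return the C line for `let name = value` given names declared so far."""
    if name in declared:
        return f"{name} = {value_c};"          # rebinding: plain assignment
    declared.append(name)
    return f"el_val_t {name} = {value_c};"     # first binding: declaration

def emit_block(lets, declared):
    """Emit a nested block; it works on a copy so new names stay local."""
    inner = list(declared)
    return [emit_let(n, v, inner) for n, v in lets]
```

Because the inner block copies the list, a name first introduced inside an `if` or `while` body is declared again if the outer scope later binds it, which matches C block scoping.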
### Arena is Inactive in CLI Mode
The runtime includes an arena allocator designed for long-running server processes. In CLI mode (`elc`, `elb`) the arena is not activated. Memory is managed by ARC (reference counting via `el_retain`/`el_release`). The compiler source explicitly calls `el_release(tokens)` after parsing and `el_release(stmt)` after codegen to prevent memory exhaustion on large source files.
If you are implementing a new runtime or embedding El, be aware that the ARC model expects callers to release values they are done with.
### The `extern fn` / `.elh` Separate Compilation Model
`elb` (the build coordinator) supports separate compilation. When a module changes:
1. `elc --emit-header module.el module.c` compiles the module and writes `module.elh`
2. `module.elh` contains `extern fn` declarations for all public functions
3. Other modules that import `module.el` use the `.elh` header instead of re-parsing the source
The `resolve_imports` function in `compiler.el` checks for a `.elh` file before recursively inlining the `.el` source. If the header exists, it is used (and the `.el` is marked as seen to prevent double-inclusion).
This is important for bootstrap: if you have pre-compiled headers lying around from a broken build, they may shadow updated source. Delete `.elh` files (or use `elb --clean`) when debugging unexpected compilation behavior.
### Import Resolution: Depth-First with Deduplication
`resolve_imports` in `compiler.el`:
1. Walks imports depth-first (dependencies before dependents)
2. Uses `state_set("__elc_imp__:" + path, "1")` to deduplicate: each file is included exactly once
3. Builds the combined source string by concatenating import bodies ahead of the entry file's body
4. If a `.elh` header exists for an import, uses that instead of recursing into the `.el`
The result is one large string that gets passed through `lex` → `parse` → `codegen` as a single unit. The codegen emits forward declarations for all functions before any body, so declaration order within the combined source does not matter.
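The four steps can be sketched in Python as below. This is a simplification under stated assumptions: `read_file` is a hypothetical callback returning the file contents or `None` when the file is missing, only the `import "path"` form is recognized, paths are used as written, and a `set` replaces the `state_set` keys used by the real `resolve_imports`.

```python
import os

def resolve_imports(path, read_file, seen=None):
    """Return combined source for `path`, dependencies first, each file once."""
    if seen is None:
        seen = set()
    if path in seen:
        return ""
    seen.add(path)
    source = read_file(path)
    parts, body = [], []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith('import "') and stripped.endswith('"'):
            dep = stripped[len('import "'):-1]
            header = os.path.splitext(dep)[0] + ".elh"
            header_src = read_file(header)
            if header_src is not None:
                if dep not in seen:
                    seen.add(dep)              # mark the .el as seen too
                    parts.append(header_src)
            else:
                parts.append(resolve_imports(dep, read_file, seen))
        else:
            body.append(line)
    parts.append("\n".join(body))              # entry body goes last
    return "\n".join(p for p in parts if p)
```

Passing a dictionary's `get` method as `read_file` is enough to exercise the function without touching the filesystem.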
### `+` Operator Dispatch is Heuristic
El's `+` operator serves double duty: integer addition and string concatenation. The codegen dispatches based on static analysis of the AST:
- If either operand is a `Str` literal → `el_str_concat`
- If both operands are provably `Int` (via `is_int_expr`) → `(a + b)`
- If either operand is a `Call` or `Ident` → `el_str_concat` (conservative fallback)
The `is_int_expr` predicate recurses through the AST: literal `Int`, names in `__int_names` (from `: Int` annotations), known Int-returning builtins, and arithmetic BinOps over Int operands all count as "provably Int."
If you write `let result = some_int_var + 1` and `some_int_var` is not annotated `: Int`, the codegen may emit `el_str_concat` instead of integer addition. Fix by adding `: Int` to the variable declaration.
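A Python sketch of the dispatch makes the failure mode concrete. The AST field names (`expr`, `name`, `func`, `op`, `left`, `right`) and the contents of `INT_BUILTINS` are assumptions for illustration; `int_names` stands in for `__int_names`.

```python
# Illustrative subset; the real list of Int-returning builtins is not shown
# in this document.
INT_BUILTINS = {"str_len", "str_to_int", "native_list_len"}

def is_int_expr(expr, int_names):
    """True when the expression is provably Int under the rules above."""
    kind = expr["expr"]
    if kind == "Int":
        return True
    if kind == "Ident":
        return expr["name"] in int_names
    if kind == "Call":
        return expr["func"] in INT_BUILTINS
    if kind == "BinOp" and expr["op"] in ("+", "-", "*", "/"):
        return (is_int_expr(expr["left"], int_names)
                and is_int_expr(expr["right"], int_names))
    return False

def emit_add(left, right, left_c, right_c, int_names):
    """Choose between integer addition and string concatenation for `+`."""
    if left["expr"] == "Str" or right["expr"] == "Str":
        return f"el_str_concat({left_c}, {right_c})"
    if is_int_expr(left, int_names) and is_int_expr(right, int_names):
        return f"({left_c} + {right_c})"
    return f"el_str_concat({left_c}, {right_c})"   # conservative fallback
```

With an empty `int_names`, `x + 1` falls through to `el_str_concat`; once `x` is recorded as Int, the same expression becomes `(x + 1)`.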
### `==` Operator Dispatch is Also Heuristic
Similarly, `==` dispatches between `str_eq(a, b)` (string comparison) and `(a == b)` (integer comparison) based on operand types. The codegen tracks Int-typed names in `__int_names`. Two `Ident` operands where both are known Int-typed use `==`; all other Ident-Ident comparisons use `str_eq`.
This means comparing two integer variables that were not annotated `: Int` can silently produce `str_eq` on what are actually integer values — and `str_eq` treats them as `const char*` pointers, producing incorrect results or segfaults.
**Rule:** always annotate variables `: Int` when they will participate in `==` comparisons or `+` arithmetic.
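The effect of the annotation on `==` can be shown with a small Python sketch. The function name `emit_eq` and the AST field names are hypothetical, and only literals and identifiers are considered here.

```python
def emit_eq(left, right, left_c, right_c, int_names):
    """Choose between integer and string comparison for `==`."""
    def known_int(e):
        if e["expr"] == "Int":
            return True
        return e["expr"] == "Ident" and e["name"] in int_names

    if known_int(left) and known_int(right):
        return f"({left_c} == {right_c})"
    return f"str_eq({left_c}, {right_c})"      # unsafe if both are really Int
```

Comparing `a` and `b` with neither annotated yields `str_eq(a, b)`; with both in `int_names` the output is `(a == b)`.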
### Capability Kind Enforcement
The codegen classifies programs into three capability tiers based on top-level declarations:
- `cgi` block present → full capability (all primitives allowed)
- `service` block present → restricted (no `llm_call_agentic`, `llm_register_tool`, `dharma_emit`, `dharma_field`)
- Neither → `utility` (no DHARMA, no LLM)
Violations are collected during codegen and emitted as `#error` directives at the bottom of the generated C. The downstream `cc` step then fails with a clear message naming the forbidden call.
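The tiering and the `#error` output can be sketched in Python. The restricted-call set for `service` comes from the list above; treating every `dharma_`- or `llm_`-prefixed call as forbidden in `utility` programs, and the wording of the `#error` message, are assumptions for illustration.

```python
RESTRICTED_FOR_SERVICE = {"llm_call_agentic", "llm_register_tool",
                          "dharma_emit", "dharma_field"}

def capability_tier(top_level_kinds):
    """Classify a program from the kinds of its top-level declarations."""
    if "cgi" in top_level_kinds:
        return "cgi"
    if "service" in top_level_kinds:
        return "service"
    return "utility"

def violations(tier, called):
    """Return #error lines for calls the tier does not permit."""
    errors = []
    for name in called:
        if tier == "service":
            forbidden = name in RESTRICTED_FOR_SERVICE
        elif tier == "utility":
            # Assumption: prefix match approximates "no DHARMA, no LLM".
            forbidden = name.startswith(("dharma_", "llm_"))
        else:
            forbidden = False
        if forbidden:
            errors.append(f'#error "{name} is not allowed in a {tier} program"')
    return errors
```

Emitting the lines at the end of the generated C means the C compiler, not `elc`, reports the failure.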
### The `__no_block_expr` Parse Guard
When parsing the condition of `if`, `while`, `for`, and `match`, the parser sets `state_set("__no_block_expr", "1")`. This prevents `parse_primary` from treating a `{` as the start of a Map literal — instead it returns `{ "expr": "Nil" }` and the caller sees the `{` and treats it as the block delimiter.
Without this guard, `if a || b { ... }` would recurse into `parse_expr` for `b`, hit `{`, try to parse it as a Map literal, fail to find string keys, loop in error-recovery mode, and hang.
### Codegen Streams Output via `println`
The codegen does not build the output as a string — it calls `println()` for each line as it is emitted. The `compile()` / `compile_js()` / `codegen()` functions return `""`. Output goes to stdout.
This design avoids O(n²) string concatenation for large programs. It also means you cannot capture the compiler's output in a variable within El itself — you must redirect stdout at the OS level (`elc source.el > output.c`).
When writing to a file, `elc` detects the output path argument, redirects C's `stdout` to the file (via `freopen` in the runtime), and the `println` calls go there instead.
+20
View File
@@ -0,0 +1,20 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>CFBundleDevelopmentRegion</key>
<string>English</string>
<key>CFBundleIdentifier</key>
<string>com.apple.xcode.dsym.elc-asan</string>
<key>CFBundleInfoDictionaryVersion</key>
<string>6.0</string>
<key>CFBundlePackageType</key>
<string>dSYM</string>
<key>CFBundleSignature</key>
<string>????</string>
<key>CFBundleShortVersionString</key>
<string>1.0</string>
<key>CFBundleVersion</key>
<string>1</string>
</dict>
</plist>
Binary file not shown.
@@ -0,0 +1,5 @@
---
triple: 'arm64-apple-darwin'
binary-path: '/Users/will/Development/neuron-technologies/foundation/el/dist/platform/elc-asan'
relocations: []
...
+54 -18
View File
@@ -1469,22 +1469,25 @@ void http_serve(el_val_t port, el_val_t handler) {
 }
 int p = (int)port;
 if (p <= 0 || p > 65535) { fprintf(stderr, "http_serve: invalid port %d\n", p); return; }
-int sock = socket(AF_INET, SOCK_STREAM, 0);
+/* Dual-stack: AF_INET6 with IPV6_V6ONLY=0 accepts both IPv4 and IPv6.
+ * This makes `localhost` work in browsers that resolve it to ::1 first. */
+int sock = socket(AF_INET6, SOCK_STREAM, 0);
 if (sock < 0) { perror("socket"); return; }
-int yes = 1;
+int yes = 1; int no = 0;
 setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
-struct sockaddr_in addr;
+setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY, &no, sizeof(no));
+struct sockaddr_in6 addr;
 memset(&addr, 0, sizeof(addr));
-addr.sin_family = AF_INET;
-addr.sin_addr.s_addr = htonl(INADDR_ANY);
-addr.sin_port = htons((uint16_t)p);
+addr.sin6_family = AF_INET6;
+addr.sin6_addr = in6addr_any;
+addr.sin6_port = htons((uint16_t)p);
 if (bind(sock, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
 perror("bind"); close(sock); return;
 }
 if (listen(sock, 64) < 0) { perror("listen"); close(sock); return; }
-fprintf(stderr, "[http] listening on 0.0.0.0:%d\n", p);
+fprintf(stderr, "[http] listening on [::]:%d (dual-stack)\n", p);
 while (1) {
-struct sockaddr_in cli;
+struct sockaddr_in6 cli;
 socklen_t clen = sizeof(cli);
 int cfd = accept(sock, (struct sockaddr*)&cli, &clen);
 if (cfd < 0) {
@@ -1715,22 +1718,24 @@ void http_serve_v2(el_val_t port, el_val_t handler) {
 fprintf(stderr, "http_serve_v2: invalid port %d\n", p);
 return;
 }
-int sock = socket(AF_INET, SOCK_STREAM, 0);
+/* Dual-stack: same as http_serve - AF_INET6 + IPV6_V6ONLY=0. */
+int sock = socket(AF_INET6, SOCK_STREAM, 0);
 if (sock < 0) { perror("socket"); return; }
-int yes = 1;
+int yes = 1; int no = 0;
 setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
-struct sockaddr_in addr;
+setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY, &no, sizeof(no));
+struct sockaddr_in6 addr;
 memset(&addr, 0, sizeof(addr));
-addr.sin_family = AF_INET;
-addr.sin_addr.s_addr = htonl(INADDR_ANY);
-addr.sin_port = htons((uint16_t)p);
+addr.sin6_family = AF_INET6;
+addr.sin6_addr = in6addr_any;
+addr.sin6_port = htons((uint16_t)p);
 if (bind(sock, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
 perror("bind"); close(sock); return;
 }
 if (listen(sock, 64) < 0) { perror("listen"); close(sock); return; }
-fprintf(stderr, "[http v2] listening on 0.0.0.0:%d\n", p);
+fprintf(stderr, "[http v2] listening on [::]:%d (dual-stack)\n", p);
 while (1) {
-struct sockaddr_in cli;
+struct sockaddr_in6 cli;
 socklen_t clen = sizeof(cli);
 int cfd = accept(sock, (struct sockaddr*)&cli, &clen);
 if (cfd < 0) {
@@ -1848,6 +1853,29 @@ el_val_t fs_write_bytes(el_val_t pathv, el_val_t bytesv, el_val_t lengthv) {
return 1;
}
// exec_command — run a shell command, return exit code (0 = success).
// Used by elb and other El tooling to invoke subprocesses.
el_val_t exec_command(el_val_t cmdv) {
const char* cmd = EL_CSTR(cmdv);
if (!cmd) return (el_val_t)(int64_t)-1;
int ret = system(cmd);
return (el_val_t)(int64_t)ret;
}
// exec_capture — run a shell command, capture stdout, return as String.
// Returns "" on failure.
el_val_t exec_capture(el_val_t cmdv) {
const char* cmd = EL_CSTR(cmdv);
if (!cmd) return el_wrap_str(el_strdup(""));
FILE* f = popen(cmd, "r");
if (!f) return el_wrap_str(el_strdup(""));
JsonBuf b; jb_init(&b);
char buf[4096];
while (fgets(buf, sizeof(buf), f)) jb_puts(&b, buf);
pclose(f);
return el_wrap_str(b.buf);
}
el_val_t fs_list(el_val_t pathv) {
const char* path = EL_CSTR(pathv);
el_val_t lst = el_list_empty();
@@ -2911,8 +2939,13 @@ static int looks_like_string(el_val_t v) {
 const unsigned char* s = (const unsigned char*)p;
 for (int i = 0; i < 16; i++) {
 unsigned char c = s[i];
-if (c == '\0') return i > 0; /* terminated string */
-if (c < 0x09 || (c > 0x0d && c < 0x20) || c >= 0x7f) return 0;
+if (c == '\0') return 1; /* terminated string (empty string is still a valid string) */
+/* Reject C0 control chars (non-whitespace), allow UTF-8 high bytes.
+ * 0x09-0x0d = tab/newline/cr/vt/ff (whitespace, OK)
+ * 0x20-0x7e = printable ASCII (OK)
+ * 0x7f = DEL (reject)
+ * 0x80-0xff = UTF-8 continuation/lead bytes (OK for multi-byte chars) */
+if (c < 0x09 || (c > 0x0d && c < 0x20) || c == 0x7f) return 0;
 }
return 1; /* 16+ printable bytes — call it a string */
}
@@ -3094,6 +3127,9 @@ el_val_t json_get_raw(el_val_t json_str, el_val_t key) {
const char* json = EL_CSTR(json_str);
const char* k = EL_CSTR(key);
const char* p = json_find_key(json, k);
/* Clear fs_read binary-length hint — result is a fresh null-terminated
* string, not the raw file bytes, so Content-Length must use strlen. */
_tl_fs_read_len = 0;
if (!p) return el_wrap_str(el_strdup(""));
const char* end = json_skip_value(p);
size_t n = (size_t)(end - p);
+4
View File
@@ -739,6 +739,10 @@ el_val_t map_set(el_val_t map, el_val_t key, el_val_t value); /* el_map_set */
/* See bottom of el_runtime.c for the implementation.
* Configured by env vars OTLP_ENDPOINT, OTEL_SERVICE_NAME, OTEL_SERVICE_VERSION.
* No-op when OTLP_ENDPOINT is unset. Drop-on-failure semantics. */
/* ── Subprocess execution ────────────────────────────────────────────────── */
el_val_t exec_command(el_val_t cmd); /* run shell command, return exit code */
el_val_t exec_capture(el_val_t cmd); /* run shell command, capture stdout */
el_val_t emit_log(el_val_t level, el_val_t msg, el_val_t fields_json);
el_val_t emit_metric(el_val_t name, el_val_t value, el_val_t tags_json);
el_val_t trace_span_start(el_val_t name);
+679
View File
@@ -0,0 +1,679 @@
/*
* el_runtime.js: El language JS runtime.
*
* The browser/Node analog of el_runtime.c. Compiled-from-El JS source
* imports this file once; it side-effects globalThis.__el with every
* builtin, so generated programs can destructure the names they need
* (see codegen-js.el's preamble).
*
* Value model:
* El's tagged el_val_t collapses into JS native types here:
* String -> string
* Int -> number (caveat: only 53 bits of integer precision)
* Float -> number (already a double)
* Bool -> boolean
* [T] -> Array
* Map<,> -> plain object
* Void -> undefined
* null -> null
*
* Runtime mode auto-detection:
* typeof window === 'undefined' -> Node mode
* otherwise -> Browser mode
*
* See spec/codegen-js.md for the full design rationale.
*/
const IS_NODE = typeof window === 'undefined' && typeof process !== 'undefined' && process.versions != null && process.versions.node != null;
// ── I/O ─────────────────────────────────────────────────────────────────────
function println(s) {
if (IS_NODE) {
process.stdout.write(String(s) + '\n');
} else {
console.log(String(s));
}
}
function print(s) {
if (IS_NODE) {
process.stdout.write(String(s));
} else {
// Browser has no stdout — fall back to console with no newline group
console.log(String(s));
}
}
// ── String builtins ─────────────────────────────────────────────────────────
// Coerce both args to string and concat. Mirrors el_str_concat in C;
// the C version handles both string-and-string and string-and-int.
function el_str_concat(a, b) {
return String(a) + String(b);
}
function str_concat(a, b) { return el_str_concat(a, b); }
// Strict equality with string coercion. Matches str_eq() in C — which
// strcmp's the underlying char*. Here we just === after coercion.
function str_eq(a, b) {
if (a === null || b === null) return a === b;
return String(a) === String(b);
}
function str_starts_with(s, prefix) {
return String(s).startsWith(String(prefix));
}
function str_ends_with(s, suffix) {
return String(s).endsWith(String(suffix));
}
function str_len(s) {
return String(s).length;
}
function int_to_str(n) {
return String(n);
}
function str_to_int(s) {
const n = parseInt(String(s), 10);
return Number.isNaN(n) ? 0 : n;
}
function str_slice(s, start, end) {
return String(s).slice(start, end);
}
function str_contains(s, sub) {
return String(s).indexOf(String(sub)) >= 0;
}
function str_replace(s, from, to) {
// Replace ALL occurrences (matches C runtime semantics).
return String(s).split(String(from)).join(String(to));
}
function str_to_upper(s) { return String(s).toUpperCase(); }
function str_to_lower(s) { return String(s).toLowerCase(); }
function str_upper(s) { return String(s).toUpperCase(); }
function str_lower(s) { return String(s).toLowerCase(); }
function str_trim(s) { return String(s).trim(); }
function str_index_of(s, sub) {
return String(s).indexOf(String(sub));
}
function str_split(s, sep) {
return String(s).split(String(sep));
}
function str_char_at(s, i) {
return String(s).charAt(i);
}
function str_char_code(s, i) {
const c = String(s).charCodeAt(i);
return Number.isNaN(c) ? 0 : c;
}
function str_pad_left(s, width, pad) {
return String(s).padStart(width, String(pad));
}
function str_pad_right(s, width, pad) {
return String(s).padEnd(width, String(pad));
}
// ── Math ────────────────────────────────────────────────────────────────────
function el_abs(n) { return Math.abs(n); }
function el_max(a, b) { return a > b ? a : b; }
function el_min(a, b) { return a < b ? a : b; }
// ── Refcount (no-op — JS has GC) ────────────────────────────────────────────
function el_retain(_v) { /* no-op */ }
function el_release(_v) { /* no-op */ }
// ── List ────────────────────────────────────────────────────────────────────
// Variadic constructor matching el_list_new(count, items...). Exposed so
// codegen-js can emit the same call shape if we ever want it (currently
// codegen-js emits JS array literals directly).
function el_list_new(_count, ...items) {
return items.slice(0);
}
function el_list_empty() { return []; }
function el_list_clone(list) { return Array.isArray(list) ? list.slice() : []; }
function el_list_len(list) { return Array.isArray(list) ? list.length : 0; }
function el_list_get(list, index) {
if (!Array.isArray(list)) return null;
if (index < 0 || index >= list.length) return null;
return list[index];
}
function el_list_append(list, elem) {
if (!Array.isArray(list)) return [elem];
const out = list.slice();
out.push(elem);
return out;
}
function list_push(list, elem) { return el_list_append(list, elem); }
function list_push_front(list, elem) {
if (!Array.isArray(list)) return [elem];
return [elem, ...list];
}
function list_join(list, sep) {
if (!Array.isArray(list)) return '';
return list.map(String).join(String(sep));
}
function list_range(start, end) {
const out = [];
for (let i = start; i < end; i++) out.push(i);
return out;
}
// ── Map ─────────────────────────────────────────────────────────────────────
// Variadic constructor (key, val, key, val, ...).
function el_map_new(_pairCount, ...kvs) {
const out = {};
for (let i = 0; i < kvs.length; i += 2) {
out[String(kvs[i])] = kvs[i + 1];
}
return out;
}
function el_get_field(map, key) {
if (map === null || map === undefined) return null;
if (typeof map !== 'object') return null;
const k = String(key);
if (Object.prototype.hasOwnProperty.call(map, k)) return map[k];
return null;
}
function el_map_get(map, key) { return el_get_field(map, key); }
function el_map_set(map, key, value) {
// Match the C runtime: shallow-copy + set, persistent semantics.
const out = (map && typeof map === 'object') ? { ...map } : {};
out[String(key)] = value;
return out;
}
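// Example (illustrative): the input map is left untouched.
//   const a = el_map_new(1, 'x', 1);   // { x: 1 }
//   const b = el_map_set(a, 'y', 2);   // { x: 1, y: 2 }, while a is still { x: 1 }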
// ── Method-call shorthand aliases ──────────────────────────────────────────
// `obj.method(args)` compiles to `method(obj, args)` per El convention.
function append(list, elem) { return el_list_append(list, elem); }
function len(v) {
if (Array.isArray(v)) return v.length;
if (typeof v === 'string') return v.length;
if (v && typeof v === 'object') return Object.keys(v).length;
return 0;
}
function get(list, index) { return el_list_get(list, index); }
function map_get(m, k) { return el_get_field(m, k); }
function map_set(m, k, v) { return el_map_set(m, k, v); }
// ── Native VM aliases ──────────────────────────────────────────────────────
function native_list_get(list, index) { return el_list_get(list, index); }
function native_list_len(list) { return el_list_len(list); }
function native_list_append(list, elem) { return el_list_append(list, elem); }
function native_list_empty() { return []; }
function native_list_clone(list) { return el_list_clone(list); }
function native_string_chars(s) { return String(s).split(''); }
function native_int_to_str(n) { return String(n); }
// ── HTTP ───────────────────────────────────────────────────────────────────
//
// fetch() is async. These return Promise<string>. Generated El code does
// not yet emit await — that's the async-taint pass (see spec §5). For
// programs that don't touch HTTP this is fine; for programs that do,
// the value will appear as "[object Promise]" until the taint pass lands.
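//
// Until then, hand-written JS callers can consume the promise directly
// (illustrative; the URL is a placeholder):
//   http_get('https://example.com/data').then(body => println(body));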
function http_get(url) {
if (typeof fetch === 'undefined') {
throw new Error('http_get: fetch() not available in this runtime');
}
return fetch(String(url)).then(r => r.text());
}
function http_post(url, body) {
if (typeof fetch === 'undefined') {
throw new Error('http_post: fetch() not available in this runtime');
}
return fetch(String(url), { method: 'POST', body: String(body) }).then(r => r.text());
}
function http_post_json(url, jsonBody) {
if (typeof fetch === 'undefined') {
throw new Error('http_post_json: fetch() not available in this runtime');
}
return fetch(String(url), {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: String(jsonBody),
}).then(r => r.text());
}
function http_get_with_headers(url, headersMap) {
if (typeof fetch === 'undefined') {
throw new Error('http_get_with_headers: fetch() not available');
}
return fetch(String(url), { headers: headersMap || {} }).then(r => r.text());
}
function http_post_with_headers(url, body, headersMap) {
if (typeof fetch === 'undefined') {
throw new Error('http_post_with_headers: fetch() not available');
}
return fetch(String(url), {
method: 'POST',
headers: headersMap || {},
body: String(body),
}).then(r => r.text());
}
function http_serve(_port, _handler) {
throw new Error('http_serve: not supported in JS target — needs server-side runtime mode');
}
function http_set_handler(_name) {
throw new Error('http_set_handler: not supported in JS target');
}
// ── Filesystem (Node-only) ─────────────────────────────────────────────────
function _ensureNode(name) {
if (!IS_NODE) {
throw new Error(`${name}: not supported in browser runtime`);
}
}
function fs_read(path) {
_ensureNode('fs_read');
const fs = require('node:fs');
try {
return fs.readFileSync(String(path), 'utf8');
} catch (_e) {
return '';
}
}
function fs_write(path, content) {
_ensureNode('fs_write');
const fs = require('node:fs');
try {
fs.writeFileSync(String(path), String(content));
return true;
} catch (_e) {
return false;
}
}
function fs_list(path) {
_ensureNode('fs_list');
const fs = require('node:fs');
try {
return fs.readdirSync(String(path));
} catch (_e) {
return [];
}
}
// ── JSON ───────────────────────────────────────────────────────────────────
function json_parse(s) {
try { return JSON.parse(String(s)); }
catch (_e) { return null; }
}
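function json_stringify(v) {
try {
// JSON.stringify returns undefined (not a string) for undefined and functions.
const out = JSON.stringify(v);
return out === undefined ? '' : out;
}
catch (_e) { return ''; }
}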
function json_get(jsonStr, key) {
const o = json_parse(jsonStr);
if (o === null) return null;
return el_get_field(o, key);
}
function json_get_string(jsonStr, key) {
const v = json_get(jsonStr, key);
return v === null ? '' : String(v);
}
function json_get_int(jsonStr, key) {
const v = json_get(jsonStr, key);
if (typeof v === 'number') return Math.trunc(v);
if (typeof v === 'string') return str_to_int(v);
return 0;
}
function json_get_float(jsonStr, key) {
const v = json_get(jsonStr, key);
return typeof v === 'number' ? v : 0;
}
function json_get_bool(jsonStr, key) {
const v = json_get(jsonStr, key);
return v === true;
}
function json_get_raw(jsonStr, key) {
const v = json_get(jsonStr, key);
return v === null ? '' : json_stringify(v);
}
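function json_set(jsonStr, key, value) {
const parsed = json_parse(jsonStr);
// Start from {} unless the input parsed to an object or array; assigning a
// property on a primitive throws in strict (ES module) code.
const o = (parsed !== null && typeof parsed === 'object') ? parsed : {};
o[String(key)] = value;
return json_stringify(o);
}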
function json_array_len(jsonStr) {
const o = json_parse(jsonStr);
return Array.isArray(o) ? o.length : 0;
}
// ── Time ───────────────────────────────────────────────────────────────────
function time_now() {
return Math.floor(Date.now() / 1000);
}
function time_now_utc() {
// In the C runtime this returns nanoseconds since epoch. JS number
// can't represent that range past ~2^53. We return milliseconds — a
// safe range — and document the divergence.
return Date.now();
}
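// Illustrative sketch, not part of the upstream runtime: if nanosecond parity
// with the C runtime is ever needed, a BigInt can carry the full range that a
// JS number cannot. The name _example_time_now_utc_ns is hypothetical, and the
// result still has only millisecond resolution because it is derived from
// Date.now(), so the low six decimal digits are always zero.
function _example_time_now_utc_ns() {
  return BigInt(Date.now()) * 1000000n;
}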
function sleep_secs(secs) {
if (!IS_NODE) {
throw new Error('sleep_secs: blocking sleep not supported in browser');
}
// Simple sync sleep via Atomics.wait on a SharedArrayBuffer-backed Int32.
const sab = new SharedArrayBuffer(4);
const i32 = new Int32Array(sab);
Atomics.wait(i32, 0, 0, Math.floor(secs * 1000));
return secs;
}
function sleep_ms(ms) {
if (!IS_NODE) {
throw new Error('sleep_ms: blocking sleep not supported in browser');
}
const sab = new SharedArrayBuffer(4);
const i32 = new Int32Array(sab);
Atomics.wait(i32, 0, 0, Math.floor(ms));
return ms;
}
// ── Bool ───────────────────────────────────────────────────────────────────
function bool_to_str(b) { return b ? 'true' : 'false'; }
// ── Process ────────────────────────────────────────────────────────────────
function exit_program(code) {
if (IS_NODE) {
process.exit(code | 0);
} else {
throw new Error(`exit_program(${code}) called in browser`);
}
}
// ── args() ─────────────────────────────────────────────────────────────────
function args() {
if (IS_NODE) {
// process.argv is [node, script, ...args] — slice off node + script.
return process.argv.slice(2);
}
return [];
}
// ── env ────────────────────────────────────────────────────────────────────
function env(key) {
if (IS_NODE) {
const v = process.env[String(key)];
return v === undefined ? null : v;
}
return null;
}
// ── In-process state K/V ───────────────────────────────────────────────────
const _stateMap = new Map();
function state_set(key, value) {
_stateMap.set(String(key), value);
return value;
}
function state_get(key) {
const v = _stateMap.get(String(key));
return v === undefined ? '' : v;
}
function state_del(key) {
return _stateMap.delete(String(key));
}
function state_keys() {
return Array.from(_stateMap.keys());
}
// ── UUID ───────────────────────────────────────────────────────────────────
function uuid_v4() {
// RFC 4122-ish — uses crypto when available, falls back to Math.random.
if (typeof crypto !== 'undefined' && crypto.randomUUID) {
return crypto.randomUUID();
}
return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, c => {
const r = Math.random() * 16 | 0;
const v = c === 'x' ? r : (r & 0x3 | 0x8);
return v.toString(16);
});
}
function uuid_new() { return uuid_v4(); }
// ── Float formatting ───────────────────────────────────────────────────────
function float_to_str(f) { return String(f); }
function int_to_float(n) { return n; }
function float_to_int(f) { return Math.trunc(f); }
function format_float(f, decimals) {
return Number(f).toFixed(decimals);
}
function decimal_round(f, decimals) {
const m = Math.pow(10, decimals);
return Math.round(f * m) / m;
}
function str_to_float(s) {
const n = parseFloat(String(s));
return Number.isNaN(n) ? 0 : n;
}
// ── Math (Float-aware) ─────────────────────────────────────────────────────
function math_sqrt(f) { return Math.sqrt(f); }
function math_log(f) { return Math.log10(f); }
function math_ln(f) { return Math.log(f); }
function math_sin(f) { return Math.sin(f); }
function math_cos(f) { return Math.cos(f); }
function math_pi() { return Math.PI; }
// ── Stubs for not-yet-supported features ───────────────────────────────────
//
// These compile but throw when called. See spec/codegen-js.md §7.
function _notSupported(name) {
return () => { throw new Error(`${name}: not supported in JS target — needs server-side delegation`); };
}
// CGI identity
function el_cgi_init(_name, _did, _principal, _network, _engram) {
// No-op — UI code is not a CGI principal. See spec §7.
}
// DHARMA — all stubbed.
const dharma_connect = _notSupported('dharma_connect');
const dharma_send = _notSupported('dharma_send');
const dharma_activate = _notSupported('dharma_activate');
const dharma_emit = _notSupported('dharma_emit');
const dharma_field = _notSupported('dharma_field');
const dharma_strengthen = _notSupported('dharma_strengthen');
const dharma_relationship = _notSupported('dharma_relationship');
const dharma_peers = _notSupported('dharma_peers');
// Engram — stubbed (could be ported to in-browser later).
const engram_node = _notSupported('engram_node');
const engram_node_full = _notSupported('engram_node_full');
const engram_get_node = _notSupported('engram_get_node');
const engram_strengthen = _notSupported('engram_strengthen');
const engram_forget = _notSupported('engram_forget');
const engram_node_count = _notSupported('engram_node_count');
const engram_search = _notSupported('engram_search');
const engram_scan_nodes = _notSupported('engram_scan_nodes');
const engram_connect = _notSupported('engram_connect');
const engram_edge_between = _notSupported('engram_edge_between');
const engram_neighbors = _notSupported('engram_neighbors');
const engram_neighbors_filtered = _notSupported('engram_neighbors_filtered');
const engram_edge_count = _notSupported('engram_edge_count');
const engram_activate = _notSupported('engram_activate');
const engram_save = _notSupported('engram_save');
const engram_load = _notSupported('engram_load');
// LLM — stubbed (browser cannot hold API keys safely).
const llm_call = _notSupported('llm_call');
const llm_call_system = _notSupported('llm_call_system');
const llm_call_agentic = _notSupported('llm_call_agentic');
const llm_vision = _notSupported('llm_vision');
const llm_models = _notSupported('llm_models');
const llm_register_tool = _notSupported('llm_register_tool');
// Crypto — stubbed; could be backed by SubtleCrypto later.
const sha256_hex = _notSupported('sha256_hex');
const sha256_bytes = _notSupported('sha256_bytes');
const hmac_sha256_hex = _notSupported('hmac_sha256_hex');
const hmac_sha256_bytes = _notSupported('hmac_sha256_bytes');
const base64_encode = _notSupported('base64_encode');
const base64_decode = _notSupported('base64_decode');
const base64url_encode = _notSupported('base64url_encode');
const base64url_decode = _notSupported('base64url_decode');
// ── Export to globalThis.__el ──────────────────────────────────────────────
//
// Generated programs destructure off this object. Keeping it on globalThis
// means a single `import "./el_runtime.js"` is enough; no per-call namespace
// prefix is required at codegen time.
const __el = {
// I/O
println, print,
// String
el_str_concat, str_concat, str_eq, str_starts_with, str_ends_with,
str_len, int_to_str, str_to_int, str_slice, str_contains, str_replace,
str_to_upper, str_to_lower, str_trim, str_index_of, str_split, str_char_at,
str_char_code, str_lower, str_upper, str_pad_left, str_pad_right,
// Math
el_abs, el_max, el_min,
// Refcount
el_retain, el_release,
// List
el_list_new, el_list_empty, el_list_clone, el_list_len, el_list_get,
el_list_append, list_push, list_push_front, list_join, list_range,
// Map
el_map_new, el_get_field, el_map_get, el_map_set,
// Method-call shortforms
append, len, get, map_get, map_set,
// Native VM aliases
native_list_get, native_list_len, native_list_append, native_list_empty,
native_list_clone, native_string_chars, native_int_to_str,
// HTTP
http_get, http_post, http_post_json, http_get_with_headers,
http_post_with_headers, http_serve, http_set_handler,
// FS
fs_read, fs_write, fs_list,
// JSON
json_parse, json_stringify, json_get, json_get_string, json_get_int,
json_get_float, json_get_bool, json_get_raw, json_set, json_array_len,
// Time
time_now, time_now_utc, sleep_secs, sleep_ms,
// Bool
bool_to_str,
// Process
exit_program,
// Args / env
args, env,
// State
state_set, state_get, state_del, state_keys,
// UUID
uuid_v4, uuid_new,
// Float / math
float_to_str, int_to_float, float_to_int, format_float, decimal_round,
str_to_float, math_sqrt, math_log, math_ln, math_sin, math_cos, math_pi,
// CGI / DHARMA / Engram / LLM (stubs)
el_cgi_init,
dharma_connect, dharma_send, dharma_activate, dharma_emit, dharma_field,
dharma_strengthen, dharma_relationship, dharma_peers,
engram_node, engram_node_full, engram_get_node, engram_strengthen,
engram_forget, engram_node_count, engram_search, engram_scan_nodes,
engram_connect, engram_edge_between, engram_neighbors,
engram_neighbors_filtered, engram_edge_count, engram_activate,
engram_save, engram_load,
llm_call, llm_call_system, llm_call_agentic, llm_vision,
llm_models, llm_register_tool,
// Crypto (stubs)
sha256_hex, sha256_bytes, hmac_sha256_hex, hmac_sha256_bytes,
base64_encode, base64_decode, base64url_encode, base64url_decode,
};
globalThis.__el = __el;
// Also re-export a subset as named ES module exports for consumers that prefer that style.
export { __el as default };
export {
println, print,
el_str_concat, str_concat, str_eq, str_starts_with, str_ends_with,
str_len, int_to_str, str_to_int, str_slice, str_contains, str_replace,
str_to_upper, str_to_lower, str_trim, str_index_of, str_split, str_char_at,
str_char_code, str_lower, str_upper,
el_abs, el_max, el_min,
el_retain, el_release,
el_list_new, el_list_empty, el_list_clone, el_list_len, el_list_get,
el_list_append, list_push, list_push_front, list_join, list_range,
el_map_new, el_get_field, el_map_get, el_map_set,
append, len, get, map_get, map_set,
native_list_get, native_list_len, native_list_append, native_list_empty,
native_list_clone, native_string_chars, native_int_to_str,
http_get, http_post, http_post_json,
fs_read, fs_write, fs_list,
json_parse, json_stringify, json_get, json_get_string, json_get_int,
time_now, time_now_utc, sleep_ms,
bool_to_str, exit_program, args, env,
state_set, state_get, state_del, state_keys,
el_cgi_init,
dharma_connect, dharma_send, dharma_activate, dharma_emit, dharma_field,
engram_node, engram_search, engram_activate,
llm_call, llm_call_system,
};
@@ -0,0 +1,926 @@
// codegen-js.el: El compiler JavaScript source code generator
//
// Input: list of AST statement maps (from parser.el)
// Output: JavaScript source printed to stdout (streamed, one line at a time)
//
// Each El program compiles to a single .js file that imports el_runtime.js
// (which side-effects globals so call sites stay flat: println(x), not
// el.println(x)). Functions map to JS function declarations; top-level
// statements run at module load.
//
// Entry point: fn codegen_js(stmts: [Map<String, Any>], source: String) -> String
// Returns ""; output goes to stdout via println().
//
// This file mirrors codegen.el (the C backend). Where the C backend has to
// fight the int64_t-everywhere convention to dispatch arithmetic vs concat
// or `==` vs `str_eq`, the JS backend can usually let JS's own operator
// semantics do the right thing. We retain the dispatch logic for clarity
// and so that explicit calls to `el_str_concat` or `str_eq` still work.
// String helpers
// Escape a string for use inside a double-quoted JS string literal. Handles
// double quotes, backslashes, \n, \r, and \t; other characters pass through.
fn js_escape(s: String) -> String {
let chars: [String] = native_string_chars(s)
let total: Int = native_list_len(chars)
let parts: [String] = native_list_empty()
let i = 0
while i < total {
let ch: String = native_list_get(chars, i)
if ch == "\"" {
let parts = native_list_append(parts, "\\\"")
} else {
if ch == "\\" {
let parts = native_list_append(parts, "\\\\")
} else {
if ch == "\n" {
let parts = native_list_append(parts, "\\n")
} else {
if ch == "\r" {
let parts = native_list_append(parts, "\\r")
} else {
if ch == "\t" {
let parts = native_list_append(parts, "\\t")
} else {
let parts = native_list_append(parts, ch)
}
}
}
}
}
let i = i + 1
}
str_join(parts, "")
}
fn js_str_lit(s: String) -> String {
"\"" + js_escape(s) + "\""
}
// Code emission
fn js_emit_line(line: String) -> Void {
println(line)
}
fn js_emit_blank() -> Void {
println("")
}
// Operator helpers
fn js_binop(op: String) -> String {
if op == "Plus" { return "+" }
if op == "Minus" { return "-" }
if op == "Star" { return "*" }
if op == "Slash" { return "/" }
if op == "Percent" { return "%" }
if op == "EqEq" { return "===" }
if op == "NotEq" { return "!==" }
if op == "Lt" { return "<" }
if op == "Gt" { return ">" }
if op == "LtEq" { return "<=" }
if op == "GtEq" { return ">=" }
if op == "And" { return "&&" }
if op == "Or" { return "||" }
op
}
// Int-name tracking (mirrors codegen.el)
fn js_is_int_name(name: String) -> Bool {
let csv: String = state_get("__js_int_names")
if str_eq(csv, "") { return false }
return str_contains(csv, "," + name + ",")
}
fn js_add_int_name(name: String) -> Bool {
let csv: String = state_get("__js_int_names")
if str_eq(csv, "") { csv = "," }
let key: String = "," + name + ","
if str_contains(csv, key) { return true }
state_set("__js_int_names", csv + name + ",")
return true
}
fn js_build_int_names_for_params(params: [Map<String, Any>]) -> Bool {
state_set("__js_int_names", ",")
let np: Int = native_list_len(params)
let pi = 0
while pi < np {
let param = native_list_get(params, pi)
let pname: String = param["name"]
let ptype: String = param["type"]
if str_eq(ptype, "Int") {
js_add_int_name(pname)
}
let pi = pi + 1
}
return true
}
fn js_is_int_call(call_expr: Map<String, Any>) -> Bool {
let func = call_expr["func"]
let fk: String = func["expr"]
if !str_eq(fk, "Ident") { return false }
let name: String = func["name"]
if str_eq(name, "str_len") { return true }
if str_eq(name, "str_index_of") { return true }
if str_eq(name, "str_to_int") { return true }
if str_eq(name, "str_char_code") { return true }
if str_eq(name, "native_list_len") { return true }
if str_eq(name, "el_list_len") { return true }
if str_eq(name, "len") { return true }
if str_eq(name, "json_get_int") { return true }
if str_eq(name, "time_now") { return true }
if str_eq(name, "time_now_utc") { return true }
if str_eq(name, "el_abs") { return true }
if str_eq(name, "el_max") { return true }
if str_eq(name, "el_min") { return true }
return false
}
// Expression codegen
//
// js_cg_expr returns a JS expression string (not a statement).
//
// Note: the C backend's `+` dispatch is preserved here for two reasons:
// 1) Generated output stays grep-equivalent across targets
// 2) Explicit `el_str_concat()` lives in the runtime; codegen routes
// through it for ambiguous (Ident+Ident, Call+Call) cases. JS's
// own `+` would also work, but el_str_concat coerces both sides
// to strings, which is closer to the C semantics.
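//
// Illustrative lowerings (El source on the left, emitted JS on the right;
// the variable names are hypothetical):
//   "n = " + n   ->  el_str_concat("n = ", n)   (either side is a Str literal)
//   i + 1        ->  (i + 1)                     (either side is an Int literal)
//   a + b        ->  (a + b)                     (both names tracked as Int)
//   a + b        ->  el_str_concat(a, b)         (otherwise)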
fn js_cg_expr(expr: Map<String, Any>) -> String {
let kind: String = expr["expr"]
if kind == "Int" {
let v: String = expr["value"]
return v
}
// DurationLit: postfix-literal time value (e.g. 30.seconds, 1.hour).
// The JS backend lowers to an integer nanosecond expression of the form
// (count * ns_per_unit). The C backend uses the typed wrapper
// el_duration_from_nanos to make intent explicit at the runtime boundary;
// JS has no equivalent shim yet, so we lower directly. A future Phase 2 JS
// time runtime can route through a wrapper once added. Unrecognized units
// fall back to a multiplier of 1.
if kind == "DurationLit" {
let count: String = expr["count"]
let unit: String = expr["unit"]
let mult_ns = "1"
if str_eq(unit, "nano") { let mult_ns = "1" }
if str_eq(unit, "nanos") { let mult_ns = "1" }
if str_eq(unit, "milli") { let mult_ns = "1000000" }
if str_eq(unit, "millis") { let mult_ns = "1000000" }
if str_eq(unit, "millisecond") { let mult_ns = "1000000" }
if str_eq(unit, "milliseconds") { let mult_ns = "1000000" }
if str_eq(unit, "second") { let mult_ns = "1000000000" }
if str_eq(unit, "seconds") { let mult_ns = "1000000000" }
if str_eq(unit, "minute") { let mult_ns = "60000000000" }
if str_eq(unit, "minutes") { let mult_ns = "60000000000" }
if str_eq(unit, "hour") { let mult_ns = "3600000000000" }
if str_eq(unit, "hours") { let mult_ns = "3600000000000" }
if str_eq(unit, "day") { let mult_ns = "86400000000000" }
if str_eq(unit, "days") { let mult_ns = "86400000000000" }
return "(" + count + " * " + mult_ns + ")"
}
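// Example (illustrative): `30.seconds` lowers to `(30 * 1000000000)` and
// `1.hour` to `(1 * 3600000000000)`, assuming the parser stores the count
// and unit as the strings "30"/"seconds" and "1"/"hour".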
if kind == "Float" {
// JS numbers are already doubles; no bit-cast trick needed.
let v: String = expr["value"]
return v
}
if kind == "Str" {
let v: String = expr["value"]
return js_str_lit(v)
}
if kind == "Bool" {
let v: String = expr["value"]
if v == "true" { return "true" }
return "false"
}
if kind == "Nil" {
return "null"
}
if kind == "Ident" {
let name: String = expr["name"]
return name
}
if kind == "Not" {
let inner = expr["inner"]
let inner_c: String = js_cg_expr(inner)
return "!" + inner_c
}
if kind == "Neg" {
let inner = expr["inner"]
let inner_c: String = js_cg_expr(inner)
return "(-" + inner_c + ")"
}
if kind == "BinOp" {
let op: String = expr["op"]
let left = expr["left"]
let right = expr["right"]
let left_c: String = js_cg_expr(left)
let right_c: String = js_cg_expr(right)
let left_kind: String = left["expr"]
let right_kind: String = right["expr"]
// Plus dispatch: same shape as the C backend, but we route through
// el_str_concat for the string-concat path (its JS impl coerces
// and matches C's behavior). Arithmetic uses bare JS `+`.
if op == "Plus" {
if left_kind == "Str" {
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
if right_kind == "Str" {
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
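if left_kind == "Int" {
return "(" + left_c + " + " + right_c + ")"
}
if right_kind == "Int" {
return "(" + left_c + " + " + right_c + ")"
}
// Float literals take the arithmetic path too; otherwise `1.5 + x` would
// fall through to the el_str_concat fallback below.
if left_kind == "Float" {
return "(" + left_c + " + " + right_c + ")"
}
if right_kind == "Float" {
return "(" + left_c + " + " + right_c + ")"
}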
if left_kind == "Ident" {
if right_kind == "Ident" {
let lname: String = left["name"]
let rname: String = right["name"]
if js_is_int_name(lname) {
if js_is_int_name(rname) {
return "(" + left_c + " + " + right_c + ")"
}
}
}
}
if left_kind == "Ident" {
if right_kind == "Call" {
let lname: String = left["name"]
if js_is_int_name(lname) {
if js_is_int_call(right) {
return "(" + left_c + " + " + right_c + ")"
}
}
}
}
if right_kind == "Ident" {
if left_kind == "Call" {
let rname: String = right["name"]
if js_is_int_name(rname) {
if js_is_int_call(left) {
return "(" + left_c + " + " + right_c + ")"
}
}
}
}
if left_kind == "Call" {
if right_kind == "Call" {
if js_is_int_call(left) {
if js_is_int_call(right) {
return "(" + left_c + " + " + right_c + ")"
}
}
}
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
if right_kind == "Call" {
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
// Fallback: when in doubt, route through el_str_concat. JS's
// own + handles strings and numbers natively, but el_str_concat
// gives us a single point of control if behavior needs to diverge.
if left_kind == "Ident" {
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
if right_kind == "Ident" {
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
}
// Equality dispatch: the C backend disambiguates via str_eq for
// strings and == for ints. JS does both with === if we know
// the types are uniform; for ambiguous identifier pairs we
// route through str_eq for safety (it falls back to === in JS).
if op == "EqEq" {
if left_kind == "Int" { return "(" + left_c + " === " + right_c + ")" }
if right_kind == "Int" { return "(" + left_c + " === " + right_c + ")" }
if left_kind == "Bool" { return "(" + left_c + " === " + right_c + ")" }
if right_kind == "Bool" { return "(" + left_c + " === " + right_c + ")" }
if left_kind == "Nil" { return "(" + left_c + " === " + right_c + ")" }
if right_kind == "Nil" { return "(" + left_c + " === " + right_c + ")" }
if left_kind == "Ident" {
if right_kind == "Ident" {
let lname: String = left["name"]
let rname: String = right["name"]
if js_is_int_name(lname) {
if js_is_int_name(rname) {
return "(" + left_c + " === " + right_c + ")"
}
}
}
}
if left_kind == "Str" { return "str_eq(" + left_c + ", " + right_c + ")" }
if right_kind == "Str" { return "str_eq(" + left_c + ", " + right_c + ")" }
// Default: === (works for strings, numbers, bools in JS)
return "(" + left_c + " === " + right_c + ")"
}
if op == "NotEq" {
if left_kind == "Int" { return "(" + left_c + " !== " + right_c + ")" }
if right_kind == "Int" { return "(" + left_c + " !== " + right_c + ")" }
if left_kind == "Bool" { return "(" + left_c + " !== " + right_c + ")" }
if right_kind == "Bool" { return "(" + left_c + " !== " + right_c + ")" }
if left_kind == "Nil" { return "(" + left_c + " !== " + right_c + ")" }
if right_kind == "Nil" { return "(" + left_c + " !== " + right_c + ")" }
if left_kind == "Ident" {
if right_kind == "Ident" {
let lname: String = left["name"]
let rname: String = right["name"]
if js_is_int_name(lname) {
if js_is_int_name(rname) {
return "(" + left_c + " !== " + right_c + ")"
}
}
}
}
if left_kind == "Str" { return "!str_eq(" + left_c + ", " + right_c + ")" }
if right_kind == "Str" { return "!str_eq(" + left_c + ", " + right_c + ")" }
return "(" + left_c + " !== " + right_c + ")"
}
let op_c: String = js_binop(op)
return "(" + left_c + " " + op_c + " " + right_c + ")"
}
if kind == "Call" {
let func = expr["func"]
let args = expr["args"]
let arity: Int = native_list_len(args)
let func_kind: String = func["expr"]
let args_parts: [String] = native_list_empty()
let i = 0
while i < arity {
let arg = native_list_get(args, i)
let arg_c: String = js_cg_expr(arg)
let args_parts = native_list_append(args_parts, arg_c)
let i = i + 1
}
let args_c: String = str_join(args_parts, ", ")
if func_kind == "Ident" {
let fn_name: String = func["name"]
return fn_name + "(" + args_c + ")"
}
if func_kind == "Field" {
// El's `obj.method(args)` becomes `method(obj, args)`, the same
// convention as the C backend. The runtime exports method
// shortforms (append, len, get, map_get, map_set) that match.
let obj = func["object"]
let field: String = func["field"]
let obj_c: String = js_cg_expr(obj)
if arity > 0 {
return field + "(" + obj_c + ", " + args_c + ")"
}
return field + "(" + obj_c + ")"
}
let fn_c: String = js_cg_expr(func)
return fn_c + "(" + args_c + ")"
}
if kind == "Field" {
// El's `obj.foo` lowers to `el_get_field(obj, "foo")` rather than a bare
// JS `obj["foo"]`. The runtime helper returns null (the JS counterpart of
// EL_NULL) for missing keys and non-object receivers.
let obj = expr["object"]
let field: String = expr["field"]
let obj_c: String = js_cg_expr(obj)
return "el_get_field(" + obj_c + ", " + js_str_lit(field) + ")"
}
if kind == "Index" {
// Map vs list dispatch on the index expression kind, same as C.
let obj = expr["object"]
let idx = expr["index"]
let obj_c: String = js_cg_expr(obj)
let idx_c: String = js_cg_expr(idx)
let idx_kind: String = idx["expr"]
if str_eq(idx_kind, "Str") {
return "el_get_field(" + obj_c + ", " + idx_c + ")"
}
return "el_list_get(" + obj_c + ", " + idx_c + ")"
}
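// Example (illustrative, hypothetical names): `m["k"]` lowers to
// `el_get_field(m, "k")`, while `xs[i]` lowers to `el_list_get(xs, i)`.
// A map indexed by a non-literal key therefore takes the list path.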
if kind == "Array" {
let elems = expr["elems"]
let n: Int = native_list_len(elems)
if n == 0 { return "[]" }
let items_parts: [String] = native_list_empty()
let i = 0
while i < n {
let elem = native_list_get(elems, i)
let elem_c: String = js_cg_expr(elem)
let items_parts = native_list_append(items_parts, elem_c)
let i = i + 1
}
return "[" + str_join(items_parts, ", ") + "]"
}
if kind == "Map" {
let pairs = expr["pairs"]
let n: Int = native_list_len(pairs)
if n == 0 { return "{}" }
let items_parts: [String] = native_list_empty()
let i = 0
while i < n {
let pair = native_list_get(pairs, i)
let key: String = pair["key"]
let val = pair["value"]
let val_c: String = js_cg_expr(val)
let items_parts = native_list_append(items_parts, js_str_lit(key) + ": " + val_c)
let i = i + 1
}
return "{" + str_join(items_parts, ", ") + "}"
}
if kind == "Try" {
let inner = expr["inner"]
return js_cg_expr(inner)
}
if kind == "If" {
let cond = expr["cond"]
let cond_c: String = js_cg_expr(cond)
// If as expression: lowered to a ternary. The branch bodies are not
// emitted yet, so the result is a placeholder 1 or 0; this matches the
// C backend's if-expr stub.
return "(" + cond_c + " ? 1 : 0)"
}
if kind == "Match" {
return js_cg_match(expr)
}
"null"
}
// Match codegen (basic)
//
// Lower a match expression to an IIFE containing a chain of guarded
// returns. Works for LitInt / LitStr / LitBool / Wildcard / Binding
// patterns. Tagged-union destructuring is not implemented; such patterns
// are stubbed and fall through to the wildcard path.
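//
// Illustrative output shape (assuming this is the first match compiled, so
// the counter yields 1, and hypothetical El arms `1 => "a"` and `_ => "b"`
// over a subject `x`):
//   ((_match_subj_1) => { if (_match_subj_1 === 1) return ("a"); return ("b"); return null; })(x)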
fn js_next_match_id() -> String {
let csv: String = state_get("__js_match_counter")
let n = 0
if !str_eq(csv, "") {
let n = str_to_int(csv)
}
let n = n + 1
state_set("__js_match_counter", native_int_to_str(n))
native_int_to_str(n)
}
fn js_cg_match(expr: Map<String, Any>) -> String {
let subject = expr["subject"]
let arms = expr["arms"]
let subj_c: String = js_cg_expr(subject)
let id: String = js_next_match_id()
let subj_var: String = "_match_subj_" + id
let parts: [String] = native_list_empty()
let parts = native_list_append(parts, "((" + subj_var + ") => { ")
let n: Int = native_list_len(arms)
let i = 0
while i < n {
let arm = native_list_get(arms, i)
let pat = arm["pattern"]
let body = arm["body"]
let pkind: String = pat["pattern"]
let body_c: String = js_cg_expr(body)
if str_eq(pkind, "Wildcard") {
let parts = native_list_append(parts, "return (" + body_c + "); ")
} else {
if str_eq(pkind, "Binding") {
let bname: String = pat["name"]
let parts = native_list_append(parts, "{ const " + bname + " = " + subj_var + "; return (" + body_c + "); } ")
} else {
if str_eq(pkind, "LitInt") {
let v: String = pat["value"]
let parts = native_list_append(parts, "if (" + subj_var + " === " + v + ") return (" + body_c + "); ")
} else {
if str_eq(pkind, "LitStr") {
let v: String = pat["value"]
let parts = native_list_append(parts, "if (str_eq(" + subj_var + ", " + js_str_lit(v) + ")) return (" + body_c + "); ")
} else {
if str_eq(pkind, "LitBool") {
let v: String = pat["value"]
let bv = "false"
if str_eq(v, "true") { let bv = "true" }
let parts = native_list_append(parts, "if (" + subj_var + " === " + bv + ") return (" + body_c + "); ")
} else {
// unknown pattern -> wildcard
let parts = native_list_append(parts, "return (" + body_c + "); ")
}
}
}
}
}
let i = i + 1
}
let parts = native_list_append(parts, "return null; })(" + subj_c + ")")
str_join(parts, "")
}
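The lowering above can be modeled outside the compiler. The following is a minimal Python sketch, not the compiler's own code: `lower_match` and its tuple-based arm format are hypothetical, and the `js_str_lit` stand-in only assumes that the real helper quotes and escapes a JS string. It builds the same IIFE text from a list of fragments and joins once at the end.

```python
def js_str_lit(s: str) -> str:
    """Hypothetical stand-in for the real js_str_lit: quote and escape a JS string."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lower_match(subject_js: str, arms: list[tuple[str, str, str]], match_id: int) -> str:
    """Build the IIFE text for a match expression.

    Each arm is (pattern_kind, pattern_value, body_js). pattern_value is the
    binding name for "Binding" and is ignored for "Wildcard".
    """
    subj = f"_match_subj_{match_id}"
    parts = [f"(({subj}) => {{ "]
    for kind, value, body in arms:
        if kind == "Binding":
            parts.append(f"{{ const {value} = {subj}; return ({body}); }} ")
        elif kind == "LitInt":
            parts.append(f"if ({subj} === {value}) return ({body}); ")
        elif kind == "LitStr":
            parts.append(f"if (str_eq({subj}, {js_str_lit(value)})) return ({body}); ")
        elif kind == "LitBool":
            flag = "true" if value == "true" else "false"
            parts.append(f"if ({subj} === {flag}) return ({body}); ")
        else:
            # Wildcard and any unknown pattern kind: unconditional return.
            parts.append(f"return ({body}); ")
    parts.append(f"return null; }})({subject_js})")
    return "".join(parts)


lowered = lower_match("x", [("LitInt", "1", '"one"'), ("Wildcard", "", '"other"')], 1)
print(lowered)
```

Because every arm ends in `return`, the trailing `return null;` is only reached when no arm matches and no wildcard is present.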
// Variable scope tracking
//
// El allows `let x = ...` to redeclare in the same scope. JS would throw
// with `let` (Identifier already declared). We track declared names and
// emit bare `x = ...` on redeclaration, `let x = ...` first time.
fn js_list_contains(lst: [String], s: String) -> Bool {
let n: Int = native_list_len(lst)
let i = 0
while i < n {
let item: String = native_list_get(lst, i)
if item == s { return true }
let i = i + 1
}
false
}
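As an illustration of the redeclaration rule described above, here is a small Python sketch. The `emit_let` helper is hypothetical and only models the decision; it does not reproduce the int-name tracking that the real `js_cg_stmt` also performs.

```python
def emit_let(name: str, value_js: str, declared: list[str]) -> tuple[str, list[str]]:
    """Return the JS line for an El `let` plus the updated declared-name list."""
    if name in declared:
        # Redeclaration in the same scope: plain assignment, no `let`.
        return f"{name} = {value_js};", declared
    return f"let {name} = {value_js};", declared + [name]


declared: list[str] = []
lines = []
for name, value in [("i", "0"), ("i", "(i + 1)"), ("total", "i")]:
    line, declared = emit_let(name, value, declared)
    lines.append(line)
print("\n".join(lines))
```

Returning a new list rather than mutating the argument mirrors the way the El code threads `declared` through each statement and clones it for nested blocks.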
// Statement codegen
fn js_cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [String] {
let kind: String = stmt["stmt"]
if kind == "Let" {
let name: String = stmt["name"]
let val = stmt["value"]
let val_c: String = js_cg_expr(val)
let ltype: String = stmt["type"]
if str_eq(ltype, "Int") {
js_add_int_name(name)
}
let vk: String = val["expr"]
if str_eq(vk, "Int") {
js_add_int_name(name)
}
if js_list_contains(declared, name) {
js_emit_line(indent + name + " = " + val_c + ";")
return declared
} else {
// Use `let` (not `const`): El semantics allow rebinding.
js_emit_line(indent + "let " + name + " = " + val_c + ";")
return native_list_append(declared, name)
}
}
if kind == "Return" {
let val = stmt["value"]
let val_kind: String = val["expr"]
if val_kind == "Nil" {
js_emit_line(indent + "return null;")
} else {
let val_c: String = js_cg_expr(val)
js_emit_line(indent + "return " + val_c + ";")
}
return declared
}
// Bare reassignment: `name = expr`. Mirrors the C backend: emits a
// plain JS assignment without `let` so we don't shadow an outer binding.
if kind == "Assign" {
let name: String = stmt["name"]
let val = stmt["value"]
let val_c: String = js_cg_expr(val)
js_emit_line(indent + name + " = " + val_c + ";")
return declared
}
if kind == "Expr" {
let val = stmt["value"]
let val_kind: String = val["expr"]
if val_kind == "If" {
js_cg_if_stmt(val, indent, declared)
return declared
}
if val_kind == "For" {
js_cg_for_stmt(val, indent, declared)
return declared
}
let val_c: String = js_cg_expr(val)
js_emit_line(indent + val_c + ";")
return declared
}
if kind == "While" {
let cond = stmt["cond"]
let body = stmt["body"]
let cond_c: String = js_cg_expr(cond)
let cond_c = js_strip_outer_parens(cond_c)
js_emit_line(indent + "while (" + cond_c + ") {")
js_cg_stmts(body, indent + " ", native_list_clone(declared))
js_emit_line(indent + "}")
return declared
}
if kind == "For" {
let item: String = stmt["item"]
let list_expr = stmt["list"]
let body = stmt["body"]
js_cg_for_body(item, list_expr, body, indent, declared)
return declared
}
if kind == "FnDef" { return declared }
if kind == "TypeDef" { return declared }
if kind == "EnumDef" { return declared }
if kind == "Import" { return declared }
if kind == "CgiBlock" {
// CGI blocks compile to a no-op + warning comment in JS target.
// The runtime cgi identity is server-side; UI code is not a CGI
// principal. See spec/codegen-js.md §7.
let cname: String = stmt["name"]
js_emit_line(indent + "// cgi block '" + cname + "' — no-op in JS target (server-side concept)")
return declared
}
if kind == "ServiceBlock" {
let sname: String = stmt["name"]
js_emit_line(indent + "// service block '" + sname + "' — no-op in JS target")
return declared
}
declared
}
// Strip a single layer of surrounding parentheses from a JS expression string.
fn js_strip_outer_parens(s: String) -> String {
let chars: [String] = native_string_chars(s)
let n: Int = native_list_len(chars)
if n < 2 { return s }
let first: String = native_list_get(chars, 0)
let last: String = native_list_get(chars, n - 1)
if first == "(" {
if last == ")" {
let depth = 1
let i = 1
let balanced = true
while i < n - 1 {
let ch: String = native_list_get(chars, i)
if ch == "(" {
let depth = depth + 1
}
if ch == ")" {
let depth = depth - 1
if depth == 0 {
let balanced = false
let i = n
}
}
let i = i + 1
}
if balanced {
return str_slice(s, 1, n - 1)
}
}
}
s
}
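The balance check is the subtle part of this helper: the outer pair may only be removed when the first `(` is closed by the final `)`. A Python sketch of the same logic follows (the function name is reused for readability; this is a model, not the El implementation). Like the original, it does not account for parentheses inside string literals.

```python
def strip_outer_parens(s: str) -> str:
    """Drop one surrounding pair of parentheses if it wraps the whole string."""
    if len(s) < 2 or s[0] != "(" or s[-1] != ")":
        return s
    depth = 1
    for ch in s[1:-1]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # The opening paren closed before the end, e.g. "(a) + (b)".
                return s
    return s[1:-1]


print(strip_outer_parens("(i < n)"))
print(strip_outer_parens("(a) + (b)"))
```

The first call prints `i < n`; the second prints the input unchanged, because stripping would produce the invalid `a) + (b`.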
fn js_cg_if_stmt(expr: Map<String, Any>, indent: String, declared: [String]) -> Void {
let cond = expr["cond"]
let then_stmts = expr["then"]
let else_stmts = expr["else"]
let has_else: Bool = expr["has_else"]
let cond_c: String = js_cg_expr(cond)
let cond_c = js_strip_outer_parens(cond_c)
js_emit_line(indent + "if (" + cond_c + ") {")
js_cg_stmts(then_stmts, indent + " ", native_list_clone(declared))
if has_else {
js_emit_line(indent + "} else {")
js_cg_stmts(else_stmts, indent + " ", native_list_clone(declared))
}
js_emit_line(indent + "}")
}
fn js_cg_for_body(item: String, list_expr: Map<String, Any>, body: [Map<String, Any>], indent: String, declared: [String]) -> Void {
let list_c: String = js_cg_expr(list_expr)
js_emit_line(indent + "for (const " + item + " of " + list_c + ") {")
let body_decl = native_list_clone(declared)
let body_decl = native_list_append(body_decl, item)
js_cg_stmts(body, indent + " ", body_decl)
js_emit_line(indent + "}")
}
fn js_cg_for_stmt(expr: Map<String, Any>, indent: String, declared: [String]) -> Void {
let item: String = expr["item"]
let list_expr = expr["list"]
let body = expr["body"]
js_cg_for_body(item, list_expr, body, indent, declared)
}
fn js_cg_stmts(stmts: [Map<String, Any>], indent: String, declared: [String]) -> [String] {
let n: Int = native_list_len(stmts)
let i = 0
let decl = declared
while i < n {
let stmt = native_list_get(stmts, i)
let decl = js_cg_stmt(stmt, indent, decl)
let i = i + 1
}
decl
}
// Function declaration codegen
fn js_params_str(params: [Map<String, Any>]) -> String {
let n: Int = native_list_len(params)
if n == 0 { return "" }
let parts: [String] = native_list_empty()
let i = 0
while i < n {
let param = native_list_get(params, i)
let name: String = param["name"]
let parts = native_list_append(parts, name)
let i = i + 1
}
str_join(parts, ", ")
}
// Same implicit-return transform as the C backend.
fn js_transform_implicit_return(body: [Map<String, Any>]) -> [Map<String, Any>] {
let n: Int = native_list_len(body)
if n == 0 { return body }
let last: Map<String, Any> = native_list_get(body, n - 1)
let last_kind: String = last["stmt"]
if last_kind == "Expr" {
let val = last["value"]
let val_kind: String = val["expr"]
if val_kind == "If" { return body }
if val_kind == "For" { return body }
let new_body: [Map<String, Any>] = native_list_empty()
let i = 0
while i < n - 1 {
let new_body = native_list_append(new_body, native_list_get(body, i))
let i = i + 1
}
let return_stmt: Map<String, Any> = { "stmt": "Return", "value": val }
let new_body = native_list_append(new_body, return_stmt)
return new_body
}
body
}
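A Python model of the implicit-return transform may help when reasoning about which trailing statements are rewritten. The dict shapes below follow the AST keys visible in this file (`stmt`, `expr`, `value`); everything else about the sketch is an assumption.

```python
def transform_implicit_return(body: list[dict]) -> list[dict]:
    """Rewrite a trailing expression statement into an explicit Return."""
    if not body:
        return body
    last = body[-1]
    if last["stmt"] != "Expr":
        return body
    value = last["value"]
    # If/For in statement position are control flow, not a value to return.
    if value["expr"] in ("If", "For"):
        return body
    # Build a new list so the caller's body is left untouched.
    return body[:-1] + [{"stmt": "Return", "value": value}]


body = [
    {"stmt": "Let", "name": "n", "value": {"expr": "Int", "value": "1"}},
    {"stmt": "Expr", "value": {"expr": "Ident", "name": "n"}},
]
print(transform_implicit_return(body)[-1]["stmt"])
```

In the source, `js_cg_fn` applies the transform only when the declared return type is not `Void`, so side-effecting trailing calls in `Void` functions stay plain statements.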
fn js_cg_fn(stmt: Map<String, Any>) -> Void {
let fn_name: String = stmt["name"]
let params = stmt["params"]
let body = stmt["body"]
let ret_type: String = stmt["ret_type"]
let params_str: String = js_params_str(params)
js_build_int_names_for_params(params)
// Special-case `fn main`: emit as a regular function and call it
// at module bottom (after all top-level statements). This matches
// the C backend's behavior where `fn main` is the entry point.
if fn_name == "main" {
js_emit_line("function main(" + params_str + ") {")
} else {
js_emit_line("function " + fn_name + "(" + params_str + ") {")
}
let decl = native_list_empty()
let np: Int = native_list_len(params)
let pi = 0
while pi < np {
let param = native_list_get(params, pi)
let pname: String = param["name"]
let decl = native_list_append(decl, pname)
let pi = pi + 1
}
let body_xformed = body
if !str_eq(ret_type, "Void") {
let body_xformed = js_transform_implicit_return(body)
}
js_cg_stmts(body_xformed, " ", decl)
js_emit_line("}")
js_emit_blank()
}
// Top-level codegen
fn js_is_fndef(stmt: Map<String, Any>) -> Bool {
let kind: String = stmt["stmt"]
if kind == "FnDef" { return true }
false
}
fn js_is_top_level_decl(stmt: Map<String, Any>) -> Bool {
let kind: String = stmt["stmt"]
if kind == "TypeDef" { return true }
if kind == "EnumDef" { return true }
if kind == "Import" { return true }
if kind == "CgiBlock" { return true }
if kind == "ServiceBlock" { return true }
false
}
// Entry point
fn codegen_js(stmts: [Map<String, Any>], source: String) -> String {
// Reset per-compile state.
state_set("__js_int_names", "")
state_set("__js_match_counter", "")
// Preamble: load the runtime via a single side-effecting import that
// populates globalThis.__el. The runtime path is resolved relative to the
// generated output; users running `elc --target=js` are responsible for
// ensuring el_runtime.js is reachable. Inlining the runtime to make the
// output self-contained is a follow-up.
js_emit_line("// Generated by elc --target=js")
js_emit_line("// Runtime: foundation/el/el-compiler/runtime/el_runtime.js")
js_emit_line("import \"./el_runtime.js\";")
js_emit_line("const {")
js_emit_line(" println, print, el_str_concat, str_concat, str_eq, str_starts_with, str_ends_with,")
js_emit_line(" str_len, int_to_str, str_to_int, str_slice, str_contains, str_replace,")
js_emit_line(" str_to_upper, str_to_lower, str_trim, str_index_of, str_split, str_char_at,")
js_emit_line(" str_char_code, str_lower, str_upper, el_abs, el_max, el_min,")
js_emit_line(" el_list_new, el_list_len, el_list_get, el_list_append, el_list_empty, el_list_clone,")
js_emit_line(" list_push, list_join, list_range,")
js_emit_line(" el_map_new, el_get_field, el_map_get, el_map_set,")
js_emit_line(" http_get, http_post, http_post_json,")
js_emit_line(" fs_read, fs_write, fs_list,")
js_emit_line(" json_parse, json_stringify, json_get, json_get_string, json_get_int,")
js_emit_line(" time_now, time_now_utc, sleep_ms, bool_to_str, exit_program,")
js_emit_line(" el_retain, el_release,")
js_emit_line(" append, len, get, map_get, map_set,")
js_emit_line(" native_list_get, native_list_len, native_list_append, native_list_empty,")
js_emit_line(" native_list_clone, native_string_chars, native_int_to_str,")
js_emit_line(" args, state_set, state_get, state_del, state_keys, env,")
js_emit_line(" dharma_connect, dharma_send, dharma_emit, dharma_field, dharma_activate,")
js_emit_line(" engram_node, engram_search, engram_activate,")
js_emit_line(" llm_call, llm_call_system,")
js_emit_line("} = globalThis.__el;")
js_emit_blank()
// Function definitions
let n: Int = native_list_len(stmts)
let i = 0
while i < n {
let stmt = native_list_get(stmts, i)
if js_is_fndef(stmt) {
js_cg_fn(stmt)
}
let i = i + 1
}
// Top-level statements (those that are not FnDef and not declarative)
// run at module load. If the program defines `fn main`, we additionally
// call main() at the end so the C-backend mental model of "fn main is
// the entry point" carries over.
let has_main = false
let i = 0
while i < n {
let stmt = native_list_get(stmts, i)
let sk: String = stmt["stmt"]
if str_eq(sk, "FnDef") {
let fn_name: String = stmt["name"]
if str_eq(fn_name, "main") {
let has_main = true
}
}
let i = i + 1
}
let main_decl = native_list_empty()
let i = 0
while i < n {
let stmt = native_list_get(stmts, i)
if js_is_fndef(stmt) {
// skip
} else {
if js_is_top_level_decl(stmt) {
// skip
} else {
let main_decl = js_cg_stmt(stmt, "", main_decl)
}
}
let i = i + 1
}
if has_main {
js_emit_blank()
js_emit_line("main();")
}
// Return empty string; output was streamed via println
""
}
+294 -147
@@ -1,4 +1,4 @@
// codegen.el El compiler C source code generator
// codegen.el - El compiler C source code generator
//
// Input: list of AST statement maps (from parser.el)
// Output: C source printed to stdout (streamed, one line at a time)
@@ -7,37 +7,90 @@
// Functions map directly to C functions; top-level statements become main().
//
// Entry point: fn codegen(stmts: [Map<String, Any>], source: String) -> String
// Returns "" output goes to stdout via println().
// Returns "" - output goes to stdout via println().
//
// Streaming output avoids O(n²) string concatenation: each emitted line is
// Streaming output avoids O(n^2) string concatenation: each emitted line is
// printed immediately rather than appended to a growing string.
// String helpers
// -- String helpers ------------------------------------------------------------
// Escape a C string literal (double-quotes and backslashes).
// Hex-encode a single nibble (0-15) as a lowercase hex character.
fn nibble_to_hex(n: Int) -> String {
str_char_at("0123456789abcdef", n)
}
// Encode a byte value (0-255) as a two-character hex string.
fn byte_to_hex2(b: Int) -> String {
let hi: Int = (b / 16)
let lo: Int = (b - hi * 16)
nibble_to_hex(hi) + nibble_to_hex(lo)
}
// Return true if the byte value is a C hex digit (0-9, a-f, A-F).
// Used to determine whether a \xNN escape needs a string-literal split
// to prevent the C compiler from greedily consuming following hex chars.
fn is_hex_digit_byte(b: Int) -> Bool {
if b >= 48 { if b <= 57 { return true } } // 0-9
if b >= 65 { if b <= 70 { return true } } // A-F
if b >= 97 { if b <= 102 { return true } } // a-f
false
}
fn c_escape(s: String) -> String {
let chars: [String] = native_string_chars(s)
let total: Int = native_list_len(chars)
let out = ""
let i = 0
// Use index-based byte scanning via str_char_code(s, i) and str_char_at(s, i).
// This avoids native_string_chars + str_join, which corrupts high-byte (>= 0x80)
// characters because list_join's looks_like_string heuristic rejects strings
// whose first byte is >= 0x7F and emits them as decimal pointer values instead.
//
// IMPORTANT: after a \xNN hex escape, if the next byte is a hex digit
// (0-9, a-f, A-F), we emit `""` to split the C string literal so the C
// compiler does not greedily read extra hex digits as part of the escape.
// E.g. "\xad" followed by "bamos" must become "\xad" "bamos" because 'b'
// is a hex digit and C would otherwise read "\xadb" (= 0xADB, out of range).
let total: Int = str_len(s)
let parts: [String] = native_list_empty()
let i: Int = 0
let prev_was_hex_escape: Bool = false
while i < total {
let ch: String = native_list_get(chars, i)
if ch == "\"" {
let out = out + "\\\""
let bval: Int = str_char_code(s, i)
// If the previous token was a \xNN escape and the current byte is a
// hex digit, insert an empty string literal ("") to break the escape.
if prev_was_hex_escape {
if is_hex_digit_byte(bval) {
let parts = native_list_append(parts, "\"\"")
}
}
let prev_was_hex_escape = false
if bval == 34 {
// 34 = '"'
let parts = native_list_append(parts, "\\\"")
} else {
if ch == "\\" {
let out = out + "\\\\"
if bval == 92 {
// 92 = '\\'
let parts = native_list_append(parts, "\\\\")
} else {
if ch == "\n" {
let out = out + "\\n"
if bval == 10 {
// 10 = '\n'
let parts = native_list_append(parts, "\\n")
} else {
if ch == "\r" {
let out = out + "\\r"
if bval == 13 {
// 13 = '\r'
let parts = native_list_append(parts, "\\r")
} else {
if ch == "\t" {
let out = out + "\\t"
if bval == 9 {
// 9 = '\t'
let parts = native_list_append(parts, "\\t")
} else {
let out = out + ch
if bval >= 128 {
// Escape non-ASCII bytes (>= 0x80) as \xNN so
// Clang does not misinterpret multi-byte UTF-8
// sequences in C string literals.
let parts = native_list_append(parts, "\\x" + byte_to_hex2(bval))
let prev_was_hex_escape = true
} else {
let parts = native_list_append(parts, str_char_at(s, i))
}
}
}
}
@@ -45,14 +98,14 @@ fn c_escape(s: String) -> String {
}
let i = i + 1
}
out
str_join(parts, "")
}
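The hex-escape split in the new `c_escape` can be checked with a small model. This Python sketch operates on raw bytes and follows the rules stated in the comments above; it is an approximation for illustration, not a port of the El function.

```python
def c_escape(data: bytes) -> str:
    """Escape raw bytes for use inside a C string literal."""
    simple = {0x22: '\\"', 0x5C: "\\\\", 0x0A: "\\n", 0x0D: "\\r", 0x09: "\\t"}
    hex_digits = b"0123456789abcdefABCDEF"
    parts = []
    prev_was_hex_escape = False
    for b in data:
        if prev_was_hex_escape and b in hex_digits:
            # Close and reopen the literal so the \xNN escape ends here.
            parts.append('""')
        prev_was_hex_escape = False
        if b in simple:
            parts.append(simple[b])
        elif b >= 0x80:
            # Non-ASCII byte: emit as a two-digit lowercase hex escape.
            parts.append(f"\\x{b:02x}")
            prev_was_hex_escape = True
        else:
            parts.append(chr(b))
    return "".join(parts)


# U+00AD (soft hyphen) is the two UTF-8 bytes C2 AD; the next byte 'b' is a hex digit.
print(c_escape("\u00adbamos".encode("utf-8")))
```

This prints `\xc2\xad""bamos`. Adjacent string literals are concatenated by the C compiler, so the inserted `""` changes nothing except where the `\xad` escape stops.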
fn c_str_lit(s: String) -> String {
"\"" + c_escape(s) + "\""
}
// Type mapping
// -- Type mapping --------------------------------------------------------------
fn el_type_to_c(type_str: String) -> String {
if type_str == "String" { return "const char*" }
@@ -64,7 +117,7 @@ fn el_type_to_c(type_str: String) -> String {
"void*"
}
// Code emission
// -- Code emission -------------------------------------------------------------
//
// emit_line/emit_blank stream output directly via println.
// This avoids building a large string in memory.
@@ -77,7 +130,7 @@ fn emit_blank() -> Void {
println("")
}
// Operator helpers
// -- Operator helpers ----------------------------------------------------------
fn binop_to_c(op: String) -> String {
if op == "Plus" { return "+" }
@@ -95,11 +148,11 @@ fn binop_to_c(op: String) -> String {
op
}
// Expression codegen
// -- Expression codegen --------------------------------------------------------
//
// cg_expr returns a C expression string (not a statement).
// duration_unit_nanos multiplier from a postfix-literal unit name to
// duration_unit_nanos - multiplier from a postfix-literal unit name to
// nanoseconds. Singular and plural forms collapse to the same multiplier;
// the parser already restricted `unit` to the set is_duration_unit accepts.
// Returns the multiplier as a decimal string suitable for splicing into
@@ -130,7 +183,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
return v
}
// DurationLit postfix-literal time value (e.g. 30.seconds, 1.hour).
// DurationLit - postfix-literal time value (e.g. 30.seconds, 1.hour).
// Lowered to a literal int64 nanosecond count, wrapped in the runtime
// entry point so the intent is explicit at the C level. The arithmetic
// is fully constant-folded by any optimising C compiler.
@@ -144,7 +197,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
if kind == "Float" {
// Wrap Float literals in el_from_float() so the bit pattern is
// preserved through the el_val_t (int64) slot. Without this,
// implicit doubleint64 conversion in C truncates `0.8` to `0`
// implicit double->int64 conversion in C truncates `0.8` to `0`
// when passed to a builtin that expects el_val_t.
let v: String = expr["value"]
return "el_from_float(" + v + ")"
@@ -191,7 +244,26 @@ fn cg_expr(expr: Map<String, Any>) -> String {
let left_kind: String = left["expr"]
let right_kind: String = right["expr"]
// Temporal-type dispatch (Instant + Duration first-class)
// -- String/equality fast-path: skip O(N^2) temporal traversals --------
// The 10 temporal predicates below each recurse into the left subtree:
// O(depth) state_get calls per predicate, O(N^2) total for a chain of N
// string-concat BinOps (e.g. the 70-100-part HTML chains in soul.el).
// When either operand is a bare Str literal the result is always concat
// or str_eq - no temporal dispatch is possible. Exit immediately.
if str_eq(op, "Plus") {
if str_eq(left_kind, "Str") { return "el_str_concat(" + left_c + ", " + right_c + ")" }
if str_eq(right_kind, "Str") { return "el_str_concat(" + left_c + ", " + right_c + ")" }
}
if str_eq(op, "EqEq") {
if str_eq(left_kind, "Str") { return "str_eq(" + left_c + ", " + right_c + ")" }
if str_eq(right_kind, "Str") { return "str_eq(" + left_c + ", " + right_c + ")" }
}
if str_eq(op, "NotEq") {
if str_eq(left_kind, "Str") { return "!str_eq(" + left_c + ", " + right_c + ")" }
if str_eq(right_kind, "Str") { return "!str_eq(" + left_c + ", " + right_c + ")" }
}
// -- Temporal-type dispatch (Instant + Duration first-class) --------
// Run BEFORE the int / string / generic paths so typed temporal
// operands route through the runtime wrappers and invalid combos
// become #error directives rather than silently falling through to
@@ -377,7 +449,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
if right_is_dur { return "el_duration_ne(" + left_c + ", " + right_c + ")" }
}
}
// Fall through let the existing path handle anything we
// Fall through - let the existing path handle anything we
// didn't explicitly cover (typically string-concat with a
// typed temporal value, e.g. for debug prints, which works
// because both share the int64 slot).
@@ -396,7 +468,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
// builtin, or BinOp arithmetic over Ints) participates in
// arithmetic, not string concat. Recursion into BinOp lets
// `a + b + c` (chained Int adds) and `acc * 16 + d` route to
// arithmetic instead of falling to el_str_concat both sides
// arithmetic instead of falling to el_str_concat - both sides
// are Int so the outer `+` is too.
if is_int_expr(left) {
if is_int_expr(right) {
@@ -417,7 +489,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
return "(" + left_c + " " + op_c + " " + right_c + ")"
}
// Otherwise: BinOp(+) with a Call/Ident side without int-typed
// evidence fall back to string concat (the historical default).
// evidence - fall back to string concat (the historical default).
if left_kind == "Call" {
return "el_str_concat(" + left_c + ", " + right_c + ")"
}
@@ -449,7 +521,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
// identifiers tracked in __int_names (typed Int via `let x: Int = ...`).
// Without the int-name check, `seen == idx` between two Int locals
// miscompiles to str_eq(seen, idx), strcmp'ing what are integer values
// dressed as char* segfault on the first non-printable byte.
// dressed as char* - segfault on the first non-printable byte.
if op == "EqEq" {
if left_kind == "Int" {
return "(" + left_c + " == " + right_c + ")"
@@ -565,17 +637,15 @@ fn cg_expr(expr: Map<String, Any>) -> String {
let arity: Int = native_list_len(args)
let func_kind: String = func["expr"]
let args_c = ""
let args_parts: [String] = native_list_empty()
let i = 0
while i < arity {
let arg = native_list_get(args, i)
let arg_c: String = cg_expr(arg)
if i > 0 {
let args_c = args_c + ", "
}
let args_c = args_c + arg_c
let args_parts = native_list_append(args_parts, arg_c)
let i = i + 1
}
let args_c: String = str_join(args_parts, ", ")
if func_kind == "Ident" {
let fn_name: String = func["name"]
@@ -585,17 +655,17 @@ fn cg_expr(expr: Map<String, Any>) -> String {
// violations to be emitted as #error directives at the
// top of the generated C, so cc fails with a clear msg.
cap_check_call(fn_name)
// Arity check against the builtin table refuse, with a clear
// Arity check against the builtin table - refuse, with a clear
// El-source message, when a known builtin gets the wrong arg
// count (e.g. `http_serve(port)` instead of `http_serve(port,
// handler)`). User-defined fns and variadic builtins pass
// through (builtin_arity returns -1).
arity_check_call(fn_name, arity)
// sleep(Duration) Phase 1 of the typed-time work. When the
// sleep(Duration) - Phase 1 of the typed-time work. When the
// single arg is provably a Duration we lower to el_sleep_duration
// so the runtime sees nanos directly. Existing sleep() callers
// that pass an Int still emit `sleep(<int>)`, which falls through
// to the no-such-symbol path those call sites must migrate to
// to the no-such-symbol path - those call sites must migrate to
// a typed Duration. Acceptable: the spec marks them out for an
// audit pass during Phase 1.
if str_eq(fn_name, "sleep") {
@@ -606,6 +676,20 @@ fn cg_expr(expr: Map<String, Any>) -> String {
}
}
}
// el_from_float takes a raw C double - do not wrap the float
// argument in el_from_float() again. Without this, the float
// literal codegen (which wraps every Float in el_from_float())
// produces el_from_float(el_from_float(0.7)) - double-encoded.
if str_eq(fn_name, "el_from_float") {
if arity == 1 {
let only_arg = native_list_get(args, 0)
let arg_kind: String = only_arg["expr"]
if str_eq(arg_kind, "Float") {
let v: String = only_arg["value"]
return "el_from_float(" + v + ")"
}
}
}
return fn_name + "(" + args_c + ")"
}
@@ -639,8 +723,8 @@ fn cg_expr(expr: Map<String, Any>) -> String {
// El programs use `t["field"]` for map access and `arr[i]` for
// list access. The parser emits the same Index node for both.
// Dispatch at codegen time on the index expression kind: string-
// literal index map field access (`el_get_field`); anything
// else list element access (`el_list_get`).
// literal index -> map field access (`el_get_field`); anything
// else -> list element access (`el_list_get`).
let obj = expr["object"]
let idx = expr["index"]
let obj_c: String = cg_expr(obj)
@@ -658,18 +742,15 @@ fn cg_expr(expr: Map<String, Any>) -> String {
// Empty literal: el_list_new(0, ) generates malformed C (trailing
// comma in a varargs call). Emit el_list_empty() directly.
if n == 0 { return "el_list_empty()" }
let items = ""
let items_parts: [String] = native_list_empty()
let i = 0
while i < n {
let elem = native_list_get(elems, i)
let elem_c: String = cg_expr(elem)
if i > 0 {
let items = items + ", "
}
let items = items + elem_c
let items_parts = native_list_append(items_parts, elem_c)
let i = i + 1
}
return "el_list_new(" + native_int_to_str(n) + ", " + items + ")"
return "el_list_new(" + native_int_to_str(n) + ", " + str_join(items_parts, ", ") + ")"
}
if kind == "Map" {
@@ -677,23 +758,20 @@ fn cg_expr(expr: Map<String, Any>) -> String {
let n: Int = native_list_len(pairs)
// Empty literal: `el_map_new(0, )` is malformed C (trailing comma in
// a varargs call). Emit `el_map_new(0)` directly so empty-map
// shadowing inside for/while/if bodies `let acc: Map = {}`
// shadowing inside for/while/if bodies - `let acc: Map = {}` -
// doesn't fail downstream cc with parse errors.
if n == 0 { return "el_map_new(0)" }
let items = ""
let items_parts: [String] = native_list_empty()
let i = 0
while i < n {
let pair = native_list_get(pairs, i)
let key: String = pair["key"]
let val = pair["value"]
let val_c: String = cg_expr(val)
if i > 0 {
let items = items + ", "
}
let items = items + c_str_lit(key) + ", " + val_c
let items_parts = native_list_append(items_parts, c_str_lit(key) + ", " + val_c)
let i = i + 1
}
return "el_map_new(" + native_int_to_str(n) + ", " + items + ")"
return "el_map_new(" + native_int_to_str(n) + ", " + str_join(items_parts, ", ") + ")"
}
if kind == "Try" {
@@ -712,7 +790,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
"EL_NULL"
}
// Match codegen
// -- Match codegen -------------------------------------------------------------
//
// Lower a match expression to a GCC/Clang statement-expression.
// A unique label suffix is allocated per match via state_set("__match_counter").
@@ -736,7 +814,9 @@ fn cg_match(expr: Map<String, Any>) -> String {
let subj_var: String = "_match_subj_" + id
let result_var: String = "_match_result_" + id
let done_label: String = "_match_done_" + id
let out: String = "({ el_val_t " + subj_var + " = " + subj_c + "; el_val_t " + result_var + " = 0; "
// Accumulate arm fragments into a list to avoid O(n^2) string growth.
let parts: [String] = native_list_empty()
let parts = native_list_append(parts, "({ el_val_t " + subj_var + " = " + subj_c + "; el_val_t " + result_var + " = 0; ")
let n: Int = native_list_len(arms)
let i = 0
while i < n {
@@ -746,19 +826,19 @@ fn cg_match(expr: Map<String, Any>) -> String {
let pkind: String = pat["pattern"]
let body_c: String = cg_expr(body)
if str_eq(pkind, "Wildcard") {
let out = out + "{ " + result_var + " = (" + body_c + "); goto " + done_label + "; } "
let parts = native_list_append(parts, "{ " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
} else {
if str_eq(pkind, "Binding") {
let bname: String = pat["name"]
let out = out + "{ el_val_t " + bname + " = " + subj_var + "; " + result_var + " = (" + body_c + "); goto " + done_label + "; } "
let parts = native_list_append(parts, "{ el_val_t " + bname + " = " + subj_var + "; " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
} else {
if str_eq(pkind, "LitInt") {
let v: String = pat["value"]
let out = out + "if (" + subj_var + " == " + v + ") { " + result_var + " = (" + body_c + "); goto " + done_label + "; } "
let parts = native_list_append(parts, "if (" + subj_var + " == " + v + ") { " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
} else {
if str_eq(pkind, "LitStr") {
let v: String = pat["value"]
let out = out + "if (str_eq(" + subj_var + ", EL_STR(" + c_str_lit(v) + "))) { " + result_var + " = (" + body_c + "); goto " + done_label + "; } "
let parts = native_list_append(parts, "if (str_eq(" + subj_var + ", EL_STR(" + c_str_lit(v) + "))) { " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
} else {
if str_eq(pkind, "LitBool") {
let v: String = pat["value"]
@@ -766,10 +846,10 @@ fn cg_match(expr: Map<String, Any>) -> String {
if str_eq(v, "true") {
let bv = "1"
}
let out = out + "if (" + subj_var + " == " + bv + ") { " + result_var + " = (" + body_c + "); goto " + done_label + "; } "
let parts = native_list_append(parts, "if (" + subj_var + " == " + bv + ") { " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
} else {
// unknown pattern wildcard
let out = out + "{ " + result_var + " = (" + body_c + "); goto " + done_label + "; } "
// unknown pattern -> wildcard
let parts = native_list_append(parts, "{ " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
}
}
}
@@ -777,11 +857,11 @@ fn cg_match(expr: Map<String, Any>) -> String {
}
let i = i + 1
}
let out = out + done_label + ":; " + result_var + "; })"
out
let parts = native_list_append(parts, done_label + ":; " + result_var + "; })")
str_join(parts, "")
}
// If-as-expression codegen
// -- If-as-expression codegen -------------------------------------------------
//
// Lower `if cond { thenBody } else { elseBody }` used in expression position
// (e.g. `let x = if a { b } else { c }`) to a GCC/Clang statement-expression
@@ -809,7 +889,8 @@ fn next_if_id() -> String {
// result var stays at its initial 0.
fn cg_if_expr_arm(stmts: [Map<String, Any>], result_var: String) -> String {
let n: Int = native_list_len(stmts)
let out = ""
// Collect statement fragments into a list to avoid O(n^2) string growth.
let parts: [String] = native_list_empty()
let i = 0
while i < n {
let s = native_list_get(stmts, i)
@@ -820,31 +901,31 @@ fn cg_if_expr_arm(stmts: [Map<String, Any>], result_var: String) -> String {
let name: String = s["name"]
let val = s["value"]
let val_c: String = cg_expr(val)
let out = out + "el_val_t " + name + " = " + val_c + "; "
let parts = native_list_append(parts, "el_val_t " + name + " = " + val_c + "; ")
} else {
if str_eq(sk, "Return") {
let val = s["value"]
let val_c: String = cg_expr(val)
let out = out + result_var + " = (" + val_c + "); "
let parts = native_list_append(parts, result_var + " = (" + val_c + "); ")
} else {
if str_eq(sk, "Expr") {
let val = s["value"]
let val_c: String = cg_expr(val)
if is_last {
let out = out + result_var + " = (" + val_c + "); "
let parts = native_list_append(parts, result_var + " = (" + val_c + "); ")
} else {
let out = out + "(void)(" + val_c + "); "
let parts = native_list_append(parts, "(void)(" + val_c + "); ")
}
} else {
if str_eq(sk, "Assign") {
// Real reassignment in an expression-position arm
// Real reassignment in an expression-position arm -
// emit the store; the arm's "value" stays whatever
// result_var was last set to, which is the El
// semantics (assignment is a statement, not a value).
let aname: String = s["name"]
let aval = s["value"]
let aval_c: String = cg_expr(aval)
let out = out + aname + " = " + aval_c + "; "
let parts = native_list_append(parts, aname + " = " + aval_c + "; ")
} else {
// Non-trivial stmt kinds (While/For) shouldn't appear in
// expression-position arm bodies; emit nothing rather
@@ -855,7 +936,7 @@ fn cg_if_expr_arm(stmts: [Map<String, Any>], result_var: String) -> String {
}
let i = i + 1
}
out
str_join(parts, "")
}
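The hunk above swaps repeated `out = out + ...` for list appends followed by a single join. A minimal Python model of that pattern, not the El implementation: `emit_arm` stands in for `cg_if_expr_arm`, the `value_c` field is a hypothetical pre-rendered C expression (the original calls `cg_expr`), and "last statement" is assumed to mean index `n - 1`.

```python
def emit_arm(stmts, result_var):
    """Render an expression-position arm body as one C fragment string."""
    parts = []  # fragments are appended, then joined once at the end
    last = len(stmts) - 1  # assumption: is_last means the final statement
    for i, s in enumerate(stmts):
        kind = s["stmt"]
        val_c = s["value_c"]  # hypothetical: C text already rendered by cg_expr
        if kind == "Let":
            parts.append(f"el_val_t {s['name']} = {val_c}; ")
        elif kind == "Return":
            parts.append(f"{result_var} = ({val_c}); ")
        elif kind == "Expr":
            if i == last:
                parts.append(f"{result_var} = ({val_c}); ")
            else:
                parts.append(f"(void)({val_c}); ")
        elif kind == "Assign":
            parts.append(f"{s['name']} = {val_c}; ")
        # other statement kinds emit nothing, as in the original
    return "".join(parts)


arm = [
    {"stmt": "Let", "name": "t", "value_c": "f()"},
    {"stmt": "Expr", "value_c": "t + 1"},
]
fragment = emit_arm(arm, "_r")
```

Each append is amortized constant time and the join copies every character once, so total work is linear in the output size rather than quadratic.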
fn cg_if_expr(expr: Map<String, Any>) -> String {
@@ -875,7 +956,7 @@ fn cg_if_expr(expr: Map<String, Any>) -> String {
out
}
// Variable scope tracking
// -- Variable scope tracking ---------------------------------------------------
//
// El allows `let x = expr` to both declare and reassign x in the same scope.
// C doesn't allow redeclaring the same name in the same block.
@@ -894,7 +975,7 @@ fn list_contains(lst: [String], s: String) -> Bool {
false
}
// Statement codegen
// -- Statement codegen ---------------------------------------------------------
//
// cg_stmt emits C lines via println. declared is a list of already-declared
// variable names in the current C scope; returns updated declared list.
@@ -943,7 +1024,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
if str_eq(ltype, "Zone") {
add_zone_name(name)
}
// Inference from RHS duration literals and known-typed calls
// Inference from RHS - duration literals and known-typed calls
// propagate even when the let is unannotated.
if is_instant_expr(val) {
add_instant_name(name)
@@ -998,7 +1079,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
}
// Bare reassignment: `name = expr`. Always emits a plain C assignment
// (no `el_val_t` prefix) by construction the parser only produces
// (no `el_val_t` prefix) - by construction the parser only produces
// Assign for an existing identifier. If the name happens NOT to be in
// `declared` for the current C scope (it was let-bound by an enclosing
// block) the emit still resolves at C level because the variable lives
@@ -1033,7 +1114,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
let cond_c: String = cg_expr(cond)
let cond_c = strip_outer_parens(cond_c)
emit_line(indent + "while (" + cond_c + ") {")
// Body lives in its own C block clone so let-bindings inside the
// Body lives in its own C block - clone so let-bindings inside the
// loop don't leak into the parent's `declared` list (which would make
// a sibling scope's `let x` emit assignment on an undeclared name).
cg_stmts(body, indent + " ", native_list_clone(declared))
@@ -1053,6 +1134,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
if kind == "TypeDef" { return declared }
if kind == "EnumDef" { return declared }
if kind == "Import" { return declared }
if kind == "ExternFn" { return declared }
if kind == "CgiBlock" { return declared }
declared
}
@@ -1084,14 +1166,7 @@ fn strip_outer_parens(s: String) -> String {
let i = i + 1
}
if balanced {
let inner = ""
let j = 1
while j < n - 1 {
let ch: String = native_list_get(chars, j)
let inner = inner + ch
let j = j + 1
}
return inner
return str_slice(s, 1, n - 1)
}
}
}
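The hunk above replaces a character-by-character rebuild with one `str_slice(s, 1, n - 1)`. A Python sketch of the whole idea, under the assumption that the elided `balanced` check means "the first `(` is closed only by the final `)`"; the original's exact check is not visible in this diff.

```python
def strip_outer_parens(s: str) -> str:
    """Drop one pair of outer parentheses only if they enclose the whole string."""
    n = len(s)
    if n < 2 or s[0] != "(" or s[-1] != ")":
        return s
    depth = 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # Closing the first '(' before the final character means the
            # outer pair does not wrap the whole expression.
            if depth == 0 and i < n - 1:
                return s
    # One slice instead of rebuilding the inner text one character at a time.
    return s[1 : n - 1] if depth == 0 else s
```

This sketch ignores parentheses inside string literals, which the real code may or may not handle.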
@@ -1106,7 +1181,7 @@ fn cg_if_stmt(expr: Map<String, Any>, indent: String, declared: [String]) -> Voi
let cond_c: String = cg_expr(cond)
let cond_c = strip_outer_parens(cond_c)
emit_line(indent + "if (" + cond_c + ") {")
// Each branch gets its own clone of `declared` variables let-bound
// Each branch gets its own clone of `declared` - variables let-bound
// inside the then/else block live only in that C scope, and must not
// leak back to the parent (or to the sibling branch) through shared
// list mutation. Cheap shallow copy; the entries (variable name strings)
@@ -1158,7 +1233,7 @@ fn cg_stmts(stmts: [Map<String, Any>], indent: String, declared: [String]) -> [S
decl
}
// Function declaration codegen
// -- Function declaration codegen -----------------------------------------------
fn param_decl(param: Map<String, Any>, idx: Int) -> String {
let name: String = param["name"]
@@ -1168,18 +1243,15 @@ fn param_decl(param: Map<String, Any>, idx: Int) -> String {
fn params_to_c(params: [Map<String, Any>]) -> String {
let n: Int = native_list_len(params)
if n == 0 { return "void" }
let out = ""
let parts: [String] = native_list_empty()
let i = 0
while i < n {
let param = native_list_get(params, i)
let decl: String = param_decl(param, i)
if i > 0 {
let out = out + ", "
}
let out = out + decl
let parts = native_list_append(parts, decl)
let i = i + 1
}
out
str_join(parts, ", ")
}
// Transform a function body so that an implicit-return final expression
@@ -1230,7 +1302,7 @@ fn is_int_name(name: String) -> Bool {
// Same shape as is_int_name, for Instant- and Duration-typed bindings.
// Used by the BinOp/comparison codegen to dispatch arithmetic through the
// typed runtime wrappers (el_instant_add_dur, el_duration_lt, ) and to
// typed runtime wrappers (el_instant_add_dur, el_duration_lt, ...) and to
// surface mismatches (Instant + Instant, Duration + Int) as #error
// directives at the top of the generated C.
fn is_instant_name(name: String) -> Bool {
@@ -1292,7 +1364,7 @@ fn is_int_call(call_expr: Map<String, Any>) -> Bool {
}
// Builtins that return an Instant. Used by is_instant_expr and the BinOp
// dispatch `now() + 5.seconds` types as Instant only because we can see
// dispatch - `now() + 5.seconds` types as Instant only because we can see
// that now() is an Instant-returning Call.
fn is_instant_call(call_expr: Map<String, Any>) -> Bool {
let func = call_expr["func"]
@@ -1328,7 +1400,7 @@ fn is_duration_call(call_expr: Map<String, Any>) -> Bool {
return false
}
// Phase 1.5 Calendar / CalendarTime / Rhythm / LocalDate / LocalTime /
// Phase 1.5 - Calendar / CalendarTime / Rhythm / LocalDate / LocalTime /
// LocalDateTime / Zone are first-class boxed types. Each has its own name
// set in process state, populated from typed `let` bindings and parameter
// annotations. The BinOp dispatcher consults these to forbid mismatched
@@ -1516,7 +1588,7 @@ fn is_zone_expr(expr: Map<String, Any>) -> Bool {
// Recursive type predicates for Instant / Duration. Mirror is_int_expr.
// is_instant_expr / is_duration_expr return true only when the expression
// is provably of that type at codegen time. Anything ambiguous returns
// false the BinOp dispatcher then leaves the expression on the
// false - the BinOp dispatcher then leaves the expression on the
// untyped-int path, which is the safest fallback because at the runtime
// level all three types share the int64 slot.
fn is_instant_expr(expr: Map<String, Any>) -> Bool {
@@ -1531,8 +1603,8 @@ fn is_instant_expr(expr: Map<String, Any>) -> Bool {
if str_eq(k, "BinOp") {
let op: String = expr["op"]
if str_eq(op, "Plus") {
// Instant + Duration Instant
// Duration + Instant Instant
// Instant + Duration -> Instant
// Duration + Instant -> Instant
if is_instant_expr(expr["left"]) {
if is_duration_expr(expr["right"]) { return true }
}
@@ -1542,7 +1614,7 @@ fn is_instant_expr(expr: Map<String, Any>) -> Bool {
return false
}
if str_eq(op, "Minus") {
// Instant - Duration Instant
// Instant - Duration -> Instant
if is_instant_expr(expr["left"]) {
if is_duration_expr(expr["right"]) { return true }
}
@@ -1569,15 +1641,15 @@ fn is_duration_expr(expr: Map<String, Any>) -> Bool {
if str_eq(k, "BinOp") {
let op: String = expr["op"]
if str_eq(op, "Plus") {
// Duration + Duration Duration
// Duration + Duration -> Duration
if is_duration_expr(expr["left"]) {
if is_duration_expr(expr["right"]) { return true }
}
return false
}
if str_eq(op, "Minus") {
// Duration - Duration Duration
// Instant - Instant Duration (caught here, not in is_instant_expr)
// Duration - Duration -> Duration
// Instant - Instant -> Duration (caught here, not in is_instant_expr)
if is_duration_expr(expr["left"]) {
if is_duration_expr(expr["right"]) { return true }
}
@@ -1587,8 +1659,8 @@ fn is_duration_expr(expr: Map<String, Any>) -> Bool {
return false
}
if str_eq(op, "Star") {
// Duration * Int Duration
// Int * Duration Duration
// Duration * Int -> Duration
// Int * Duration -> Duration
if is_duration_expr(expr["left"]) {
if is_int_expr(expr["right"]) { return true }
}
@@ -1598,7 +1670,7 @@ fn is_duration_expr(expr: Map<String, Any>) -> Bool {
return false
}
if str_eq(op, "Slash") {
// Duration / Int Duration
// Duration / Int -> Duration
if is_duration_expr(expr["left"]) {
if is_int_expr(expr["right"]) { return true }
}
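The comments in `is_instant_expr` and `is_duration_expr` amount to a small typing table for temporal arithmetic. A Python sketch that collapses the two predicates into one lookup, which is a different structure from the original; the `names` map and the `name` field on `Ident` nodes are assumptions (the source keeps a separate name set per type).

```python
# (operator, left type, right type) -> result type, taken from the comments above.
TEMPORAL_RULES = {
    ("Plus", "Instant", "Duration"): "Instant",
    ("Plus", "Duration", "Instant"): "Instant",
    ("Minus", "Instant", "Duration"): "Instant",
    ("Plus", "Duration", "Duration"): "Duration",
    ("Minus", "Duration", "Duration"): "Duration",
    ("Minus", "Instant", "Instant"): "Duration",
    ("Star", "Duration", "Int"): "Duration",
    ("Star", "Int", "Duration"): "Duration",
    ("Slash", "Duration", "Int"): "Duration",
}


def temporal_type(expr, names):
    """Return "Instant", "Duration", "Int", or None when nothing is provable."""
    kind = expr["expr"]
    if kind == "Int":
        return "Int"
    if kind == "Ident":
        return names.get(expr["name"])  # hypothetical name -> type map
    if kind != "BinOp":
        return None
    left = temporal_type(expr["left"], names)
    right = temporal_type(expr["right"], names)
    return TEMPORAL_RULES.get((expr["op"], left, right))


def binop(op, left, right):
    return {"expr": "BinOp", "op": op, "left": left, "right": right}


names = {"start": "Instant", "d": "Duration"}
start = {"expr": "Ident", "name": "start"}
d = {"expr": "Ident", "name": "d"}
three = {"expr": "Int"}
```

Anything missing from the table comes back as `None`, matching the original's choice to fall back to the untyped path when a type is not provable.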
@@ -1629,13 +1701,13 @@ fn time_record_violation(kind: String, detail: String) -> Bool {
// the outer dispatch only checks the immediate kind, not the inner.
//
// Rules:
// Int literal Int
// Ident in __int_names Int
// Call to known-Int builtin Int
// Neg of Int Int
// BinOp arithmetic of two Ints Int (Plus, Minus, Star, Slash, Percent)
// BinOp comparison/logical Int (yields 0/1; safe to treat as Int)
// anything else not provably Int
// Int literal -> Int
// Ident in __int_names -> Int
// Call to known-Int builtin -> Int
// Neg of Int -> Int
// BinOp arithmetic of two Ints -> Int (Plus, Minus, Star, Slash, Percent)
// BinOp comparison/logical -> Int (yields 0/1; safe to treat as Int)
// anything else -> not provably Int
fn is_int_expr(expr: Map<String, Any>) -> Bool {
let k: String = expr["expr"]
if str_eq(k, "Int") { return true }
@@ -1654,7 +1726,7 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
}
if str_eq(k, "BinOp") {
let op: String = expr["op"]
// Comparisons and logicals always yield 0/1 safe Int.
// Comparisons and logicals always yield 0/1 - safe Int.
if str_eq(op, "EqEq") { return true }
if str_eq(op, "NotEq") { return true }
if str_eq(op, "Lt") { return true }
@@ -1663,7 +1735,7 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
if str_eq(op, "GtEq") { return true }
if str_eq(op, "And") { return true }
if str_eq(op, "Or") { return true }
// Arithmetic propagates: Int op Int Int.
// Arithmetic propagates: Int op Int -> Int.
if str_eq(op, "Plus") {
if is_int_expr(expr["left"]) {
if is_int_expr(expr["right"]) { return true }
@@ -1693,7 +1765,7 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
return false
}
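The rule list above `is_int_expr` reads as a recursive predicate over AST nodes. A Python model follows, with several labeled assumptions: `Gt` and `LtEq` are assumed to be in the comparison set (only part of it is visible in the diff), and the `name` and `operand` field names and the `int_builtins` parameter are hypothetical.

```python
# Gt and LtEq are assumed; the diff shows EqEq, NotEq, Lt, GtEq, And, Or.
COMPARE_OPS = {"EqEq", "NotEq", "Lt", "Gt", "LtEq", "GtEq", "And", "Or"}
ARITH_OPS = {"Plus", "Minus", "Star", "Slash", "Percent"}


def is_int_expr(expr, int_names, int_builtins):
    """True only when expr is provably Int; anything ambiguous is False."""
    kind = expr["expr"]
    if kind == "Int":
        return True
    if kind == "Ident":
        return expr["name"] in int_names
    if kind == "Call":
        func = expr["func"]
        return func.get("expr") == "Ident" and func.get("name") in int_builtins
    if kind == "Neg":
        return is_int_expr(expr["operand"], int_names, int_builtins)
    if kind == "BinOp":
        op = expr["op"]
        if op in COMPARE_OPS:
            return True  # comparisons and logicals yield 0/1
        if op in ARITH_OPS:
            return is_int_expr(expr["left"], int_names, int_builtins) and is_int_expr(
                expr["right"], int_names, int_builtins
            )
    return False


one = {"expr": "Int"}
n = {"expr": "Ident", "name": "n"}
s = {"expr": "Ident", "name": "s"}
```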
// Capability-kind enforcement
// -- Capability-kind enforcement ----------------------------------------------
//
// A program's top-level block (cgi / service / none) determines which
// runtime primitives it may call. The compiler records violations in
@@ -1702,11 +1774,11 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
// downstream cc step fails with a clear message.
//
// Capability tiers:
// "cgi" full self-formation. All primitives.
// "service" bounded. Cannot call self-formation primitives:
// "cgi" - full self-formation. All primitives.
// "service" - bounded. Cannot call self-formation primitives:
// llm_call_agentic, llm_register_tool, dharma_emit,
// dharma_field. Single-turn LLM calls are allowed.
// "utility" default. No DHARMA, no LLM. Pure compute + I/O.
// "utility" - default. No DHARMA, no LLM. Pure compute + I/O.
//
// The compiler-level rule is structural: the binary either CAN or CANNOT
// emit the call. There is no runtime check, no opt-in, no override.
@@ -1721,7 +1793,7 @@ fn cap_record_violation(kind: String, fn_name: String) -> Bool {
return true
}
// Self-formation primitives the cut between CGI and service. A program
// Self-formation primitives - the cut between CGI and service. A program
// that emits these calls IS structurally a CGI; we forbid them everywhere
// else.
fn is_self_formation_call(fn_name: String) -> Bool {
@@ -1732,7 +1804,7 @@ fn is_self_formation_call(fn_name: String) -> Bool {
return false
}
// Any DHARMA primitive utilities have zero network presence.
// Any DHARMA primitive - utilities have zero network presence.
fn is_dharma_call(fn_name: String) -> Bool {
if str_eq(fn_name, "dharma_connect") { return true }
if str_eq(fn_name, "dharma_send") { return true }
@@ -1745,7 +1817,7 @@ fn is_dharma_call(fn_name: String) -> Bool {
return false
}
// Any LLM primitive utilities have no LLM access at all.
// Any LLM primitive - utilities have no LLM access at all.
fn is_llm_call(fn_name: String) -> Bool {
if str_eq(fn_name, "llm_call") { return true }
if str_eq(fn_name, "llm_call_system") { return true }
@@ -1795,14 +1867,14 @@ fn emit_cap_violations() -> Void {
if colon > 0 {
let kind: String = str_slice(entry, 0, colon)
let fn_name: String = str_slice(entry, colon + 1, str_len(entry))
emit_line("#error \"capability violation: '" + kind + "' programs may not call '" + fn_name + "' (self-formation primitive only 'cgi' programs may use it)\"")
emit_line("#error \"capability violation: '" + kind + "' programs may not call '" + fn_name + "' (self-formation primitive - only 'cgi' programs may use it)\"")
}
let i = i + next_comma + 1
}
}
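The capability tiers and the CSV-to-`#error` step can be modeled compactly. This Python sketch is not the El code: the prefix test for DHARMA and LLM primitives is a simplification (the original uses explicit name lists in `is_dharma_call` and `is_llm_call`), and `violates` and `cap_error_lines` are hypothetical names.

```python
SELF_FORMATION = {"llm_call_agentic", "llm_register_tool", "dharma_emit", "dharma_field"}


def violates(kind, fn_name):
    """Decide whether a program of the given kind may emit a call to fn_name."""
    if kind == "cgi":
        return False  # full self-formation: all primitives allowed
    if fn_name in SELF_FORMATION:
        return True  # forbidden outside cgi
    if kind == "utility":
        # Simplification: prefix match instead of the explicit name lists.
        return fn_name.startswith(("dharma_", "llm_"))
    return False  # service: single-turn LLM calls are allowed


def cap_error_lines(csv):
    """Turn "kind:fn,kind:fn," into one #error directive per entry."""
    lines = []
    for entry in csv.split(","):
        kind, sep, fn_name = entry.partition(":")
        if not sep or not kind:
            continue  # mirrors the `colon > 0` guard
        lines.append(
            f"#error \"capability violation: '{kind}' programs may not call "
            f"'{fn_name}' (self-formation primitive - only 'cgi' programs may use it)\""
        )
    return lines


errors = cap_error_lines("service:dharma_emit,")
```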
// Surface temporal-type violations as #error directives. The cg_expr BinOp
// dispatcher records each violation (Instant + Instant, Duration + Int, )
// dispatcher records each violation (Instant + Instant, Duration + Int, ...)
// as a CSV entry "kind:detail" via time_record_violation. Each entry maps
// to a single #error so downstream cc fails the build with a clear El-
// source-level message before the bogus C even links.
@@ -1825,7 +1897,7 @@ fn emit_time_violations() -> Void {
}
}
// Builtin arity table
// -- Builtin arity table -------------------------------------------------------
//
// El programs sometimes call runtime builtins with the wrong number of
// arguments (e.g. `http_serve(port)` instead of `http_serve(port, handler)`).
@@ -1835,7 +1907,7 @@ fn emit_time_violations() -> Void {
//
// Strategy: a small static table mirrors el_runtime.h. Variadic builtins
// (el_list_new, el_map_new, args) and unknown identifiers (user fns,
// dynamic dispatch) return -1 no check. A mismatch records a violation
// dynamic dispatch) return -1 -> no check. A mismatch records a violation
// in process state, which emit_arity_violations() turns into #error
// directives at the top of the generated C.
fn builtin_arity(name: String) -> Int {
@@ -2039,7 +2111,7 @@ fn builtin_arity(name: String) -> Int {
if str_eq(name, "get") { return 2 }
if str_eq(name, "map_get") { return 2 }
if str_eq(name, "map_set") { return 3 }
// -1 sentinel: variadic / unknown / user-defined no check.
// -1 sentinel: variadic / unknown / user-defined -> no check.
return -1
}
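The arity table with its `-1` sentinel can be modeled as a dictionary lookup with a default. In this Python sketch only the `get`, `map_get`, and `map_set` entries come from the visible hunk; the `http_serve` arity is inferred from the comment's `http_serve(port, handler)` example, and the error wording is hypothetical since `emit_arity_violations` is not shown.

```python
BUILTIN_ARITY = {
    "http_serve": 2,  # inferred from the comment, not from the table itself
    "get": 2,
    "map_get": 2,
    "map_set": 3,
}


def builtin_arity(name):
    """Expected argument count, or -1 for variadic, unknown, or user-defined."""
    return BUILTIN_ARITY.get(name, -1)


def arity_violation(name, argc):
    """Return an #error line for a mismatch, or None when no check applies."""
    expected = builtin_arity(name)
    if expected == -1 or expected == argc:
        return None
    # Hypothetical message text; the original wording is not in this diff.
    return f'#error "arity violation: {name} expects {expected} argument(s), got {argc}"'
```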
@@ -2237,7 +2309,7 @@ fn build_int_names_for_params(params: [Map<String, Any>]) -> Bool {
fn cg_fn(stmt: Map<String, Any>) -> Void {
let fn_name: String = stmt["name"]
// Skip El's `fn main()` C provides its own main() for top-level stmts
// Skip El's `fn main()` - C provides its own main() for top-level stmts
// and a duplicate `el_val_t main(void)` would collide with it.
if fn_name == "main" { return }
let params = stmt["params"]
@@ -2269,8 +2341,8 @@ fn cg_fn(stmt: Map<String, Any>) -> Void {
}
// Lift the final bare expression into an explicit return so implicit
// returns ("fn lex(s) { ... tokens }") actually return their value.
// Void-returning functions skip this wrapping `println(x)` in
// `return ` is a C type error.
// Void-returning functions skip this - wrapping `println(x)` in
// `return ...` is a C type error.
let body_xformed = body
if !str_eq(ret_type, "Void") {
let body_xformed = transform_implicit_return(body)
@@ -2281,7 +2353,7 @@ fn cg_fn(stmt: Map<String, Any>) -> Void {
emit_blank()
}
// Top-level codegen
// -- Top-level codegen ---------------------------------------------------------
fn is_fndef(stmt: Map<String, Any>) -> Bool {
let kind: String = stmt["stmt"]
@@ -2295,6 +2367,7 @@ fn is_top_level_decl(stmt: Map<String, Any>) -> Bool {
if kind == "EnumDef" { return true }
if kind == "Import" { return true }
if kind == "CgiBlock" { return true }
if kind == "ExternFn" { return true }
false
}
@@ -2306,7 +2379,7 @@ fn cgi_arg(value: String, has_value: Bool) -> String {
return "EL_NULL"
}
// VBD role enforcement
// -- VBD role enforcement ------------------------------------------------------
//
// Scan a function body for direct calls to DHARMA-restricted builtins
// (dharma_emit, dharma_field). These may only appear inside @manager fns.
@@ -2439,16 +2512,16 @@ fn vbd_has_restricted_call(stmts: [Map<String, Any>]) -> Bool {
false
}
// Entry point
// -- Entry point ----------------------------------------------------------------
fn codegen(stmts: [Map<String, Any>], source: String) -> String {
// Detect cgi/service blocks: at most one declarative top-level block.
// The block determines the program's CAPABILITY KIND:
// "cgi" full self-formation. Calls all primitives.
// "service" bounded. Cannot call self-formation primitives
// "cgi" - full self-formation. Calls all primitives.
// "service" - bounded. Cannot call self-formation primitives
// (llm_call_agentic, llm_register_tool, dharma_emit,
// dharma_field, mindlink-creation).
// "utility" default; no DHARMA membership, no LLM, no agentic.
// "utility" - default; no DHARMA membership, no LLM, no agentic.
// Codegen enforces this with #error directives at every restricted
// call site. The capability boundary is structural: a binary either
// CAN or CANNOT do a thing, and the compiler decides at emission time.
@@ -2483,7 +2556,7 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
}
if cgi_count >= 1 {
if svc_count >= 1 {
emit_line("#error \"El: program declares both cgi and service blocks (mutually exclusive pick one)\"")
emit_line("#error \"El: program declares both cgi and service blocks (mutually exclusive - pick one)\"")
}
}
// Stash the program kind so cg_expr's Call branch can enforce
@@ -2503,9 +2576,44 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
emit_line("#include <stdint.h>")
emit_line("#include <stdlib.h>")
emit_line("#include \"el_runtime.h\"")
// Cross-module forward declarations: for each imported module, emit
// #include "module.elh" so Clang sees the function signatures from
// that module without needing the full source inlined. The .elh files
// are generated by `elc --emit-header` and live in the same dist/
// directory as the generated .c files. We use basename only (strip
// the directory prefix and .el extension) so the include resolves
// correctly regardless of the source tree layout.
let imp_n: Int = native_list_len(stmts)
let imp_i = 0
while imp_i < imp_n {
let imp_stmt = native_list_get(stmts, imp_i)
let imp_kind: String = imp_stmt["stmt"]
if str_eq(imp_kind, "Import") {
let imp_path: String = imp_stmt["path"]
// Extract basename: find last '/' and strip from there.
let imp_path_len: Int = str_len(imp_path)
let imp_last_slash: Int = -1
let imp_j: Int = 0
while imp_j < imp_path_len {
let imp_c: String = str_slice(imp_path, imp_j, imp_j + 1)
if str_eq(imp_c, "/") { let imp_last_slash = imp_j }
let imp_j = imp_j + 1
}
let imp_base: String = str_slice(imp_path, imp_last_slash + 1, imp_path_len)
// Strip .el extension if present.
let imp_base_len: Int = str_len(imp_base)
let imp_bname: String = imp_base
if str_ends_with(imp_base, ".el") {
let imp_bname = str_slice(imp_base, 0, imp_base_len - 3)
}
emit_line("#include \"" + imp_bname + ".elh\"")
}
let imp_i = imp_i + 1
}
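The loop above reduces each import path to a basename and emits one `#include` per import. A short Python equivalent, where `elh_include_line` is a hypothetical name:

```python
def elh_include_line(import_path: str) -> str:
    """Map an El import path to the #include line for its generated header."""
    # rfind returns -1 when there is no '/', so the slice starts at 0.
    base = import_path[import_path.rfind("/") + 1 :]
    if base.endswith(".el"):
        base = base[: -len(".el")]
    return f'#include "{base}.elh"'
```

One consequence of keeping only the basename is that two modules with the same file name in different directories would map to the same header, so this scheme appears to rely on unique module names within a build.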
emit_blank()
// Forward declarations (skip `main` C provides its own)
// Forward declarations (skip `main` - C provides its own)
let n: Int = native_list_len(stmts)
let i = 0
while i < n {
@@ -2519,11 +2627,17 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
emit_line("el_val_t " + fn_name + "(" + params_c + ");")
}
}
if kind == "ExternFn" {
let fn_name: String = stmt["name"]
let params = stmt["params"]
let params_c: String = params_to_c(params)
emit_line("el_val_t " + fn_name + "(" + params_c + ");")
}
let i = i + 1
}
emit_blank()
// Top-level `let` bindings file-scope storage. El programs use
// Top-level `let` bindings -> file-scope storage. El programs use
// top-level `let GREETING = "..."` as module constants that any
// function below should be able to read. Without this pass, a top-
// level Let only declares the name inside main()'s scope and any
@@ -2557,6 +2671,36 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
emit_blank()
}
// Detect whether this compilation unit has an entry point.
// A unit is a library (no C main emitted) when there is no fn main()
// and no top-level executable statements. This supports separate
// compilation: library .c files contain only function definitions.
let has_el_main = false
let has_toplevel_stmts = false
let i = 0
while i < n {
let stmt = native_list_get(stmts, i)
let sk: String = stmt["stmt"]
if str_eq(sk, "FnDef") {
let fn_name_chk: String = stmt["name"]
if str_eq(fn_name_chk, "main") { let has_el_main = true }
}
if !is_fndef(stmt) {
if !is_top_level_decl(stmt) {
if !str_eq(sk, "Let") {
let has_toplevel_stmts = true
}
}
}
let i = i + 1
}
let is_library = false
if !has_el_main {
if !has_toplevel_stmts {
let is_library = true
}
}
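The library check above comes down to two scans over the top-level statements. A Python sketch, assuming `TypeDef` is among the kinds accepted by `is_top_level_decl` (the visible hunk shows only `EnumDef`, `Import`, `CgiBlock`, and `ExternFn`); `is_library_unit` is a hypothetical name.

```python
# Kinds that never count as executable top-level statements.
NON_EXECUTABLE = {"FnDef", "Let", "TypeDef", "EnumDef", "Import", "CgiBlock", "ExternFn"}


def is_library_unit(stmts):
    """A unit is a library when it has no fn main() and no executable top-level code."""
    has_main = any(s["stmt"] == "FnDef" and s.get("name") == "main" for s in stmts)
    has_exec = any(s["stmt"] not in NON_EXECUTABLE for s in stmts)
    return not has_main and not has_exec


helper = {"stmt": "FnDef", "name": "helper"}
const = {"stmt": "Let", "name": "GREETING"}
```

Top-level `let` bindings do not force an entry point here, which matches their treatment as file-scope storage.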
// Function definitions
let i = 0
while i < n {
@@ -2567,6 +2711,9 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
let i = i + 1
}
// Skip C main() for library units (no fn main, no top-level stmts)
if is_library { return "" }
// main(). Use _argc/_argv so El programs are free to declare their own
// local `argv` / `argc` (compiler.el itself does this) without colliding
// with the C-side parameters when fn main()'s body is folded in below.
@@ -2638,7 +2785,7 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
let main_decl = cg_stmt(stmt, " ", main_decl)
}
}
// Release AST node after final use each stmt is fully processed
// Release AST node after final use - each stmt is fully processed
// by this point (forward decls, fn defs, top-level lets, and now
// the main-body pass are all done). Releasing here prevents the
// accumulated AST from exhausting memory on large source files.
@@ -2661,16 +2808,16 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
// Emit any accumulated capability-violation #error directives. cc
// will fail on the first one and surface the message; placement at
// the bottom is fine preprocessor errors halt the build wherever
// the bottom is fine - preprocessor errors halt the build wherever
// they appear.
emit_cap_violations()
// Same for builtin-arity violations: cc halts on the first #error,
// so a misuse of a known builtin (wrong arg count) fails the build
// with a clear message naming the builtin and its expected arity.
emit_arity_violations()
// Temporal-type violations (Instant + Instant, Duration + Int, ).
// Temporal-type violations (Instant + Instant, Duration + Int, ...).
emit_time_violations()
// Return empty string output was streamed via println
// Return empty string - output was streamed via println
""
}
+106 -7
@@ -79,6 +79,74 @@ fn strip_flags(argv: [String]) -> [String] {
return out
}
// Detect --emit-header flag in argv.
fn detect_emit_header(argv: [String]) -> Bool {
let n: Int = native_list_len(argv)
let i = 0
while i < n {
let a: String = native_list_get(argv, i)
if str_eq(a, "--emit-header") { return true }
let i = i + 1
}
return false
}
// Reconstruct an El type annotation string from a parsed type node.
fn type_node_to_el(t: Map<String, Any>) -> String {
let k: String = t["kind"]
if str_eq(k, "Simple") { return t["name"] }
if str_eq(k, "List") {
let inner: String = type_node_to_el(t["inner"])
return "[" + inner + "]"
}
if str_eq(k, "Map") {
let kt: String = type_node_to_el(t["key"])
let vt: String = type_node_to_el(t["val"])
return "Map<" + kt + ", " + vt + ">"
}
"Any"
}
// emit_header: write a .elh file from parsed statements.
// Scans for FnDef nodes and emits 'extern fn' declarations.
fn emit_header(stmts: [Map<String, Any>], hdr_path: String) -> Void {
let n: Int = native_list_len(stmts)
let i = 0
let parts: [String] = native_list_empty()
let parts = native_list_append(parts, "// auto-generated by elc --emit-header — do not edit\n")
while i < n {
let stmt = native_list_get(stmts, i)
let kind: String = stmt["stmt"]
if str_eq(kind, "FnDef") {
let name: String = stmt["name"]
if !str_eq(name, "main") {
let params = stmt["params"]
let ret_type: String = stmt["ret_type"]
// build param list
let np: Int = native_list_len(params)
let pi = 0
let param_parts: [String] = native_list_empty()
while pi < np {
let param = native_list_get(params, pi)
let pname: String = param["name"]
let ptype: String = param["type"]
if str_eq(ptype, "") { let ptype = "Any" }
let param_parts = native_list_append(param_parts, pname + ": " + ptype)
let pi = pi + 1
}
let params_str: String = str_join(param_parts, ", ")
let ret_str: String = ret_type
if str_eq(ret_str, "") { let ret_str = "Any" }
let sig: String = "extern fn " + name + "(" + params_str + ") -> " + ret_str
let parts = native_list_append(parts, sig + "\n")
}
}
let i = i + 1
}
let content: String = str_join(parts, "")
let ok: Bool = fs_write(hdr_path, content)
}
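For reference, the header text that `emit_header` builds can be modeled in a few lines of Python. This is a sketch and not the El code: `extern_decls` is a hypothetical name, the banner comment is omitted, and the text is returned instead of being written with `fs_write`.

```python
def extern_decls(stmts):
    """Build one `extern fn` line per non-main FnDef, defaulting missing types to Any."""
    out = []
    for s in stmts:
        if s["stmt"] != "FnDef" or s["name"] == "main":
            continue
        params = ", ".join(
            f"{p['name']}: {p.get('type') or 'Any'}" for p in s["params"]
        )
        ret = s.get("ret_type") or "Any"
        out.append(f"extern fn {s['name']}({params}) -> {ret}\n")
    return "".join(out)


stmts = [
    {
        "stmt": "FnDef",
        "name": "add",
        "params": [{"name": "a", "type": "Int"}, {"name": "b", "type": ""}],
        "ret_type": "Int",
    },
    {"stmt": "FnDef", "name": "main", "params": [], "ret_type": "Void"},
    {"stmt": "Let", "name": "X"},
]
header = extern_decls(stmts)
```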
// Import resolution
//
// elc supports two forms of import:
@@ -135,6 +203,9 @@ fn parse_import_line(trimmed: String, dir: String) -> String {
// source text with every imported module's body inlined ahead of the entry
// source, deduplicated by absolute path. Uses state_set to track which paths
// have already been pulled in for this run.
//
// Accumulates chunks into lists and joins once at the end to avoid the O(n²)
// memory growth caused by repeated `prefix = prefix + chunk` concatenation.
fn resolve_imports(src_path: String) -> String {
let seen_key: String = "__elc_imp__:" + src_path
let already: String = state_get(seen_key)
@@ -146,22 +217,36 @@ fn resolve_imports(src_path: String) -> String {
let lines: [String] = str_split(source, "\n")
let n: Int = native_list_len(lines)
// First pass: pull in every import body ahead of this file's body.
let prefix: String = ""
let body: String = ""
// Collect chunks into lists: O(1) amortized per append.
// Join once at the end: O(n) single pass.
let prefix_chunks: [String] = native_list_empty()
let body_chunks: [String] = native_list_empty()
let i: Int = 0
while i < n {
let line: String = native_list_get(lines, i)
let trimmed: String = str_trim(line)
let imp_path: String = parse_import_line(trimmed, dir)
if !str_eq(imp_path, "") {
let prefix = prefix + resolve_imports(imp_path)
// Use pre-compiled header if available (separate compilation).
// Only check .elh for imported files, never for the entry file itself.
let imp_elh_path: String = str_slice(imp_path, 0, str_len(imp_path) - 3) + ".elh"
let imp_elh: String = fs_read(imp_elh_path)
if !str_eq(imp_elh, "") {
// Header exists: mark the .el as seen (so it won't be re-inlined
// if something else also imports it) and use the header text.
let seen_imp_key: String = "__elc_imp__:" + imp_path
state_set(seen_imp_key, "1")
let prefix_chunks = native_list_append(prefix_chunks, imp_elh)
} else {
let imp_body: String = resolve_imports(imp_path)
let prefix_chunks = native_list_append(prefix_chunks, imp_body)
}
} else {
let body = body + line + "\n"
let body_chunks = native_list_append(body_chunks, line + "\n")
}
let i = i + 1
}
return prefix + body
return str_join(prefix_chunks, "") + str_join(body_chunks, "")
}
// main: CLI entry point.
@@ -176,13 +261,27 @@ fn main() -> Void {
// (Section 1.5 of the language spec). detect_target itself is fine
// because the function-name position has no token-class restriction.
let tgt: String = detect_target(argv)
let do_emit_header: Bool = detect_emit_header(argv)
let positional: [String] = strip_flags(argv)
let argc: Int = native_list_len(positional)
if argc < 1 {
println("el-compiler: usage: elc [--target=c|js] <source.el> [<output>]")
println("el-compiler: usage: elc [--target=c|js] [--emit-header] <source.el> [<output>]")
exit(1)
}
let src_path: String = native_list_get(positional, 0)
// When --emit-header is requested, parse the source file directly
// (without inlining imports) and write out a .elh file alongside the .c.
if do_emit_header {
let raw_source: String = fs_read(src_path)
let hdr_tokens: [Map<String, Any>] = lex(raw_source)
let hdr_stmts: [Map<String, Any>] = parse(hdr_tokens)
el_release(hdr_tokens)
let hdr_path: String = str_slice(src_path, 0, str_len(src_path) - 3) + ".elh"
emit_header(hdr_stmts, hdr_path)
el_release(hdr_stmts)
}
let source: String = resolve_imports(src_path)
let out: String = compile_dispatch(tgt, source)
if argc >= 2 {
+27 -26
@@ -146,6 +146,7 @@ fn keyword_kind(word: String) -> String {
if word == "engine" { return "Engine" }
if word == "accessor" { return "Accessor" }
if word == "vessel" { return "Vessel" }
if word == "extern" { return "Extern" }
""
}
@@ -156,7 +157,7 @@ fn keyword_kind(word: String) -> String {
// Returns { "text": ..., "pos": i }
fn scan_digits(chars: [String], start: Int, total: Int) -> Map<String, Any> {
let i = start
let text = ""
let parts: [String] = native_list_empty()
let running = true
while running {
if i >= total {
@@ -164,20 +165,20 @@ fn scan_digits(chars: [String], start: Int, total: Int) -> Map<String, Any> {
} else {
let ch: String = native_list_get(chars, i)
if lex_is_digit(ch) {
let text = text + ch
let parts = native_list_append(parts, ch)
let i = i + 1
} else {
let running = false
}
}
}
{ "text": text, "pos": i }
{ "text": str_join(parts, ""), "pos": i }
}
// scan_ident: advance i while chars[i] is alphanumeric or underscore
fn scan_ident(chars: [String], start: Int, total: Int) -> Map<String, Any> {
let i = start
let text = ""
let parts: [String] = native_list_empty()
let running = true
while running {
if i >= total {
@@ -185,14 +186,14 @@ fn scan_ident(chars: [String], start: Int, total: Int) -> Map<String, Any> {
} else {
let ch: String = native_list_get(chars, i)
if is_alnum_or_underscore(ch) {
let text = text + ch
let parts = native_list_append(parts, ch)
let i = i + 1
} else {
let running = false
}
}
}
{ "text": text, "pos": i }
{ "text": str_join(parts, ""), "pos": i }
}
// Code-bearing string detection + comment strip
@@ -253,7 +254,7 @@ fn looks_like_code(s: String) -> Bool {
fn strip_code_comments(s: String) -> String {
let chars: [String] = native_string_chars(s)
let total: Int = native_list_len(chars)
let out = ""
let out_parts: [String] = native_list_empty()
let i = 0
let in_squote = false
let in_dquote = false
@@ -269,11 +270,11 @@ fn strip_code_comments(s: String) -> String {
if in_js_string {
// Backslash escape: consume next char verbatim regardless of which.
if ch == "\\" {
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let next_i = i + 1
if next_i < total {
let nc: String = native_list_get(chars, next_i)
let out = out + nc
let out_parts = native_list_append(out_parts, nc)
let prev = nc
let i = next_i + 1
} else {
@@ -292,7 +293,7 @@ fn strip_code_comments(s: String) -> String {
}
}
}
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
}
@@ -308,7 +309,7 @@ fn strip_code_comments(s: String) -> String {
if next_ch == "/" {
// URL guard: prev char ':' means this is "://", not a comment.
if prev == ":" {
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
} else {
@@ -360,7 +361,7 @@ fn strip_code_comments(s: String) -> String {
}
let prev = ""
} else {
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
}
@@ -369,23 +370,23 @@ fn strip_code_comments(s: String) -> String {
// Open a JS string?
if ch == "'" {
let in_squote = true
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
} else {
if ch == "\"" {
let in_dquote = true
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
} else {
if ch == "`" {
let in_btick = true
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
} else {
let out = out + ch
let out_parts = native_list_append(out_parts, ch)
let prev = ch
let i = i + 1
}
@@ -394,14 +395,14 @@ fn strip_code_comments(s: String) -> String {
}
}
}
out
str_join(out_parts, "")
}
// scan_string: scan a quoted string literal, handling \" escapes.
// Starts AFTER the opening quote. Returns { "text": content, "pos": i_after_close }
fn scan_string(chars: [String], start: Int, total: Int) -> Map<String, Any> {
let i = start
let text = ""
let parts: [String] = native_list_empty()
let running = true
while running {
if i >= total {
@@ -414,26 +415,26 @@ fn scan_string(chars: [String], start: Int, total: Int) -> Map<String, Any> {
if next_i < total {
let next_ch: String = native_list_get(chars, next_i)
if next_ch == "\"" {
- let text = text + "\""
+ let parts = native_list_append(parts, "\"")
let i = next_i + 1
} else {
if next_ch == "n" {
- let text = text + "\n"
+ let parts = native_list_append(parts, "\n")
let i = next_i + 1
} else {
if next_ch == "t" {
- let text = text + "\t"
+ let parts = native_list_append(parts, "\t")
let i = next_i + 1
} else {
if next_ch == "r" {
- let text = text + "\r"
+ let parts = native_list_append(parts, "\r")
let i = next_i + 1
} else {
if next_ch == "\\" {
- let text = text + "\\"
+ let parts = native_list_append(parts, "\\")
let i = next_i + 1
} else {
- let text = text + next_ch
+ let parts = native_list_append(parts, next_ch)
let i = next_i + 1
}
}
@@ -448,13 +449,13 @@ fn scan_string(chars: [String], start: Int, total: Int) -> Map<String, Any> {
let i = i + 1
let running = false
} else {
- let text = text + ch
+ let parts = native_list_append(parts, ch)
let i = i + 1
}
}
}
}
- { "text": text, "pos": i }
+ { "text": str_join(parts, ""), "pos": i }
}
// Main lexer
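The hunks above swap repeated `out + ch` concatenation for `native_list_append` followed by one `str_join`. The same idea can be sketched in C, assuming a hypothetical growable `parts_t` list (this is not the El runtime's list type): appends are amortized constant time and the join makes two linear passes, so the total work stays linear in the input size instead of copying the whole accumulated string on every step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of string parts (not the El runtime's ElList). */
typedef struct {
    const char **items;
    size_t len;
    size_t cap;
} parts_t;

/* Append one part; amortized O(1) because the capacity doubles. */
void parts_append(parts_t *p, const char *s) {
    if (p->len == p->cap) {
        size_t new_cap = p->cap ? p->cap * 2 : 16;
        const char **grown = realloc(p->items, new_cap * sizeof *grown);
        assert(grown != NULL);
        p->items = grown;
        p->cap = new_cap;
    }
    p->items[p->len++] = s;
}

/* Join all parts with sep in two linear passes: measure, then copy.
 * The caller frees the returned buffer. */
char *parts_join(const parts_t *p, const char *sep) {
    size_t sep_len = strlen(sep);
    size_t total = 1; /* room for the NUL terminator */
    for (size_t i = 0; i < p->len; i++) {
        total += strlen(p->items[i]);
        if (i + 1 < p->len) total += sep_len;
    }
    char *out = malloc(total);
    assert(out != NULL);
    char *w = out;
    for (size_t i = 0; i < p->len; i++) {
        size_t n = strlen(p->items[i]);
        memcpy(w, p->items[i], n);
        w += n;
        if (i + 1 < p->len) {
            memcpy(w, sep, sep_len);
            w += sep_len;
        }
    }
    *w = '\0';
    return out;
}
```

A caller would start from `parts_t p = {0};`, append each scanned character or escape, and call `parts_join(&p, "")` once at the end, mirroring `str_join(out_parts, "")` in the diff.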
+23
@@ -687,6 +687,29 @@ fn parse_stmt(tokens: [Map<String, Any>], pos: Int) -> Map<String, Any> {
return make_result({ "stmt": "Return", "value": val }, p)
}
+ // extern fn declaration (no body; forward declaration for separate compilation)
+ if k == "Extern" {
+ let p = pos + 1
+ let k2: String = tok_kind(tokens, p)
+ if str_eq(k2, "Fn") {
+ let p = p + 1
+ let name: String = tok_value(tokens, p)
+ let p = p + 1
+ let r = parse_params(tokens, p)
+ let params = r["params"]
+ let p = r["pos"]
+ let ret_type = ""
+ let k3: String = tok_kind(tokens, p)
+ if str_eq(k3, "Arrow") {
+ let p = p + 1
+ let kt: String = tok_kind(tokens, p)
+ if str_eq(kt, "Ident") { let ret_type = tok_value(tokens, p) }
+ let p = skip_type(tokens, p)
+ }
+ return make_result({ "stmt": "ExternFn", "name": name, "params": params, "ret_type": ret_type }, p)
+ }
+ }
// fn definition
if k == "Fn" {
let p = pos + 1
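The `ExternFn` node carries a name, a parameter list, and a return type, which is enough to emit a C forward declaration. The following is only a sketch of that lowering, not the codegen in this repository: `emit_prototype` and `append_str` are hypothetical helpers, parameters are reduced to a count, and an empty `ret_type` is assumed to mean `void`. It relies on the mapping described in el_runtime.h, where every El type becomes `el_val_t` except `Void`.

```c
#include <assert.h>
#include <string.h>

/* Append s at buf[*used]; returns 0 on success, -1 if it would not fit. */
int append_str(char *buf, size_t cap, size_t *used, const char *s) {
    size_t n = strlen(s);
    if (*used + n + 1 > cap) return -1;
    memcpy(buf + *used, s, n + 1); /* copies the NUL terminator too */
    *used += n;
    return 0;
}

/* Write "<ret> <name>(<params>);" into buf.
 * Assumption: ret_type "" (no `->` clause) is treated like "Void". */
int emit_prototype(char *buf, size_t cap, const char *name,
                   size_t param_count, const char *ret_type) {
    assert(buf != NULL && name != NULL && ret_type != NULL);
    if (cap == 0) return -1;
    buf[0] = '\0';
    size_t used = 0;
    int is_void = ret_type[0] == '\0' || strcmp(ret_type, "Void") == 0;
    if (append_str(buf, cap, &used, is_void ? "void " : "el_val_t ") != 0) return -1;
    if (append_str(buf, cap, &used, name) != 0) return -1;
    if (append_str(buf, cap, &used, "(") != 0) return -1;
    /* An empty parameter list is spelled (void) in a C prototype. */
    if (param_count == 0 && append_str(buf, cap, &used, "void") != 0) return -1;
    for (size_t i = 0; i < param_count; i++) {
        if (i > 0 && append_str(buf, cap, &used, ", ") != 0) return -1;
        if (append_str(buf, cap, &used, "el_val_t") != 0) return -1;
    }
    return append_str(buf, cap, &used, ");");
}
```

Under these assumptions, `extern fn add(a: Int, b: Int) -> Int` would produce `el_val_t add(el_val_t, el_val_t);`, which is the kind of signature-only line a `.elh` header is described as holding.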
+379
@@ -0,0 +1,379 @@
// elb.el - El Build Coordinator
//
// The build system for El programs. Written in El. Builds El.
//
// Usage:
// elb # build from manifest.el in current dir
// elb --clean # remove generated artifacts and rebuild
// elb --dry-run # print actions without executing
// elb --verbose # also report modules skipped as up to date
// elb --jobs=N # parallel compile jobs (default: 4); not yet read by main()
// elb --out=DIR # output directory (default: dist)
// elb --elc=PATH # elc binary to invoke (default: elc)
// elb --runtime=PATH # path to el_runtime.c
//
// How it works (the .NET model):
// 1. Read manifest.el to find the entry file
// 2. Walk the import graph depth-first, build topological order
// 3. For each file: if .el is newer than .elh/.c, compile with elc --emit-header
// 4. Link all .c files + el_runtime.c into the final binary
//
// Each module compiles independently - no 128K-line blobs.
// Downstream compilations read .elh headers (function signatures only),
// not source. Incremental: only recompile what changed.
// -- Flags ---------------------------------------------------------------------
fn flag_bool(argv: [String], name: String) -> Bool {
let n: Int = native_list_len(argv)
let i = 0
while i < n {
let a: String = native_list_get(argv, i)
if str_eq(a, name) { return true }
let i = i + 1
}
return false
}
fn flag_val(argv: [String], name: String, default_val: String) -> String {
let n: Int = native_list_len(argv)
let prefix: String = name + "="
let i = 0
while i < n {
let a: String = native_list_get(argv, i)
if str_starts_with(a, prefix) {
return str_slice(a, str_len(prefix), str_len(a))
}
let i = i + 1
}
return default_val
}
// -- Manifest parsing ----------------------------------------------------------
//
// Read the entry file from manifest.el:
// build { entry "soul.el" }
fn parse_manifest_entry(src: String) -> String {
let lines: [String] = str_split(src, "\n")
let n: Int = native_list_len(lines)
let i = 0
while i < n {
let line: String = native_list_get(lines, i)
let t: String = str_trim(line)
if str_starts_with(t, "entry ") {
// entry "soul.el"
let after: String = str_slice(t, 6, str_len(t))
let trimmed: String = str_trim(after)
// strip surrounding quotes
if str_starts_with(trimmed, "\"") {
let inner: String = str_slice(trimmed, 1, str_len(trimmed))
let q: Int = str_index_of(inner, "\"")
if q >= 0 {
return str_slice(inner, 0, q)
}
}
}
let i = i + 1
}
return ""
}
fn parse_manifest_name(src: String) -> String {
let lines: [String] = str_split(src, "\n")
let n: Int = native_list_len(lines)
let i = 0
while i < n {
let line: String = native_list_get(lines, i)
let t: String = str_trim(line)
if str_starts_with(t, "package ") {
let after: String = str_slice(t, 8, str_len(t))
let trimmed: String = str_trim(after)
if str_starts_with(trimmed, "\"") {
let inner: String = str_slice(trimmed, 1, str_len(trimmed))
let q: Int = str_index_of(inner, "\"")
if q >= 0 {
return str_slice(inner, 0, q)
}
}
}
let i = i + 1
}
return "out"
}
// -- Path helpers ---------------------------------------------------------------
fn dirname_of(path: String) -> String {
let n: Int = str_len(path)
let i: Int = n - 1
while i >= 0 {
let c: String = str_slice(path, i, i + 1)
if str_eq(c, "/") {
return str_slice(path, 0, i)
}
let i = i - 1
}
return "."
}
fn basename_noext(path: String) -> String {
// strip directory
let n: Int = str_len(path)
let last_slash: Int = -1
let i = 0
while i < n {
let c: String = str_slice(path, i, i + 1)
if str_eq(c, "/") { let last_slash = i }
let i = i + 1
}
let base: String = str_slice(path, last_slash + 1, n)
// strip .el extension
let bn: Int = str_len(base)
if str_ends_with(base, ".el") {
return str_slice(base, 0, bn - 3)
}
return base
}
fn path_with_ext(path: String, ext: String) -> String {
let n: Int = str_len(path)
if str_ends_with(path, ".el") {
return str_slice(path, 0, n - 3) + ext
}
return path + ext
}
fn file_is_newer(a: String, b: String) -> Bool {
// Returns true if file a is newer than file b, or if b doesn't exist.
// Uses exec_capture with the shell's `test -nt` to compare modification times.
let cmd: String = "test -f " + b + " && test " + a + " -nt " + b + " && echo yes || echo no"
let result: String = str_trim(exec_capture(cmd))
if str_eq(result, "yes") { return true }
// b doesn't exist - check with test -f
let exist_cmd: String = "test -f " + b + " && echo exists || echo missing"
let exist: String = str_trim(exec_capture(exist_cmd))
if str_eq(exist, "missing") { return true }
return false
}
// -- Import graph walker --------------------------------------------------------
//
// Walk import statements in each .el file to build the dependency graph.
// Returns a list of paths in topological order (deps before dependents).
// Paths are built from the importing file's directory, so they are absolute
// only if the entry path is.
fn parse_import_path(line: String, dir: String) -> String {
let t: String = str_trim(line)
if str_starts_with(t, "import \"") {
let after: String = str_slice(t, 8, str_len(t))
let q: Int = str_index_of(after, "\"")
if q > 0 {
let mod: String = str_slice(after, 0, q)
return dir + "/" + mod
}
}
if str_starts_with(t, "from ") {
let after: String = str_slice(t, 5, str_len(t))
let sp: Int = str_index_of(after, " ")
if sp > 0 {
let mod: String = str_trim(str_slice(after, 0, sp))
if !str_eq(mod, "") {
return dir + "/" + mod + ".el"
}
}
}
return ""
}
fn walk_imports(src_path: String, visited: [String], order: [String]) -> Map<String, Any> {
// Dedup check
let n: Int = native_list_len(visited)
let i = 0
while i < n {
let v: String = native_list_get(visited, i)
if str_eq(v, src_path) {
return { "visited": visited, "order": order }
}
let i = i + 1
}
let visited = native_list_append(visited, src_path)
let source: String = fs_read(src_path)
if str_eq(source, "") {
return { "visited": visited, "order": order }
}
let dir: String = dirname_of(src_path)
let lines: [String] = str_split(source, "\n")
let ln: Int = native_list_len(lines)
let j = 0
while j < ln {
let line: String = native_list_get(lines, j)
let imp: String = parse_import_path(line, dir)
if !str_eq(imp, "") {
let r = walk_imports(imp, visited, order)
let visited = r["visited"]
let order = r["order"]
}
let j = j + 1
}
// Add self after all deps
let order = native_list_append(order, src_path)
return { "visited": visited, "order": order }
}
// -- Build ----------------------------------------------------------------------
fn compile_module(src_path: String, out_dir: String, elc_bin: String, dry_run: Bool, verbose: Bool) -> Bool {
let bname: String = basename_noext(src_path)
let c_out: String = out_dir + "/" + bname + ".c"
let elh_out: String = out_dir + "/" + bname + ".elh"
// Check if recompile needed
if !file_is_newer(src_path, c_out) {
if verbose {
println(" skip " + bname + ".el (up to date)")
}
return true
}
// elc streams C to stdout (collect mode not yet implemented); use
// shell redirection so the output lands in the file, not the terminal.
let cmd: String = elc_bin + " --emit-header " + src_path + " > " + c_out + " 2>&1"
println(" compile " + src_path)
if dry_run { return true }
let ret: Int = exec_command(cmd)
if ret != 0 {
println("elb: compile failed: " + src_path)
return false
}
// Copy the generated .elh (written next to the source by elc) into
// out_dir so that #include "module.elh" lines in the generated .c
// files resolve correctly when cc is invoked with -I <out_dir>.
// cp rather than mv keeps the header next to the source for downstream modules.
let src_elh: String = path_with_ext(src_path, ".elh")
let cp_cmd: String = "cp " + src_elh + " " + elh_out + " 2>/dev/null || true"
exec_command(cp_cmd)
return true
}
fn link_binary(c_files: [String], out_bin: String, runtime_path: String, out_dir: String, dry_run: Bool) -> Bool {
let n: Int = native_list_len(c_files)
let parts: [String] = native_list_empty()
// Include both the runtime dir (for el_runtime.h) and the output dir
// (for module.elh cross-module forward declarations).
let parts = native_list_append(parts, "cc -O2 -I " + dirname_of(runtime_path) + " -I " + out_dir)
let i = 0
while i < n {
let f: String = native_list_get(c_files, i)
let parts = native_list_append(parts, f)
let i = i + 1
}
let parts = native_list_append(parts, runtime_path)
let parts = native_list_append(parts, "-lcurl -lpthread")
let parts = native_list_append(parts, "-o " + out_bin)
let cmd: String = str_join(parts, " ")
println(" link " + out_bin)
if dry_run { return true }
let ret: Int = exec_command(cmd)
if ret != 0 {
println("elb: link failed")
return false
}
return true
}
// -- Main -----------------------------------------------------------------------
fn main() -> Void {
let argv: [String] = args()
let clean: Bool = flag_bool(argv, "--clean")
let dry_run: Bool = flag_bool(argv, "--dry-run")
let verbose: Bool = flag_bool(argv, "--verbose")
let out_dir: String = flag_val(argv, "--out", "dist")
let elc_bin: String = flag_val(argv, "--elc", "elc")
let runtime: String = flag_val(argv, "--runtime", "")
// Find manifest
let manifest_src: String = fs_read("manifest.el")
if str_eq(manifest_src, "") {
println("elb: no manifest.el found in current directory")
exit(1)
}
let pkg_name: String = parse_manifest_name(manifest_src)
let entry: String = parse_manifest_entry(manifest_src)
if str_eq(entry, "") {
println("elb: manifest.el has no 'entry' declaration")
exit(1)
}
println("elb: building " + pkg_name + " (entry: " + entry + ")")
// Locate runtime
let runtime_path: String = runtime
if str_eq(runtime_path, "") {
// Try to find el_runtime.c relative to elc binary
let which_out: String = str_trim(exec_capture("which " + elc_bin + " 2>/dev/null"))
if !str_eq(which_out, "") {
let elc_dir: String = dirname_of(which_out)
runtime_path = elc_dir + "/../el-compiler/runtime/el_runtime.c"
}
}
if str_eq(runtime_path, "") {
println("elb: cannot locate el_runtime.c - use --runtime=PATH")
exit(1)
}
// Ensure output directory
let mkdir_ret: Int = exec_command("mkdir -p " + out_dir)
// Clean if requested
if clean {
println("elb: cleaning " + out_dir)
if !dry_run {
let rm_ret: Int = exec_command("rm -f " + out_dir + "/*.c " + out_dir + "/*.elh")
}
}
// Walk import graph from entry file
let empty_visited: [String] = native_list_empty()
let empty_order: [String] = native_list_empty()
let r = walk_imports(entry, empty_visited, empty_order)
let order: [String] = r["order"]
let total: Int = native_list_len(order)
println("elb: " + native_int_to_str(total) + " modules in build graph")
// Compile each module
let c_files: [String] = native_list_empty()
let i = 0
let ok = true
while i < total {
let src: String = native_list_get(order, i)
let bname: String = basename_noext(src)
let c_out: String = out_dir + "/" + bname + ".c"
let compiled: Bool = compile_module(src, out_dir, elc_bin, dry_run, verbose)
if !compiled {
let ok = false
let i = total
} else {
let c_files = native_list_append(c_files, c_out)
}
let i = i + 1
}
if !ok {
println("elb: build failed")
exit(1)
}
// Link
let out_bin: String = out_dir + "/" + pkg_name
let linked: Bool = link_binary(c_files, out_bin, runtime_path, out_dir, dry_run)
if !linked {
println("elb: link failed")
exit(1)
}
println("elb: done -> " + out_bin)
}
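`walk_imports` above is a depth-first walk that appends each module only after its imports, which yields a topological order. A compact C sketch of the same post-order idea follows, assuming a hypothetical in-memory graph of module indices rather than files read from disk; `graph_t`, `walk_t`, and `MAX_MODULES` are illustrative names, not part of elb.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 16

/* Hypothetical import graph: deps[m] lists the modules that m imports. */
typedef struct {
    size_t dep_count[MAX_MODULES];
    size_t deps[MAX_MODULES][MAX_MODULES];
} graph_t;

/* Walk state: visited flags plus the build order produced so far. */
typedef struct {
    int visited[MAX_MODULES];
    size_t order[MAX_MODULES];
    size_t len;
} walk_t;

/* Depth-first, post-order: a module is appended only after all of its
 * imports, so dependencies always precede their dependents. */
void walk(const graph_t *g, size_t m, walk_t *w) {
    assert(m < MAX_MODULES);
    if (w->visited[m]) return; /* dedup check; also stops import cycles */
    w->visited[m] = 1;
    for (size_t i = 0; i < g->dep_count[m]; i++) {
        walk(g, g->deps[m][i], w);
    }
    w->order[w->len++] = m; /* add self after all deps */
}
```

Marking a module as visited before descending, as `walk_imports` also does, means a cyclic import terminates instead of recursing forever, although the resulting order for a cycle is then only a best effort.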
+1270 -20
File diff suppressed because it is too large
+80
@@ -0,0 +1,80 @@
#!/usr/bin/env bash
# install.sh — Install the El SDK from the latest Gitea release.
#
# Usage:
# bash install.sh
# EL_VERSION=v1.0.0 bash install.sh # pin a specific release tag
# EL_PREFIX=/opt/el bash install.sh # custom install prefix
#
# Environment variables:
# EL_VERSION Release tag to download (default: latest)
# EL_PREFIX Install prefix (default: /usr/local)
set -euo pipefail
REPO_BASE="https://git.neuralplatform.ai/neuron-technologies/el"
VERSION="${EL_VERSION:-latest}"
PREFIX="${EL_PREFIX:-/usr/local}"
BIN_DIR="${PREFIX}/bin"
LIB_DIR="${PREFIX}/lib/el"
RELEASE_BASE="${REPO_BASE}/releases/download/${VERSION}"
echo "==> Installing El SDK ${VERSION}"
echo " prefix : ${PREFIX}"
echo " bin : ${BIN_DIR}"
echo " lib : ${LIB_DIR}"
echo
# Create directories
mkdir -p "${BIN_DIR}" "${LIB_DIR}"
# Download helper
download() {
local url="$1"
local dest="$2"
echo " Downloading $(basename "${dest}")..."
if command -v curl >/dev/null 2>&1; then
curl -fsSL "${url}" -o "${dest}"
elif command -v wget >/dev/null 2>&1; then
wget -q "${url}" -O "${dest}"
else
echo "Error: neither curl nor wget found" >&2
exit 1
fi
}
# Download assets
TMP_DIR="$(mktemp -d)"
trap 'rm -rf "${TMP_DIR}"' EXIT
download "${RELEASE_BASE}/elc" "${TMP_DIR}/elc"
download "${RELEASE_BASE}/el_runtime.c" "${TMP_DIR}/el_runtime.c"
download "${RELEASE_BASE}/el_runtime.h" "${TMP_DIR}/el_runtime.h"
# Install
install -m 755 "${TMP_DIR}/elc" "${BIN_DIR}/elc"
install -m 644 "${TMP_DIR}/el_runtime.c" "${LIB_DIR}/el_runtime.c"
install -m 644 "${TMP_DIR}/el_runtime.h" "${LIB_DIR}/el_runtime.h"
echo
echo "==> El SDK installed successfully"
echo
echo " elc binary : ${BIN_DIR}/elc"
echo " runtime : ${LIB_DIR}/el_runtime.c"
echo " header : ${LIB_DIR}/el_runtime.h"
echo
echo "Add the following to your Makefile to build El programs:"
echo
echo " EL_LIB := ${LIB_DIR}"
echo " ELC := elc"
echo " CC := cc"
echo " CFLAGS := -std=c11 -O2 -I\$(EL_LIB)"
echo
echo " dist/myapp.c: src/myapp.el"
echo " \t\$(ELC) src/myapp.el > dist/myapp.c"
echo
echo " dist/myapp: dist/myapp.c"
echo " \t\$(CC) \$(CFLAGS) -o dist/myapp dist/myapp.c \$(EL_LIB)/el_runtime.c -lcurl -lpthread"
echo
+28
@@ -0,0 +1,28 @@
# El Compiler Release v1.0.0 — 2026-05-02
## Components
- `bootstrap.py` — El language compiler (Python, recursive descent parser, emits C)
- `el_runtime.c` — El runtime (C, HTTP server, engram, DHARMA, LLM chain)
- `el_runtime.h` — Runtime public API header
## Changes in this release
### Critical bug fixes
- `state_set`/`state_get` are now thread-safe (pthread_mutex). Was racing across 64 worker threads.
- `looks_like_string` threshold raised from 1,000,000 to 4GB. Unix timestamps were being dereferenced as heap pointers.
- `fs_read` guards against negative `ftell` result (pipe/special file overflow).
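The `state_set`/`state_get` fix above amounts to serializing access to a shared table with a mutex. The runtime's real data structure is not shown in this diff, so the following is only a minimal sketch under assumptions: `kv_set`, `kv_get`, and the fixed-size table are hypothetical stand-ins, and the single global lock is the simplest design that removes the race.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define KV_SLOTS 64
#define KV_KEY_MAX 64
#define KV_VAL_MAX 256

/* Hypothetical fixed-size table; the real runtime layout is not shown here. */
static struct {
    char key[KV_KEY_MAX];
    char val[KV_VAL_MAX];
    int used;
} kv[KV_SLOTS];
static pthread_mutex_t kv_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert or overwrite. Returns 1 on success, 0 if the table is full or
 * the key or value is too long. */
int kv_set(const char *key, const char *val) {
    assert(key != NULL && val != NULL);
    if (strlen(key) >= KV_KEY_MAX || strlen(val) >= KV_VAL_MAX) return 0;
    int ok = 0;
    int slot = -1;
    pthread_mutex_lock(&kv_lock);
    for (int i = 0; i < KV_SLOTS; i++) {
        if (kv[i].used && strcmp(kv[i].key, key) == 0) { slot = i; break; }
        if (!kv[i].used && slot < 0) slot = i; /* remember first free slot */
    }
    if (slot >= 0) {
        strcpy(kv[slot].key, key);
        strcpy(kv[slot].val, val);
        kv[slot].used = 1;
        ok = 1;
    }
    pthread_mutex_unlock(&kv_lock);
    return ok;
}

/* Copy the value out while holding the lock, so the caller never reads a
 * slot that another thread is rewriting. Returns 1 if found and copied. */
int kv_get(const char *key, char *out, size_t out_cap) {
    assert(key != NULL && out != NULL);
    int found = 0;
    pthread_mutex_lock(&kv_lock);
    for (int i = 0; i < KV_SLOTS; i++) {
        if (kv[i].used && strcmp(kv[i].key, key) == 0) {
            if (strlen(kv[i].val) < out_cap) {
                strcpy(out, kv[i].val);
                found = 1;
            }
            break;
        }
    }
    pthread_mutex_unlock(&kv_lock);
    return found;
}
```

Copying the value out under the lock, rather than returning a pointer into the table, is what keeps a reader safe when one of the worker threads overwrites the same key a moment later.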
### Engram architecture (major)
- Two-layer activation: `background_activation` (Layer 1, broad fan-out) + `working_memory_weight` (Layer 2, executive filter)
- Inhibitory edges: `EngramEdge.inhibitory` flag suppresses working memory promotion without affecting background activation
- Suppression memory: `suppression_count` — nodes activated-but-suppressed accumulate pressure toward breakthrough
- Temporal decay: `temporal_decay_rate`, `created_at`, `last_activated_at`, `activation_count` on EngramNode
- Per-type activation thresholds (Safety: 0.05, Canonical: 0.15, Lesson: 0.25, Note: 0.40)
- Temporal range query: `engram_query_range(start_ms, end_ms)`
- Layered consciousness: `EngramLayer` struct, `layer_id` on nodes and edges, `EngramStore.layers[]`
- Layer 0 override pass: safety layer fires last and cannot be suppressed
## SHA256
bootstrap.py
el_runtime.c
el_runtime.h
File diff suppressed because it is too large
+507
@@ -0,0 +1,507 @@
/*
* el_runtime.h - El language C runtime header
*
* Declares all built-in functions available to compiled El programs.
* Include this in every generated .c file.
*
* Value model:
* All El values are represented as el_val_t (= int64_t).
* On 64-bit systems a pointer fits in int64_t.
* String values are cast: (el_val_t)(uintptr_t)"hello"
* Integer values are stored directly.
* This lets arithmetic work naturally while still passing strings around.
*
* Type conventions (El -> C):
* String -> el_val_t (holds const char* via uintptr_t cast)
* Int -> el_val_t
* Bool -> el_val_t (0 = false, nonzero = true)
* Any -> el_val_t
* Void -> void
*
* Macros for convenience:
* EL_STR(s) cast string literal to el_val_t
* EL_CSTR(v) cast el_val_t back to const char*
* EL_INT(v) identity; el_val_t is already int64_t
*
* Link requirements:
* -lcurl required for the HTTP client (http_get, http_post, llm_*).
* -lpthread required for the HTTP server (one detached thread per
* connection, capped at 64 concurrent).
* -loqs optional; required only when liboqs is installed and the
* pq_* / sha3_256_hex entry points are needed. Detected at
* compile time via __has_include(<oqs/oqs.h>).
* -lcrypto optional; pulled in alongside -loqs. Used for X25519 in
* pq_hybrid_* and HKDF-SHA256 derivation.
*
* Canonical compile command (libraries listed after the sources so that
* linkers resolving symbols left to right can find them):
* cc -std=c11 -I el-compiler/runtime \
* -o <out> <prog>.c el-compiler/runtime/el_runtime.c -lcurl -lpthread
*
* With liboqs (post-quantum stack):
* cc -std=c11 -I el-compiler/runtime \
* -o <out> <prog>.c el-compiler/runtime/el_runtime.c -lcurl -lpthread -loqs -lcrypto
*/
#pragma once
#include <stdint.h>
#include <stdlib.h>
typedef int64_t el_val_t;
#define EL_STR(s) ((el_val_t)(uintptr_t)(s))
#define EL_CSTR(v) ((const char*)(uintptr_t)(v))
#define EL_INT(v) (v)
#define EL_NULL ((el_val_t)0)
/* Float values share the el_val_t (int64) slot via a bit-cast.
* The codegen emits Float literals as `el_from_float(<dbl>)` so the
* underlying bits represent the IEEE 754 double. Float-aware builtins
* (math, format, json) round-trip via these helpers. */
static inline double el_to_float(el_val_t v) {
union { int64_t i; double f; } u;
u.i = (int64_t)v;
return u.f;
}
static inline el_val_t el_from_float(double f) {
union { double f; int64_t i; } u;
u.f = f;
return (el_val_t)u.i;
}
#ifdef __cplusplus
extern "C" {
#endif
/* ── I/O ──────────────────────────────────────────────────────────────────── */
void println(el_val_t s);
void print(el_val_t s);
el_val_t readline(void);
/* ── String builtins ─────────────────────────────────────────────────────── */
el_val_t el_str_concat(el_val_t a, el_val_t b);
el_val_t str_eq(el_val_t a, el_val_t b);
el_val_t str_starts_with(el_val_t s, el_val_t prefix);
el_val_t str_ends_with(el_val_t s, el_val_t suffix);
el_val_t str_len(el_val_t s);
el_val_t str_concat(el_val_t a, el_val_t b);
el_val_t int_to_str(el_val_t n);
el_val_t str_to_int(el_val_t s);
el_val_t str_slice(el_val_t s, el_val_t start, el_val_t end);
el_val_t str_contains(el_val_t s, el_val_t sub);
el_val_t str_replace(el_val_t s, el_val_t from, el_val_t to);
el_val_t str_to_upper(el_val_t s);
el_val_t str_to_lower(el_val_t s);
el_val_t str_trim(el_val_t s);
/* ── Math ────────────────────────────────────────────────────────────────── */
el_val_t el_abs(el_val_t n);
el_val_t el_max(el_val_t a, el_val_t b);
el_val_t el_min(el_val_t a, el_val_t b);
/* ── Refcount (ARC) ──────────────────────────────────────────────────────────
* Lists and Maps carry a refcount. Strings and ints do not; el_retain and
* el_release are safe no-ops on non-refcounted values (they sniff a magic
* header at offset 0 and only act if the magic matches).
*
* Codegen emits these at let-binding shadowing, function entry (params), and
* function exit (locals other than the returned value). The refcount lets
* el_list_append and el_map_set mutate in place when uniquely owned (cheap)
* and copy-on-write when shared (preserves persistent semantics across
* accumulator patterns in the compiler itself). */
void el_retain(el_val_t v);
void el_release(el_val_t v);
/* ── List ────────────────────────────────────────────────────────────────── */
el_val_t el_list_new(el_val_t count, ...);
el_val_t el_list_len(el_val_t list);
el_val_t el_list_get(el_val_t list, el_val_t index);
el_val_t el_list_append(el_val_t list, el_val_t elem);
el_val_t el_list_empty(void);
el_val_t el_list_clone(el_val_t list);
/* ── Map ─────────────────────────────────────────────────────────────────── */
el_val_t el_map_new(el_val_t pair_count, ...);
el_val_t el_get_field(el_val_t map, el_val_t key);
el_val_t el_map_get(el_val_t map, el_val_t key);
el_val_t el_map_set(el_val_t map, el_val_t key, el_val_t value);
/* ── HTTP ─────────────────────────────────────────────────────────────────── */
el_val_t http_get(el_val_t url);
el_val_t http_post(el_val_t url, el_val_t body);
el_val_t http_post_json(el_val_t url, el_val_t json_body);
el_val_t http_get_with_headers(el_val_t url, el_val_t headers_map);
el_val_t http_post_with_headers(el_val_t url, el_val_t body, el_val_t headers_map);
el_val_t http_post_form_auth(el_val_t url, el_val_t form_body, el_val_t auth_header);
el_val_t http_delete(el_val_t url);
void http_serve(el_val_t port, el_val_t handler);
void http_set_handler(el_val_t name);
/* HTTP server v2 ─────────────────────────────────────────────────────────────
* Same dispatch model as http_serve, but the handler signature is widened:
*
* el_val_t handler(method, path, headers_map, body)
*
* `headers_map` is an ElMap from lowercased header name -> header value (both
* Strings). Repeated headers are joined with ", " per RFC 7230.
*
* Response value: the handler may return either
* (a) a plain body string, with the same auto-content-type / 200-OK behaviour as
* http_serve (3-arg) or
* (b) a response envelope built with `http_response(status, headers_json,
* body)`. The runtime detects the envelope discriminator
* `"el_http_response":1` at the start of the returned string and
* unpacks status / headers / body before sending.
*
* The 3-arg http_serve(port, handler) remains supported unchanged for
* existing handlers (e.g. products/web/server.el): it dispatches with
* (method, path, body), hardcodes 200 OK, and auto-detects content type. */
void http_serve_v2(el_val_t port, el_val_t handler);
void http_set_handler_v2(el_val_t name);
/* Build an HTTP response envelope. `headers_json` should be a JSON object
* literal like `{"WWW-Authenticate":"Basic"}` (or "" / "{}" for none). The
* returned string carries the discriminator `{"el_http_response":1,...}`
* which the runtime's send-path detects and unpacks. Detection happens
* uniformly inside http_send_response, so a 3-arg handler may also return
* an envelope. The 3-arg variant remains documented as a fixed 200-OK
* auto-content-type contract for legacy handlers that return plain bodies. */
el_val_t http_response(el_val_t status, el_val_t headers_json, el_val_t body);
/* HTTP timeout — every libcurl request honors EL_HTTP_TIMEOUT_MS (default
* 60000ms). Read lazily on first use, so setting the env var any time before
* the first http_* call is sufficient. */
/* Streaming variants — write the response body straight to a file via
* libcurl's CURLOPT_WRITEFUNCTION = fwrite. These bypass the el_val_t string
* wrapper entirely, so binary payloads (audio/mpeg, image/png, etc.) survive
* embedded NUL bytes that would truncate a strlen()-based code path.
*
* Both honor EL_HTTP_TIMEOUT_MS, follow redirects, and accept the same
* `headers_map` shape as http_post_with_headers (ElMap of String -> String).
*
* Return value: 1 on success (file fully written), 0 on any failure
* (network, file open, partial write). On failure the output file is removed
* so callers cannot mistake a partially-written file for a valid one. */
el_val_t http_post_to_file(el_val_t url, el_val_t body, el_val_t headers_map, el_val_t output_path);
el_val_t http_get_to_file(el_val_t url, el_val_t headers_map, el_val_t output_path);
/* ── URL encoding ────────────────────────────────────────────────────────── */
el_val_t url_encode(el_val_t s); /* RFC 3986 unreserved set */
el_val_t url_decode(el_val_t s); /* '+' → space, %XX → byte */
/* ── Filesystem ──────────────────────────────────────────────────────────── */
el_val_t fs_read(el_val_t path);
el_val_t fs_write(el_val_t path, el_val_t content);
el_val_t fs_list(el_val_t path);
el_val_t fs_exists(el_val_t path);
el_val_t fs_mkdir(el_val_t path); /* mkdir -p, mode 0755 */
/* Length-explicit binary write. `length` is an Int (el_val_t holding the
* byte count). The caller knows the length from context, typically because
* `bytes` came from base64_decode (which produces a magic-tagged binary
* buffer with embedded NULs possible) and the caller already tracks the
* decoded length, OR because the bytes came from a fixed-size source
* (sha256_bytes = 32, hmac_sha256_bytes = 32). Bypasses strlen entirely.
*
* Returns 1 on success, 0 on failure (invalid path, can't open, partial
* write, negative length). On partial-write failure, the file is removed
* so callers cannot read back a truncated artefact. */
el_val_t fs_write_bytes(el_val_t path, el_val_t bytes, el_val_t length);
/* ── JSON ────────────────────────────────────────────────────────────────── */
el_val_t json_get(el_val_t json, el_val_t key);
el_val_t json_parse(el_val_t s);
el_val_t json_stringify(el_val_t v);
el_val_t json_get_string(el_val_t json_str, el_val_t key);
el_val_t json_get_int(el_val_t json_str, el_val_t key);
el_val_t json_get_float(el_val_t json_str, el_val_t key);
el_val_t json_get_bool(el_val_t json_str, el_val_t key);
el_val_t json_get_raw(el_val_t json_str, el_val_t key);
el_val_t json_set(el_val_t json_str, el_val_t key, el_val_t value);
el_val_t json_array_len(el_val_t json_str);
el_val_t json_array_get(el_val_t json_str, el_val_t index);
el_val_t json_array_get_string(el_val_t json_str, el_val_t index);
/* ── Time ────────────────────────────────────────────────────────────────── */
el_val_t time_now(void);
el_val_t time_now_utc(void);
el_val_t sleep_secs(el_val_t secs);
el_val_t sleep_ms(el_val_t ms);
el_val_t time_format(el_val_t ts, el_val_t fmt);
el_val_t time_to_parts(el_val_t ts);
el_val_t time_from_parts(el_val_t secs, el_val_t ns, el_val_t tz);
el_val_t time_add(el_val_t ts, el_val_t n, el_val_t unit);
el_val_t time_diff(el_val_t ts1, el_val_t ts2, el_val_t unit);
/* ── UUID ────────────────────────────────────────────────────────────────── */
el_val_t uuid_new(void);
el_val_t uuid_v4(void);
/* ── Environment ─────────────────────────────────────────────────────────── */
el_val_t env(el_val_t key);
/* ── In-process state K/V ────────────────────────────────────────────────── */
el_val_t state_set(el_val_t key, el_val_t value);
el_val_t state_get(el_val_t key);
el_val_t state_del(el_val_t key);
el_val_t state_keys(void);
/* ── Float formatting ────────────────────────────────────────────────────── */
el_val_t float_to_str(el_val_t f);
el_val_t int_to_float(el_val_t n);
el_val_t float_to_int(el_val_t f);
el_val_t format_float(el_val_t f, el_val_t decimals);
el_val_t decimal_round(el_val_t f, el_val_t decimals);
el_val_t str_to_float(el_val_t s);
/* ── Math (Float-aware) ──────────────────────────────────────────────────── */
el_val_t math_sqrt(el_val_t f);
el_val_t math_log(el_val_t f);
el_val_t math_ln(el_val_t f);
el_val_t math_sin(el_val_t f);
el_val_t math_cos(el_val_t f);
el_val_t math_pi(void);
/* ── String additions ────────────────────────────────────────────────────── */
el_val_t str_index_of(el_val_t s, el_val_t sub);
el_val_t str_split(el_val_t s, el_val_t sep);
el_val_t str_char_at(el_val_t s, el_val_t i);
el_val_t str_char_code(el_val_t s, el_val_t i);
el_val_t str_pad_left(el_val_t s, el_val_t width, el_val_t pad);
el_val_t str_pad_right(el_val_t s, el_val_t width, el_val_t pad);
el_val_t str_format(el_val_t template, el_val_t data);
el_val_t str_lower(el_val_t s);
el_val_t str_upper(el_val_t s);
/* ── List additions ──────────────────────────────────────────────────────── */
el_val_t list_push(el_val_t list, el_val_t elem);
el_val_t list_push_front(el_val_t list, el_val_t elem);
el_val_t list_join(el_val_t list, el_val_t sep);
el_val_t list_range(el_val_t start, el_val_t end);
/* ── Bool helpers ────────────────────────────────────────────────────────── */
el_val_t bool_to_str(el_val_t b);
/* ── Numeric parsing ─────────────────────────────────────────────────────── */
el_val_t parse_int(el_val_t s, el_val_t default_val);
/* ── Process ─────────────────────────────────────────────────────────────── */
void exit_program(el_val_t code);
el_val_t getpid_now(void);
/* ── CGI identity ─────────────────────────────────────────────────────────────
* Called at the start of main() in CGI programs (those with a `cgi {}` block).
* Records the program's DHARMA identity before any other code executes. */
void el_cgi_init(el_val_t name, el_val_t dharma_id, el_val_t principal,
el_val_t network, el_val_t engram);
/* ── DHARMA network builtins ─────────────────────────────────────────────────
* Available to CGI programs (declared with a `cgi {}` block).
*
* Peers are addressed by `dharma_id` of the form
* "<registry-id>@<transport-url>" e.g. "ntn-genesis@http://localhost:7770"
* If the @<url> portion is omitted, transport defaults to
* "http://localhost:7770" (the local CGI daemon assumption).
*
* Wire protocol (all peers expose):
* POST <url>/dharma/recv { channel, from, content } -> response body
* POST <url>/dharma/event { type, payload, source, timestamp }
* POST <url>/api/activate { query } -> list of nodes
*
* Hosting application's responsibility: an El program with a `cgi {}` block
* runs http_serve() with its own request handler; that handler should route
* "/dharma/event" requests by calling el_runtime_dharma_event_arrive() so
* incoming events feed dharma_field() queues. The runtime itself does not
* intercept any /dharma path. */
el_val_t dharma_connect(el_val_t cgi_id);
el_val_t dharma_send(el_val_t channel, el_val_t content);
el_val_t dharma_activate(el_val_t query);
void dharma_emit(el_val_t event_type, el_val_t payload);
el_val_t dharma_field(el_val_t event_type);
void dharma_strengthen(el_val_t cgi_id, el_val_t weight);
el_val_t dharma_relationship(el_val_t cgi_id);
el_val_t dharma_peers(void);
/* Public C API: called by an El program's HTTP handler when a /dharma/event
* request arrives. Pushes onto the per-event-type queue and signals any
* pending dharma_field() blockers. All three arguments must be NUL-terminated
* C strings (or NULL, which is treated as empty). */
void el_runtime_dharma_event_arrive(const char* event_type,
const char* payload,
const char* source);
/* ── Engram local graph primitives ───────────────────────────────────────────
* Operate on the CGI's local Engram knowledge graph.
* `engram_activate` queries the local graph only; `dharma_activate` is
* network-wide across all connected CGI graphs. */
el_val_t engram_node(el_val_t content, el_val_t node_type, el_val_t salience);
el_val_t engram_node_full(el_val_t content, el_val_t node_type, el_val_t label,
el_val_t salience, el_val_t importance, el_val_t confidence,
el_val_t tier, el_val_t tags);
el_val_t engram_get_node(el_val_t id);
void engram_strengthen(el_val_t node_id);
void engram_forget(el_val_t node_id);
el_val_t engram_node_count(void);
el_val_t engram_search(el_val_t query, el_val_t limit);
el_val_t engram_scan_nodes(el_val_t limit, el_val_t offset);
void engram_connect(el_val_t from_id, el_val_t to_id, el_val_t weight, el_val_t relation);
el_val_t engram_edge_between(el_val_t from_id, el_val_t to_id);
el_val_t engram_neighbors(el_val_t node_id);
el_val_t engram_neighbors_filtered(el_val_t node_id, el_val_t max_depth, el_val_t direction);
el_val_t engram_edge_count(void);
el_val_t engram_activate(el_val_t query, el_val_t depth);
el_val_t engram_save(el_val_t path);
el_val_t engram_load(el_val_t path);
/* JSON-string accessors — return pre-serialized JSON so HTTP handlers
* can pass results straight through without round-tripping ElList/ElMap
* through json_stringify. */
el_val_t engram_get_node_json(el_val_t id);
el_val_t engram_search_json(el_val_t query, el_val_t limit);
el_val_t engram_scan_nodes_json(el_val_t limit, el_val_t offset);
el_val_t engram_neighbors_json(el_val_t node_id, el_val_t max_depth, el_val_t direction);
el_val_t engram_activate_json(el_val_t query, el_val_t depth);
el_val_t engram_stats_json(void);
/* ── LLM (Anthropic API client) ─────────────────────────────────────────────
* All functions call https://api.anthropic.com/v1/messages with the API key
* from env ANTHROPIC_API_KEY. Default model when empty: claude-sonnet-4-5. */
el_val_t llm_call(el_val_t model, el_val_t prompt);
el_val_t llm_call_system(el_val_t model, el_val_t system_prompt, el_val_t user_prompt);
el_val_t llm_call_agentic(el_val_t model, el_val_t system, el_val_t user, el_val_t tools);
el_val_t llm_vision(el_val_t model, el_val_t system, el_val_t prompt, el_val_t image_url_or_b64);
el_val_t llm_models(void);
/* Register a tool handler by name. The handler is looked up via dlsym
* (mirroring http_set_handler), so any El `fn <name>(input)` compiles to
* a global C symbol that this function can locate at runtime.
* Handler signature: `el_val_t handler(el_val_t input_json)` receives
* the tool input as a JSON-string el_val_t and returns a JSON-string
* el_val_t result. Used by llm_call_agentic. */
void llm_register_tool(el_val_t name, el_val_t handler_fn_name);
/* ── args() ─────────────────────────────────────────────────────────────────
* Provides access to command-line arguments passed to the program.
* Populated by el_runtime_init_args() before main() runs. */
el_val_t args(void);
void el_runtime_init_args(int argc, char** argv);
/* ── Crypto primitives ─────────────────────────────────────────────────────
* SHA-256, HMAC-SHA-256, and base64 (standard + URL-safe).
* Self-contained: no OpenSSL/libcrypto dependency. The implementations are
* adapted from public-domain reference code (Brad Conte / RFC 4648).
*
* Bytes-returning variants (sha256_bytes, hmac_sha256_bytes) return a string
* value whose contents are raw binary; callers usually feed these into
* base64_encode. Note that el_val_t strings are NUL-terminated by convention,
* so the binary payload may contain embedded NULs; pass it directly into
* base64_encode (which uses an explicit length) rather than treating it as
* a printable C string.
*
* The "base64" variants emit/accept RFC 4648 standard alphabet with padding.
* The "base64url" variants use URL-safe alphabet (`-`/`_`) with no padding,
* as used in JWTs. */
el_val_t sha256_hex(el_val_t input);
el_val_t sha256_bytes(el_val_t input);
el_val_t hmac_sha256_hex(el_val_t key, el_val_t message);
el_val_t hmac_sha256_bytes(el_val_t key, el_val_t message);
el_val_t base64_encode(el_val_t input);
el_val_t base64_decode(el_val_t input);
el_val_t base64url_encode(el_val_t input);
el_val_t base64url_decode(el_val_t input);
/* Length-aware variants (internal — exposed for the rare caller that already
* has a known-length binary buffer and doesn't want to round-trip through
* a NUL-terminated el_val_t string). sha256_bytes and hmac_sha256_bytes feed
* these implicitly. */
el_val_t el_sha256_bytes_n(const unsigned char* data, size_t len);
el_val_t el_base64_encode_n(const unsigned char* data, size_t len, int url_safe);
/* ── Post-quantum primitives (liboqs-backed) ────────────────────────────────
* All inputs/outputs hex-encoded. Algorithm choices:
* Signature: CRYSTALS-Dilithium-3 (NIST level 3, balanced)
* KEM: CRYSTALS-Kyber-768 (NIST level 3)
* Hash: SHA3-256 (Keccak) (PQ-aware protocols favour SHA3 over SHA2)
*
* If liboqs is not linked (detected via __has_include(<oqs/oqs.h>) at compile
* time), the pq_* entry points return a JSON-shaped error string so callers
* fail loudly rather than silently fall back to classical schemes:
* {"error":"liboqs not linked, post-quantum primitives unavailable"}
*
* The hybrid handshake pairs X25519 with Kyber-768 per NIST PQ guidance and
* CNSA 2.0. Combined shared secret is HKDF-SHA256(x25519_ss || kyber_ss).
* Even if Kyber falls, X25519 holds; if X25519 falls under quantum attack,
* Kyber holds. SHA3-256 also remains usable independent of liboqs (the
* Keccak permutation is PQ-OK as a primitive). */
el_val_t pq_keygen_signature(void);
el_val_t pq_sign(el_val_t secret_key_hex, el_val_t message);
el_val_t pq_verify(el_val_t public_key_hex, el_val_t message, el_val_t signature_hex);
el_val_t pq_kem_keygen(void);
el_val_t pq_kem_encaps(el_val_t public_key_hex);
el_val_t pq_kem_decaps(el_val_t secret_key_hex, el_val_t ciphertext_hex);
el_val_t pq_hybrid_keygen(void);
el_val_t pq_hybrid_handshake(el_val_t remote_pub_combined);
el_val_t sha3_256_hex(el_val_t input);
/* ── Native VM builtin aliases (for compiled El source) ─────────────────────
* These match the El VM's native_* builtins so that El source compiled
* to C can call the same names without modification. */
el_val_t native_list_get(el_val_t list, el_val_t index);
el_val_t native_list_len(el_val_t list);
el_val_t native_list_append(el_val_t list, el_val_t elem);
el_val_t native_list_empty(void);
el_val_t native_list_clone(el_val_t list);
el_val_t native_string_chars(el_val_t s);
el_val_t native_int_to_str(el_val_t n);
/* ── Method-call shorthand aliases ──────────────────────────────────────────
* The El method-call convention `obj.method(args)` compiles to
* `method(obj, args)`. These aliases expose the runtime functions under
* the short names that result from method calls in El source.
*
* Example: `myList.append(x)` → `append(myList, x)` (calls this alias)
* `myList.len()` → `len(myList)` (calls this alias) */
el_val_t append(el_val_t list, el_val_t elem); /* el_list_append */
el_val_t len(el_val_t list); /* el_list_len */
el_val_t get(el_val_t list, el_val_t index); /* el_list_get */
el_val_t map_get(el_val_t map, el_val_t key); /* el_map_get */
el_val_t map_set(el_val_t map, el_val_t key, el_val_t value); /* el_map_set */
#ifdef __cplusplus
}
#endif
@@ -0,0 +1,236 @@
# El JavaScript Backend (codegen-js)
**Status:** scaffolded. Hello-world compiles and runs. ~50% language coverage. Core runtime (~30 builtins) implemented. CGI / DHARMA / LLM / Engram intentionally stubbed.
**Authoritative files**
| File | Role |
|---|---|
| `el-compiler/src/codegen-js.el` | El → JS code generator (mirrors `codegen.el`) |
| `el-compiler/runtime/el_runtime.js` | Browser/Node runtime that compiled programs link against |
| `el-compiler/src/compiler.el` | Adds `compile_js()` and `--target=js` CLI dispatch |
| `spec/codegen-js.md` | This document |
---
## 1. Why a JS backend exists
El compiles to C today. C is the right substrate for the agent runtime, the DHARMA daemon, and Engram. But three first-class consumers of El need to **run in a browser**, where C is not an option:
1. **`el-ui/runtime/`** — the activation-based frontend framework written in JS. The long-term plan is to author components and the runtime itself in El and compile them down to JS.
2. **`cgi-studio`** — the web app for cultivating CGIs. Today it is hand-written JS. Once the JS backend is mature, the studio's UI logic can be authored in El and share types/identifier names with the CGI it cultivates.
3. **Marketplace plugin UIs** — third parties writing browser-side El that runs untrusted in a sandbox. They need a JS target.
A secondary motivation: **El-on-Node**. CLI tooling, build scripts, and tests benefit from a tight `el → js → node` cycle without a `cc` step.
---
## 2. Type representation strategy
The C backend pretends every value is `int64_t`. That is a deliberate runtime trick to avoid dynamic dispatch in generated C. JavaScript already has tagged dynamic values, so the JS backend is **simpler**: every El value is a native JS value, and the tag of `el_val_t` collapses into the JS type system.
| El type | C representation | JS representation |
|---|---|---|
| `Int` | `int64_t` (direct) | `number` (with `Number.isSafeInteger` caveat — see §6) |
| `Float` | `int64_t` bit-cast of `double` via `el_from_float` | `number` (no bit-cast — JS number IS a double) |
| `Bool` | `int64_t`, 0 = false, nonzero = true | `boolean` |
| `String` | `(int64_t)(uintptr_t)cstring` | `string` |
| `Void` | C `void` | `undefined` |
| `[T]` (List) | `el_val_t` pointer to refcounted struct | `Array<any>` |
| `Map<K,V>` | `el_val_t` pointer to refcounted struct | plain object `{[key]: any}` |
| `EL_NULL` (`0`) | `(el_val_t)0` | `null` |
| Any | `el_val_t` | `any` (no compile-time check) |
**Key consequences:**
- `+` on two strings is JS `+` (string concat) — no `el_str_concat()` runtime call needed for the common case. The runtime DOES export `el_str_concat` for the cases where codegen does not know the types.
- `==` on strings is `===` — not `str_eq()`. Same disambiguation logic as the C backend (look at left/right kind, fall back to `str_eq` for identifiers without int annotation).
- `Map` access `m["foo"]` compiles to JS `m["foo"]` (no `el_get_field`). For `Field` access (`m.foo`) we emit `m["foo"]` so it works on plain objects regardless of prototype shape.
- List access `arr[i]` is JS `arr[i]`. There is no bounds checking, same as the C backend, but the failure mode differs: C segfaults on a bad index, while JS yields `undefined`. Codegen could later route indexing through `el_list_get` for safe access.
- `EL_NULL` becomes JS `null`, not `undefined`. The runtime checks for `=== null` consistently. This avoids the JS undefined/null fork and matches El's single null value.
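
The consequences listed above can be seen in a few lines of plain JS. This is a hand-written illustration, not actual codegen-js output, and every variable name in it is hypothetical.

```javascript
// Hand-written illustration of the value mapping; not actual codegen-js output.
// All variable names here are hypothetical.

// El String concatenation and equality map onto native JS operators.
const greeting = "hello, " + "el";
const sameText = greeting === "hello, el";

// Field access `m.foo` and index access `m["foo"]` both lower to bracket access.
const m = { foo: 1 };
const viaField = m["foo"];
const viaIndex = m["foo"];

// List access is a plain array index with no bounds check:
// an out-of-range read yields `undefined` instead of trapping.
const arr = [10, 20, 30];
const inRange = arr[1];
const outOfRange = arr[99];

// EL_NULL is represented as `null`, so null checks use `=== null`.
const missing = null;
const isNull = missing === null;

console.log(greeting, sameText, viaField, viaIndex, inRange, outOfRange, isNull);
```

One thing the sketch makes visible: an out-of-range list read produces `undefined`, which does not satisfy a `=== null` check, so a safe-access wrapper would probably want to normalize that case to `null`.
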
---
## 3. Builtin runtime layer (`el_runtime.js`)
Same function names as `el_runtime.c` wherever possible, so codegen-js can emit the same call sites. The runtime is a single ES module that exposes every builtin as a named export AND attaches them to a `globalThis.__el` namespace (so generated code can do either `import * as el from './el_runtime.js'` or assume globals).
**The codegen-js generated output uses the global-namespace style:** every emitted file starts with `import './el_runtime.js'` (which side-effects the globals) so call sites stay flat — `println(x)` not `el.println(x)`. This matches the C backend's flat call surface and keeps the generated code grep-compatible across targets.
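
A minimal sketch of the "namespace plus flat globals" pattern described above. It is written as a plain script so it runs on its own; the real `el_runtime.js` is an ES module with named exports, and the two builtin bodies here are simplified assumptions.

```javascript
// Sketch of exposing builtins both under a namespace and as flat globals.
// The builtin bodies are simplified assumptions, not the real runtime code.
const builtins = {
  str_len(s) {
    return s.length;
  },
  el_list_len(list) {
    return list.length;
  },
};

// Namespace access: globalThis.__el.str_len(...)
globalThis.__el = builtins;

// Flat access: str_len(...) with no prefix, as generated call sites expect.
Object.assign(globalThis, builtins);

console.log(str_len("hello"));
console.log(globalThis.__el.el_list_len([1, 2, 3]));
```

Attaching to `globalThis` is what lets a bare `import './el_runtime.js'` be enough for generated code; the cost is that builtin names share one global scope with everything else on the page.
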
### Implemented today (~30 builtins)
| Category | Functions |
|---|---|
| I/O | `println`, `print` |
| String | `el_str_concat`, `str_concat`, `str_eq`, `str_starts_with`, `str_ends_with`, `str_len`, `int_to_str`, `str_to_int`, `str_slice`, `str_contains`, `str_replace`, `str_to_upper`, `str_to_lower`, `str_trim`, `str_index_of`, `str_split`, `str_char_at`, `str_char_code`, `str_lower`, `str_upper` |
| Math | `el_abs`, `el_max`, `el_min` |
| List | `el_list_new`, `el_list_len`, `el_list_get`, `el_list_append`, `el_list_empty`, `el_list_clone`, `list_push`, `list_join`, `list_range` |
| Map | `el_map_new`, `el_get_field`, `el_map_get`, `el_map_set` |
| HTTP | `http_get`, `http_post`, `http_post_json` (via `fetch()`, returns `Promise<string>` — see §5 async caveat) |
| FS | `fs_read`, `fs_write`, `fs_list` (Node-only, throw in browser) |
| JSON | `json_parse`, `json_stringify`, `json_get`, `json_get_string`, `json_get_int` |
| Time | `time_now`, `time_now_utc`, `sleep_secs` (Node), `sleep_ms` |
| Bool | `bool_to_str` |
| Process | `exit_program` (Node `process.exit`, throw in browser) |
| Refcount | `el_retain`, `el_release` (no-ops — JS has GC) |
| ARC method-call shortforms | `append`, `len`, `get`, `map_get`, `map_set` |
| Native VM aliases | `native_list_get`, `native_list_len`, `native_list_append`, `native_list_empty`, `native_list_clone`, `native_string_chars`, `native_int_to_str` |
| `args` | `args()` returns `process.argv.slice(2)` in Node, `[]` in browser |
| `state_*` | In-memory `Map` keyed by string |
| `env` | `process.env[k]` in Node, throws in browser |
### Stubbed (throw at runtime)
Every function in this list compiles successfully but throws `Error("not supported in JS target — needs server-side delegation: <name>")` when called. This is a **runtime** error, not a compile error, so it doesn't block compilation of code that has dead-code paths through these functions.
- All `dharma_*` (membership in DHARMA network requires the daemon)
- All `engram_*` (needs the embedded SQLite + activation engine — could be reimplemented in JS later)
- All `llm_*` (CORS + API key handling — must go through a server-side proxy)
- `http_serve` (browsers don't host servers; Node could, but that's a separate runtime mode)
- `el_cgi_init` (CGI identity is a server-side concept)
- Crypto: `sha256_*`, `hmac_sha256_*`, `base64*` (deferred — can use `crypto.subtle` later)
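
The stub behavior above could be produced by a small factory. This is a sketch under assumptions: `makeStub` and `main` are hypothetical names, and only the error message text comes from this document.

```javascript
// Hypothetical helper: builds a stub that throws the documented error.
// "\u2014" is the dash character used in the documented message.
function makeStub(name) {
  return function stub() {
    throw new Error(
      "not supported in JS target \u2014 needs server-side delegation: " + name
    );
  };
}

const dharma_send = makeStub("dharma_send");

// The stub is referenced on one branch only; nothing throws unless it is called.
function main(useNetwork) {
  if (useNetwork) {
    return dharma_send("channel", "content");
  }
  return "offline";
}

console.log(main(false));
try {
  main(true);
} catch (e) {
  console.log(e.message);
}
```

Because the error is raised only on invocation, a program whose network path is never taken runs to completion, which is the dead-code tolerance described above.
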
### Browser-side specific behavior
When running in a browser:
- `println` / `print` map to `console.log` (no stdout in browsers)
- `http_get` / `http_post` use `fetch()` (CORS applies)
- `fs_*` throws (browsers have no fs)
- `args()` returns `[]`
- `env(k)` throws (or could read from a global config object — TBD)
When running in Node:
- `println` / `print` map to `console.log` and `process.stdout.write`, respectively
- `fs_*` use `node:fs/promises` (sync versions for the simple cases)
- `args()` returns `process.argv.slice(2)`
- `env(k)` returns `process.env[k] ?? null`
The runtime auto-detects via `typeof window === 'undefined'`.
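
A sketch of how that detection might gate two of the builtins. The branch behavior follows the lists above; the function bodies and the browser error text are assumptions, not the real runtime.

```javascript
// Environment switch as described above; bodies are illustrative assumptions.
const IS_NODE = typeof window === "undefined";

function args() {
  // Node: user arguments after the script path. Browser: always empty.
  return IS_NODE ? process.argv.slice(2) : [];
}

function env(key) {
  if (!IS_NODE) {
    throw new Error("env is not available in the browser: " + key);
  }
  // Unset variables map to null, matching El's single null value.
  return process.env[key] ?? null;
}

console.log(IS_NODE, Array.isArray(args()), env("EL_SURELY_UNSET_VARIABLE"));
```

Note that a Web Worker also has no `window`, so this particular check would classify a worker as Node; a stricter test may be needed if workers become a target.
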
---
## 4. Tradeoffs vs the C backend
| Concern | C backend | JS backend |
|---|---|---|
| **Static types** | El's `Int` becomes `int64_t`, real arithmetic | El's `Int` becomes `number` — loses precision past 2^53 |
| **Linking model** | Static link against `el_runtime.c` + libcurl + libpthread | ES module import of `el_runtime.js` |
| **Dynamic dispatch** | `dlsym` for `http_set_handler` / `llm_register_tool` (requires `-rdynamic`) | JS function value lookup via `globalThis[name]` — no compiler flag |
| **Tool registry** | dlsym walks symbol table; tool fns must be top-level C symbols | Tool fns live as exports of the generated module; trivially callable |
| **Memory model** | Refcounted lists/maps with `el_retain`/`el_release` to avoid leaks | JS GC handles all of it; `el_retain`/`el_release` are no-ops |
| **`+` overload** | Has to dispatch in codegen between `el_str_concat` and integer `+` because at C level both are `int64_t` | JS `+` is already overloaded: `"a" + "b"` → `"ab"`, `1 + 2` → `3`. Codegen still preserves the existing dispatch for safety, but the runtime fallback is correct |
| **Concurrency** | `pthread`-backed `http_serve` | Single-threaded event loop; `http_serve` not supported in this target |
| **HTTP client** | libcurl, blocking, returns body string | `fetch()` is async — see §5 |
| **CGI identity** | `el_cgi_init` runs at start of `main()` | Not supported; UI code is not a CGI principal |
| **DHARMA / LLM** | Native, blocking, libcurl-backed | Not supported — all such calls throw and the program is expected to delegate to a server-side El daemon via plain HTTP |
| **Compile speed** | El → C → cc → binary (cc is the slow step) | El → JS → done. Faster iteration |
| **Output size** | Static binary ~2MB | Source `.js` + ~10kb runtime |
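
To make the "Dynamic dispatch" and "Tool registry" rows concrete, here is a sketch of name-based handler lookup through `globalThis`. `registerTool`, `callTool`, and `echo_tool` are hypothetical names for illustration; in the current scaffold `llm_register_tool` itself is still a throwing stub.

```javascript
// Hypothetical sketch of name-based handler lookup on the JS target.
// registerTool / callTool are illustrative names, not runtime API.
const toolRegistry = new Map();

function registerTool(name, handlerFnName) {
  const fn = globalThis[handlerFnName];
  if (typeof fn !== "function") {
    throw new Error("no global function named " + handlerFnName);
  }
  toolRegistry.set(name, fn);
}

function callTool(name, inputJson) {
  return toolRegistry.get(name)(inputJson);
}

// Stand-in for a compiled El `fn echo_tool(input)`, exposed as a global.
// It takes a JSON string and returns a JSON string, like the C handlers.
globalThis.echo_tool = function echo_tool(inputJson) {
  const input = JSON.parse(inputJson);
  return JSON.stringify({ echoed: input.text });
};

registerTool("echo", "echo_tool");
console.log(callTool("echo", JSON.stringify({ text: "hi" })));
```

The lookup is an ordinary property read, which is why no equivalent of `-rdynamic` is needed on this target.
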
---
## 5. The async problem (the big deferred decision)
`fetch()` is async. The C backend's `http_get(url)` is synchronous and returns the body string directly. El source was written assuming sync. Three options:
1. **Pretend it's sync from El's POV; use synchronous XHR (browser) or `child_process.execSync('curl …')` (Node).** Bad: synchronous XHR is deprecated on the main thread and blocks it while the request is in flight; `execSync` is a hack.
2. **Make every `http_*` builtin in the JS runtime return a `Promise`, and rewrite codegen-js to insert `await` everywhere.** This requires turning every El function that transitively calls a network builtin into an `async fn` in JS. Doable, but invasive — the El AST does not currently mark async-ness.
3. **Compile El's call sites with implicit await; compile-time taint tracking marks every fn that transitively calls a network builtin as `async`. Generated JS uses `async function` and `await`.** This is the right answer long-term.
**Decision for this scaffold:** option 3, but only the runtime side is implemented. `http_get` in `el_runtime.js` returns a `Promise<string>`. `codegen-js.el` does NOT yet emit `async`/`await`. Calling `http_get` from compiled El will return a Promise that the El program will treat as a string (which produces `"[object Promise]"`). This is documented and accepted for the scaffold; the compile-time taint pass is a follow-up.
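
The `"[object Promise]"` outcome described above is easy to reproduce. In this sketch `http_get` is a stand-in that resolves immediately without touching the network, and `mainAsync` is a hypothetical example of what the planned async lowering would emit.

```javascript
// Stand-in for the runtime's http_get: returns a Promise, as a fetch()-based
// implementation would. No network access happens here.
function http_get(url) {
  return Promise.resolve("body of " + url);
}

// What non-async generated code effectively does today: treat the result
// as a string. String concatenation coerces the Promise object itself.
const body = http_get("http://localhost:7770");
const message = "response: " + body;
console.log(message); // response: [object Promise]

// What the planned async lowering would produce instead (hypothetical shape).
async function mainAsync() {
  const awaited = await http_get("http://localhost:7770");
  return "response: " + awaited;
}
mainAsync().then((text) => console.log(text));
```

The failure is silent: no exception is thrown, so a program only notices when the bogus string reaches the user or a parser.
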
For now, programs that don't touch HTTP work correctly. That covers `el-ui/runtime` (which only manipulates the DOM and a graph), most of cgi-studio's pure UI components, and all hello-world style programs.
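
The taint tracking named in option 3 could be sketched as a fixed-point pass over a call graph. Everything here is an assumption for illustration: the call-graph shape, the function names in it, and `findAsyncFunctions` itself; the real pass would run inside `codegen-js.el` over the El AST.

```javascript
// Hypothetical sketch of the async taint pass over a call graph.
// `callGraph` maps each function name to the names it calls directly.
const ASYNC_BUILTINS = new Set(["http_get", "http_post", "http_post_json"]);

function findAsyncFunctions(callGraph) {
  const tainted = new Set();
  let changed = true;
  // Iterate to a fixed point: a function is async if it calls an async
  // builtin or any function already marked async.
  while (changed) {
    changed = false;
    for (const [name, callees] of Object.entries(callGraph)) {
      if (tainted.has(name)) continue;
      if (callees.some((c) => ASYNC_BUILTINS.has(c) || tainted.has(c))) {
        tainted.add(name);
        changed = true;
      }
    }
  }
  return tainted;
}

// Illustrative program: only `render` stays synchronous.
const callGraph = {
  main: ["load_profile", "render"],
  load_profile: ["fetch_json"],
  fetch_json: ["http_get", "json_parse"],
  render: ["println"],
};

console.log([...findAsyncFunctions(callGraph)].sort());
// [ 'fetch_json', 'load_profile', 'main' ]
```

Codegen would then emit `async function` for every name in the returned set and `await` at each call to a tainted function or async builtin. Calls through function values are not covered by this sketch.
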
---
## 6. Number precision
JS `number` is IEEE 754 double — only 53 bits of integer precision. El `Int` is `int64_t` and the runtime sometimes uses the full 64 bits (e.g. `time_now_utc` returns nanoseconds-since-epoch, which exceeds 2^53 in practice).
**Decision for this scaffold:** accept the precision loss. Document it. UI code does not use 64-bit timestamps. If/when a use case demands it, `time_now_utc` can return a `BigInt` and we can introduce a `BigInt` sub-mode. That's a follow-up.
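
The precision limit and the `BigInt` escape hatch can be checked directly. The timestamp below is an illustrative value, not real runtime output.

```javascript
// 2^53 - 1 is the largest integer a JS number represents exactly.
const maxSafe = Number.MAX_SAFE_INTEGER; // 9007199254740991

// Past that point neighbouring integers collapse: 2^53 + 1 rounds to 2^53.
const collides = 2 ** 53 === 2 ** 53 + 1;

// A nanosecond-scale timestamp (illustrative value) is past the limit,
// so the literal is silently rounded to the nearest representable double.
const nanosAsNumber = 1777777777123456789;
const safe = Number.isSafeInteger(nanosAsNumber);

// BigInt keeps every digit, at the cost of being a distinct type from number.
const nanosAsBigInt = 1777777777123456789n;
const plusOne = nanosAsBigInt + 1n;

console.log(maxSafe, collides, safe, plusOne.toString());
```

A `BigInt` sub-mode would not be free: `BigInt` and `number` cannot be mixed in arithmetic without explicit conversion, so codegen would need to know which `Int` expressions carry which representation.
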
---
## 7. What's NOT supported in JS target initially
This is the canonical list. Programs that use any of these compile (no `#error`-style fail-fast like the C backend's capability check) but throw at runtime or behave as documented.
| Feature | Status | Notes |
|---|---|---|
| `cgi {}` block | Compiled to a no-op + warning comment | CGI identity is server-side. UI code is not a CGI. |
| `service {}` block | Compiled to a no-op + warning comment | Same. |
| All `dharma_*` | Stub throws | Programs needing DHARMA must call a server-side daemon over HTTP |
| All `engram_*` | Stub throws | Could be ported to in-browser (IndexedDB-backed) later |
| All `llm_*` | Stub throws | Browser cannot hold API keys; route through server |
| `llm_register_tool` | Stub throws | Same |
| `http_serve` | Stub throws | Browsers cannot serve. Node-mode could, deferred |
| `http_set_handler` | Stub throws | Same |
| `match` expressions | Compiled (basic) | LitInt/LitStr/LitBool/Wildcard/Binding all work via `if/else` chain. Tagged-union match deferred |
| `type` (struct) defs | Skipped at codegen | Treated as documentation; structs are plain JS objects. `t["field"]` works |
| `enum` defs | Skipped at codegen | Same — enum values are bare strings or ints |
| `?` postfix (nil-prop) | No-op | Same as C backend's current state |
| `try` postfix | Stripped to inner | Same as C backend |
| Capability enforcement | Not enforced | The C backend uses `#error` directives; the JS backend lets the runtime stubs throw. Future: emit `throw new Error('capability violation')` at compile time |
| VBD role check | Not enforced | Same |
| Float bit-cast | Not needed | JS number is already a double |
| Crypto primitives | Stub throws | Easy to add via `crypto.subtle` later |
| `state_*` | In-memory only | No persistence; resets on page reload |
| `args()` | Node-only | Browser returns `[]` |
| `fs_*` | Node-only | Browser throws |
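
For the `match` row in the table above, the `if/else` lowering might look like the following. This is a hand-written guess at the output shape, not actual codegen-js output; `describe` and its arms are hypothetical.

```javascript
// Hand-written illustration of lowering a `match` to an if/else chain.
// The shape is an assumption; actual codegen-js output may differ.
function describe(value) {
  const scrutinee = value; // evaluate the matched expression once
  if (scrutinee === 0) {
    // LitInt pattern
    return "zero";
  } else if (scrutinee === "one") {
    // LitStr pattern
    return "the string one";
  } else if (scrutinee === true) {
    // LitBool pattern
    return "yes";
  } else {
    // Binding pattern: always matches and names the value
    const other = scrutinee;
    return "other: " + String(other);
  }
}

console.log(describe(0), describe("one"), describe(true), describe(42));
```

Using `===` for every literal arm keeps `0`, `"0"`, and `false` distinct, which a loose `==` comparison would not.
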
---
## 8. CLI dispatch — `--target=js`
The compiler entry point `compiler.el` adds a `compile_js(source: String) -> String` alongside the existing `compile()`. The CLI behavior:
```
elc <source.el> <output> # default — emit C
elc --target=c <source.el> <out> # explicit — emit C
elc --target=js <source.el> <out> # emit JS
elc --target=js source.el # write JS to stdout (no out path)
```
The argv parser scans for a `--target=<lang>` token; remaining positional args are `<source>` and optional `<out>`. The dispatch logic stays in El: a `compile_dispatch(target, source) -> String` switch.
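
The argv scan can be sketched as follows. The real logic lives in `compiler.el` and is written in El; this JS rendering, including the name `parseCliArgs` and the error messages, is an assumption made for illustration.

```javascript
// Hypothetical JS rendering of the argv scan; the real logic is in compiler.el.
function parseCliArgs(argv) {
  let target = "c"; // default: emit C
  const positional = [];
  for (const arg of argv) {
    if (arg.startsWith("--target=")) {
      target = arg.slice("--target=".length);
    } else {
      positional.push(arg);
    }
  }
  // First positional is the source file; the output path is optional.
  const [source, out = null] = positional;
  if (source === undefined) {
    throw new Error("missing <source.el>");
  }
  if (target !== "c" && target !== "js") {
    throw new Error("unknown target: " + target);
  }
  return { target, source, out }; // out === null: no output path was given
}

console.log(parseCliArgs(["--target=js", "source.el"]));
console.log(parseCliArgs(["source.el", "out.c"]));
```

Because the flag is found by scanning rather than by position, `--target=js` can appear before or after the positional arguments.
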
---
## 9. The path to compiling el-ui/runtime through this backend
This is the real-world test. `el-ui/runtime/src/` is currently 5 hand-written `.js` files. The path to authoring them in El:
1. **Phase 1 — Hello-world** (this scaffold). Done.
2. **Phase 2 — language coverage.** Get codegen-js to ~95% parity with codegen.el for non-network features. Specifically: `match`, struct/enum field access, `?`-propagation, full `for`-over-list, complete unary/binary operators, lexical closures (the C backend doesn't have these but we'll need them for el-ui's component model).
3. **Phase 3 — DOM bridge.** Add `dom_*` builtins to el_runtime.js: `dom_create_element`, `dom_set_text`, `dom_append_child`, `dom_query`, `dom_listen`, etc. These are Node-as-El builtins for the browser; the C backend will add a stub set that errors. Source-shareable El UI code becomes possible.
4. **Phase 4 — Component class lowering.** El doesn't have classes; el-ui's `Component` is a JS class. Decide: extend El with a `component` keyword that compiles to JS class + C struct? Or have el-ui authors define components as `fn render_<name>(state) -> String` and provide a small bootstrap. The latter is the lower-impact path.
5. **Phase 5 — Async taint pass.** Implement compile-time async tracking so `http_get` and friends produce `await fetch()` correctly. Required before authoring code that fetches data.
6. **Phase 6 — Port `el-ui/runtime/`.** Translate the 5 JS files to El, compile to JS, swap in. Run el-ui's existing tests. Iterate.
7. **Phase 7 — Port cgi-studio UI.** Larger surface area; same pattern.
8. **Phase 8 — Marketplace plugins.** Open the door for third-party UI El.
The blocking item between phase 1 and phase 2 is incremental — every El construct used by el-ui's source needs codegen-js coverage. Phase 5 (async) is the architectural decision that needs explicit user buy-in, because it changes the language's effective semantics on the JS target.
---
## 10. Test
```bash
echo 'fn main() -> Void { println("hello from el-js") }' > /tmp/hello.el
elc --target=js /tmp/hello.el > /tmp/hello.js
node /tmp/hello.js
# → hello from el-js
```
This should pass after the bootstrap rebuild. See §11.
---
## 11. Bootstrap status
Adding `--target=js` to `compile()` requires regenerating the shipped `elc` binary at `dist/platform/elc`. The rebuild path is:
1. Existing `elc` binary compiles updated `elc-combined.el` (which now includes `codegen-js.el` and the `--target=js` dispatch) → `elc.c`.
2. `cc` compiles `elc.c` → new `elc` binary.
3. New `elc` binary supports `--target=js`.
All four scaffold files are checked in. The bootstrap rebuild happens as a follow-up step, gated on review of this design doc.