Compare commits
4 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 9d0e1f64d4 | |
| | e180baf776 | |
| | 3d71db4958 | |
| | 2a211992d4 | |
+979
@@ -0,0 +1,979 @@
# El Language Bootstrap Guide

This document is the authoritative guide for reconstructing the El compiler toolchain from scratch. If the bootstrap binary at `dist/platform/elc` is ever lost, this document is the path back.

---

## 1. The Bootstrap Chain (Current State)

### The Trust Chain

El is a self-hosting language. The compiler is written in El. This creates a circular dependency: you need an El compiler to compile the El compiler. The chain is resolved by a seed binary:

```
dist/platform/elc (Mach-O arm64 native binary)
↓
compiles elc-cli.el
↓
new self-hosted elc binary
↓
compiles itself again (identity check)
↓
stable self-hosted compiler
```

The binary at `dist/platform/elc` is a **Mach-O 64-bit arm64 executable**. The `elc.preselfhost` and `elc.legacy` files in the same directory are older snapshots kept as fallback checkpoints.

The key property: every binary in `dist/platform/` was produced by compiling the El source in `el-compiler/src/` using a previous version of that same binary. The chain is auditable: the source is the ground truth, not the binary.

### The Self-Hosting Pipeline

```
elc-cli.el
imports → el-compiler/src/compiler.el
imports → el-compiler/src/lexer.el
imports → el-compiler/src/parser.el
imports → el-compiler/src/codegen.el
imports → el-compiler/src/codegen-js.el
```

Import resolution is textual. `compiler.el` recursively inlines all imported `.el` files before lex/parse. The result is one large unified source string that the compiler then processes in a single pass.

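
The textual import resolution described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the logic in `compiler.el`: it assumes the `import "file.el"` form sits on its own line, that import paths can be used directly as lookup keys, and that a file already inlined is skipped on later imports. The `FILES` dictionary and the `inline_imports` helper are hypothetical names used for illustration.

```python
import re

# Assumed import form: a line of the shape  import "file.el"
IMPORT_RE = re.compile(r'^\s*import\s+"([^"]+)"\s*$')


def inline_imports(path, read_file, seen=None):
    """Return the text of `path` with each import line replaced by the imported text."""
    seen = set() if seen is None else seen
    if path in seen:
        # Assumption: a file is inlined at most once, so repeats contribute nothing.
        return ""
    seen.add(path)
    out = []
    for line in read_file(path).splitlines():
        match = IMPORT_RE.match(line)
        if match:
            out.append(inline_imports(match.group(1), read_file, seen))
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical in-memory file set standing in for the real source tree.
FILES = {
    "main.el": 'import "a.el"\nimport "b.el"\n// contents of main.el',
    "a.el": 'import "b.el"\n// contents of a.el',
    "b.el": "// contents of b.el",
}

unified = inline_imports("main.el", FILES.__getitem__)
print(unified)
```

The output is a single string in dependency order, which is the shape of input the single-pass lexer and parser would then receive.
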
`elc-combined.el` in the repo root is a pre-merged single-file edition used during early bootstrap iterations.

### What the Bootstrap Binary Actually Is

The `dist/platform/elc` binary is a compiled El program that was produced by running an earlier version of itself on `elc-cli.el`. It is not a Rust binary. The `elc.legacy` and `elc.preselfhost` checkpoints suggest the chain has been continuously self-hosting and re-stamped. The original genesis compiler (referenced in the language spec as a "Rust genesis compiler") was used to produce the first self-hosted binary; that Rust binary is not present in this repo.

To rebuild the current binary from source using the current binary:

```bash
cd /path/to/el
./dist/platform/elc elc-cli.el elc-new.c
cc -std=c11 -I el-compiler/runtime -lcurl -lpthread \
    -o dist/platform/elc-new \
    elc-new.c el-compiler/runtime/el_runtime.c
```

Verify self-hosting by using `elc-new` to recompile itself and diffing the outputs.

---

## 2. The Language

### 2.1 Lexical Structure

El source is UTF-8. File extension `.el`. Comments are single-line only: `//` to end of line.

**Token representation:** every token is a map `{ "kind": String, "value": String }`.

**Keywords** (from `keyword_kind()` in `lexer.el`):

| Keyword | Token Kind | Notes |
|---------|-----------|-------|
| `let` | `Let` | variable binding |
| `fn` | `Fn` | function definition |
| `type` | `Type` | struct definition |
| `enum` | `Enum` | enum definition |
| `match` | `Match` | pattern match |
| `return` | `Return` | function return |
| `if` | `If` | conditional |
| `else` | `Else` | |
| `for` | `For` | iteration |
| `in` | `In` | used in `for x in list` |
| `while` | `While` | loop |
| `import` | `Import` | module import |
| `from` | `From` | `from mod import { Name }` |
| `as` | `As` | (reserved, no parse form) |
| `with` | `With` | (reserved) |
| `sealed` | `Sealed` | (reserved) |
| `activate` | `Activate` | (reserved) |
| `where` | `Where` | (reserved) |
| `test` | `Test` | (reserved) |
| `seed` | `Seed` | (reserved) |
| `assert` | `Assert` | (reserved) |
| `protocol` | `Protocol` | (reserved) |
| `impl` | `Impl` | (reserved) |
| `retry` | `Retry` | reserved / soft keyword in expr position |
| `times` | `Times` | reserved / soft keyword |
| `fallback` | `Fallback` | reserved / soft keyword |
| `reason` | `Reason` | reserved / soft keyword |
| `parallel` | `Parallel` | reserved / soft keyword |
| `trace` | `Trace` | reserved / soft keyword |
| `requires` | `Requires` | reserved / soft keyword |
| `deploy` | `Deploy` | reserved / soft keyword |
| `to` | `To` | reserved / soft keyword |
| `via` | `Via` | reserved / soft keyword |
| `target` | `Target` | **RESERVED: cannot use as identifier** |
| `true` | `Bool` | literal value `true` |
| `false` | `Bool` | literal value `false` |
| `cgi` | `Cgi` | CGI identity block |
| `service` | `Service` | service declaration block |
| `manager` | `Manager` | VBD role decorator / soft keyword |
| `engine` | `Engine` | VBD role decorator / soft keyword |
| `accessor` | `Accessor` | VBD role decorator / soft keyword |
| `vessel` | `Vessel` | soft keyword |
| `extern` | `Extern` | `extern fn` forward declaration |

**Soft keywords** (`to`, `via`, `deploy`, `reason`, `times`, `fallback`, `retry`, `parallel`, `trace`, `requires`, `where`, `as`, `with`, `manager`, `engine`, `accessor`, `vessel`): these have dedicated token kinds, but the parser re-interprets them as `Ident` nodes when they appear in expression position (e.g., as parameter names or local variable names). `target` also has its own token kind but is not soft: it is reserved and cannot be used as an identifier (see 2.4).

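
As an illustration of the token shape and the soft-keyword rule, the Python sketch below models a keyword lookup and the expression-position reinterpretation. It is a hedged model, not the code in `lexer.el` or the parser: `KEYWORD_KINDS` holds only a handful of rows from the table, `SOFT_KINDS` is a subset of the soft-keyword list, and `make_token` and `as_ident_expr` are hypothetical helper names. Following the table note and section 2.4, `target` is treated here as hard-reserved.

```python
# A few rows of the keyword table; the full mapping lives in keyword_kind() in lexer.el.
KEYWORD_KINDS = {
    "let": "Let",
    "fn": "Fn",
    "to": "To",
    "via": "Via",
    "reason": "Reason",
    "target": "Target",
    "true": "Bool",
    "false": "Bool",
}

# Token kinds the parser may reinterpret as identifiers (subset, for illustration).
SOFT_KINDS = {"To", "Via", "Reason"}


def make_token(word):
    """Build a token map of the form { "kind": String, "value": String }."""
    return {"kind": KEYWORD_KINDS.get(word, "Ident"), "value": word}


def as_ident_expr(tok):
    """In expression position, an Ident or a soft keyword becomes an Ident node."""
    if tok["kind"] == "Ident" or tok["kind"] in SOFT_KINDS:
        return {"expr": "Ident", "name": tok["value"]}
    return None  # hard keyword: not usable as a name


print(make_token("to"))
print(as_ident_expr(make_token("to")))
print(as_ident_expr(make_token("target")))
```

Under these assumptions a parameter named `to` parses as an ordinary identifier, while `target` is rejected, which is why `compiler.el` uses `tgt`.
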
**All token kinds:**

| Kind | Pattern |
|------|---------|
| `Int` | `[0-9]+` |
| `Float` | `[0-9]+ '.' [0-9]+` |
| `Str` | `"…"` with `\"`, `\n`, `\t`, `\r`, `\\` escapes |
| `Bool` | `true` or `false` |
| `Ident` | `[a-zA-Z_][a-zA-Z0-9_]*` (not a keyword) |
| keyword tokens | one per keyword above |
| `Eq` | `=` |
| `EqEq` | `==` |
| `NotEq` | `!=` |
| `Not` | `!` |
| `Lt` / `LtEq` / `Gt` / `GtEq` | `<` `<=` `>` `>=` |
| `And` | `&&` (single `&` is consumed and discarded) |
| `Or` | `\|\|` |
| `Pipe` | `\|` |
| `PipeOp` | `\|>` |
| `Plus` / `Minus` / `Star` / `Slash` | `+` `-` `*` `/` |
| `Percent` | `%` |
| `Arrow` | `->` |
| `FatArrow` | `=>` |
| `Colon` / `ColonColon` | `:` `::` |
| `LParen` / `RParen` | `(` `)` |
| `LBrace` / `RBrace` | `{` `}` |
| `LBracket` / `RBracket` | `[` `]` |
| `Comma` / `Dot` / `Semicolon` | `,` `.` `;` |
| `At` | `@` |
| `QuestionMark` | `?` |
| `Eof` | end-of-input sentinel |

**String comment stripping:** the lexer contains a special heuristic for string literals that embed JavaScript or CSS (`looks_like_code`). If a string contains `<script`, `<style`, or `function` + `;`, the lexer strips `//` and `/* */` comments from the string value before producing the `Str` token. This is a compile-time content sanitization pass.

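
The string comment-stripping heuristic can be approximated as follows. This Python sketch only mirrors the trigger conditions stated above; the regular expressions and the `strip_comments` and `sanitize_string_literal` helpers are assumptions made for illustration, and the real `looks_like_code` path in `lexer.el` may differ. In particular, this naive version would also clip a `//` that appears inside a URL.

```python
import re


def looks_like_code(s):
    """Trigger conditions as described: <script, <style, or function together with ;."""
    return "<script" in s or "<style" in s or ("function" in s and ";" in s)


def strip_comments(s):
    """Naive removal: block comments first, then // up to the end of the line."""
    s = re.sub(r"/\*.*?\*/", "", s, flags=re.S)
    return re.sub(r"//[^\n]*", "", s)


def sanitize_string_literal(value):
    """Return the value a Str token would carry under this model."""
    return strip_comments(value) if looks_like_code(value) else value


plain = sanitize_string_literal("see // this")
embedded = sanitize_string_literal("<script>var x = 1; // note\n/* c */ run();</script>")
print(repr(plain))
print(repr(embedded))
```

Strings that do not trip the heuristic pass through unchanged, so a `//` in ordinary text survives.
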
### 2.2 AST Node Types

Every AST node is a `Map<String, Any>`. The `"expr"` or `"stmt"` key names the node type.

**Expression nodes:**

| `expr` value | Fields | Meaning |
|-------------|--------|---------|
| `Int` | `value: String` | integer literal |
| `Float` | `value: String` | float literal |
| `Str` | `value: String` | string literal |
| `Bool` | `value: String` | `"true"` or `"false"` |
| `Nil` | (none) | null / missing |
| `Ident` | `name: String` | identifier reference |
| `BinOp` | `op: String`, `left`, `right` | binary operation |
| `Not` | `inner` | unary `!` |
| `Neg` | `inner` | unary `-` |
| `Call` | `func`, `args: [expr]` | function call |
| `Field` | `object`, `field: String` | `obj.field` |
| `Index` | `object`, `index` | `obj[idx]` |
| `Array` | `elems: [expr]` | `[e1, e2, …]` |
| `Map` | `pairs: [{ key: String, value: expr }]` | `{ "k": v, … }` |
| `If` | `cond`, `then: [stmt]`, `else: [stmt]`, `has_else: Bool` | conditional expression |
| `For` | `item: String`, `list`, `body: [stmt]` | for-in expression |
| `Match` | `subject`, `arms: [{ pattern, body }]` | pattern match |
| `DurationLit` | `count: String`, `unit: String` | `30.seconds`, `1.hour` |
| `Try` | `inner` | postfix `?` (no-op passthrough today) |

**Binary operators** (`op` field values): `Plus`, `Minus`, `Star`, `Slash`, `EqEq`, `NotEq`, `Lt`, `Gt`, `LtEq`, `GtEq`, `And`, `Or`.

**Operator precedence** (higher = tighter binding):

| Level | Operators |
|-------|-----------|
| 6 | `Star`, `Slash` |
| 5 | `Plus`, `Minus` |
| 4 | `Lt`, `Gt`, `LtEq`, `GtEq` |
| 3 | `EqEq`, `NotEq` |
| 2 | `And` |
| 1 | `Or` |

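
To show how the precedence levels shape the tree, here is a short precedence-climbing sketch in Python that builds `BinOp` maps in the node format above. It is an illustration rather than the algorithm in `parser.el`: operands are limited to `Int` tokens, left associativity is assumed (the guide does not state it), and `tok` and `parse_expr` are hypothetical helpers.

```python
PRECEDENCE = {
    "Star": 6, "Slash": 6,
    "Plus": 5, "Minus": 5,
    "Lt": 4, "Gt": 4, "LtEq": 4, "GtEq": 4,
    "EqEq": 3, "NotEq": 3,
    "And": 2,
    "Or": 1,
}


def tok(kind, value=""):
    return {"kind": kind, "value": value}


def parse_expr(tokens, pos=0, min_level=1):
    """Parse tokens[pos:] and return (node, next_pos)."""
    first = tokens[pos]
    left = {"expr": first["kind"], "value": first["value"]}  # Int operands only
    pos += 1
    while pos < len(tokens):
        op = tokens[pos]["kind"]
        level = PRECEDENCE.get(op)
        if level is None or level < min_level:
            break
        # Requiring level + 1 on the right makes equal-level operators group left.
        right, pos = parse_expr(tokens, pos + 1, level + 1)
        left = {"expr": "BinOp", "op": op, "left": left, "right": right}
    return left, pos


# 1 + 2 * 3
tree, _ = parse_expr([tok("Int", "1"), tok("Plus"), tok("Int", "2"), tok("Star"), tok("Int", "3")])
# 1 - 2 - 3
chain, _ = parse_expr([tok("Int", "1"), tok("Minus"), tok("Int", "2"), tok("Minus"), tok("Int", "3")])
print(tree["op"], tree["right"]["op"])
print(chain["left"]["op"])
```

In the first tree `Star` ends up below `Plus`, reflecting level 6 binding tighter than level 5.
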
**Pattern nodes** (used inside `Match` arms):

| `pattern` value | Fields | Meaning |
|----------------|--------|---------|
| `Wildcard` | (none) | `_`, always matches |
| `Binding` | `name: String` | binds subject to name |
| `LitInt` | `value: String` | integer literal pattern |
| `LitStr` | `value: String` | string literal pattern |
| `LitBool` | `value: String` | boolean literal pattern |

**Statement nodes:**

| `stmt` value | Fields | Meaning |
|-------------|--------|---------|
| `Let` | `name: String`, `value: expr`, `type: String` | variable binding |
| `Assign` | `name: String`, `value: expr` | bare reassignment `name = expr` |
| `Return` | `value: expr` | return statement |
| `While` | `cond: expr`, `body: [stmt]` | while loop |
| `For` | `item: String`, `list: expr`, `body: [stmt]` | for-in loop |
| `FnDef` | `name: String`, `params: [param]`, `body: [stmt]`, `ret_type: String`, `decorator?: String` | function definition |
| `ExternFn` | `name: String`, `params: [param]`, `ret_type: String` | forward declaration |
| `TypeDef` | `name: String`, `fields: [{ name: String }]` | struct type definition |
| `EnumDef` | `name: String`, `variants: [{ name: String }]` | enum definition |
| `Import` | `path: String` | `import "file.el"` or `from mod import { … }` |
| `CgiBlock` | `name`, `dharma_id`, `principal`, `network`, `engram`, `has_*: Bool` | CGI identity declaration |
| `ServiceBlock` | `name`, `sponsor`, `domain` | service declaration |
| `Expr` | `value: expr` | bare expression statement |

**Param nodes:** `{ "name": String, "type": String }` where `type` is the leading identifier of the type annotation (e.g., `"Int"`, `"String"`, `"Map"`) or `""` if unannotated.

### 2.3 The Type System

Type annotations are parsed and stored but not type-checked at compile time. They serve as documentation and as hints to the codegen for arithmetic dispatch.

**Built-in types:**

| Type | C representation | Notes |
|------|-----------------|-------|
| `String` | `const char*` cast to `el_val_t` | via `EL_STR()` macro |
| `Int` | `int64_t` | direct |
| `Bool` | `int64_t` | `0` = false, nonzero = true |
| `Float` | `int64_t` | bit-cast double via `el_from_float()` |
| `Void` | `void` | functions returning nothing |
| `Any` | `void*` cast to `el_val_t` | generic containers |
| `[T]` | `el_val_t` | pointer to ElList struct |
| `Map<K,V>` | `el_val_t` | pointer to ElMap struct |

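
The `Float` row is the least obvious one: the value is not converted to an integer, its 64 bits are reinterpreted. The Python sketch below models that bit-cast with the standard `struct` module. It is a stand-in for what `el_from_float()` is described as doing, not the runtime's C code, and the inverse helper `el_to_float` is a hypothetical name.

```python
import struct


def el_from_float(x):
    """Reinterpret the 64 bits of a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def el_to_float(v):
    """Inverse reinterpretation (hypothetical helper name)."""
    return struct.unpack("<d", struct.pack("<q", v))[0]


bits = el_from_float(1.0)
print(hex(bits))
print(el_to_float(el_from_float(1.5)))
```

Because the stored integer is a bit pattern, ordinary integer `+` on two such values would not give a float sum, which is one reason the codegen needs type hints for arithmetic dispatch.
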
**Temporal types** (first-class in codegen):

| Type | Representation | Notes |
|------|---------------|-------|
| `Instant` | nanoseconds since Unix epoch as `int64_t` | `now()` returns this |
| `Duration` | signed nanoseconds as `int64_t` | `30.seconds` = `30 * 1000000000` |
| `Calendar` | pointer to heap-allocated struct | `earth_calendar(zone)` |
| `CalendarTime` | pointer to heap-allocated struct | `now_in(cal)` |
| `LocalDate` | pointer to heap-allocated struct | `local_date(y, m, d)` |
| `LocalTime` | nanoseconds since midnight, direct `int64_t` | `local_time(h, m, s, ns)` |
| `Zone` | pointer to heap-allocated struct | `zone("America/New_York")` |
| `Rhythm` | pointer to heap-allocated struct | recurrence pattern |

The codegen tracks type-annotated variable names in per-function process state (`__int_names`, `__instant_names`, `__duration_names`, etc.) to dispatch arithmetic and comparisons through the correct runtime wrappers. Type-mismatched operations (e.g., `Instant + Instant`) are emitted as `#error` directives.

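
A rough model of that dispatch, in Python, is a lookup keyed on the operator and the two operand types. The wrapper names come from the runtime API tables in section 3; everything else is an assumption for illustration, including the `DISPATCH` table layout, the `emit_temporal_binop` helper, the choice to reject every combination not listed, and the wording of the `#error` message.

```python
# (operator, left type, right type) -> runtime wrapper
DISPATCH = {
    ("Plus", "Instant", "Duration"): "el_instant_add_dur",
    ("Minus", "Instant", "Duration"): "el_instant_sub_dur",
    ("Minus", "Instant", "Instant"): "el_instant_diff",
    ("Plus", "Duration", "Duration"): "el_duration_add",
    ("Minus", "Duration", "Duration"): "el_duration_sub",
}


def emit_temporal_binop(op, left_type, right_type, left_c, right_c):
    """Return a C expression for a supported combination, else an #error line."""
    wrapper = DISPATCH.get((op, left_type, right_type))
    if wrapper is None:
        # Assumed message text; the guide only says an #error directive is emitted.
        return f'#error "unsupported operation: {left_type} {op} {right_type}"'
    return f"{wrapper}({left_c}, {right_c})"


ok = emit_temporal_binop("Plus", "Instant", "Duration", "start", "timeout")
bad = emit_temporal_binop("Plus", "Instant", "Instant", "a", "b")
print(ok)
print(bad)
```

Emitting `#error` moves the failure to the C compile step, which fits a pipeline where El itself performs no type checking.
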
**Duration postfix literals:** `30.seconds`, `1.hour`, `500.millis`, `30.nanos` are parsed as `DurationLit` AST nodes and compiled to `el_duration_from_nanos(count * multiplier)`. The multipliers:

| Unit | Nanoseconds |
|------|------------|
| `nano` / `nanos` | 1 |
| `milli` / `millis` / `millisecond` / `milliseconds` | 1,000,000 |
| `second` / `seconds` | 1,000,000,000 |
| `minute` / `minutes` | 60,000,000,000 |
| `hour` / `hours` | 3,600,000,000,000 |
| `day` / `days` | 86,400,000,000,000 |

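
The multiplier table translates directly into a lookup. The Python sketch below computes the nanosecond count for a `DurationLit` node; `UNIT_NANOS` and `duration_lit_nanos` are hypothetical names, and the node shape follows the AST table in 2.2.

```python
UNIT_NANOS = {}
for names, multiplier in [
    (("nano", "nanos"), 1),
    (("milli", "millis", "millisecond", "milliseconds"), 1_000_000),
    (("second", "seconds"), 1_000_000_000),
    (("minute", "minutes"), 60_000_000_000),
    (("hour", "hours"), 3_600_000_000_000),
    (("day", "days"), 86_400_000_000_000),
]:
    for name in names:
        UNIT_NANOS[name] = multiplier


def duration_lit_nanos(node):
    """node is a DurationLit map with string fields `count` and `unit`."""
    return int(node["count"]) * UNIT_NANOS[node["unit"]]


thirty_seconds = duration_lit_nanos({"expr": "DurationLit", "count": "30", "unit": "seconds"})
print(thirty_seconds)
```

For `30.seconds` this gives 30,000,000,000, the value passed to `el_duration_from_nanos`.
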
### 2.4 Key Language Semantics

**Implicit return.** The final expression in a function body becomes the return value if it is not a control-flow construct (`If`, `For`). The codegen's `transform_implicit_return` rewrites the last `Expr` statement into a `Return` statement before emitting.

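
A minimal Python model of the implicit-return rewrite, using the statement node shapes from 2.2, might look like this. It reuses the name `transform_implicit_return` for readability only; the body is an assumption based on the description above, not a port of the codegen.

```python
def transform_implicit_return(body):
    """Return a copy of `body` whose trailing Expr statement is wrapped as Return."""
    if not body:
        return body
    last = body[-1]
    if last.get("stmt") != "Expr":
        return body
    if last["value"].get("expr") in ("If", "For"):
        return body  # control-flow constructs are left alone
    return body[:-1] + [{"stmt": "Return", "value": last["value"]}]


body = [
    {"stmt": "Let", "name": "x", "value": {"expr": "Int", "value": "1"}, "type": ""},
    {"stmt": "Expr", "value": {"expr": "Ident", "name": "x"}},
]
rewritten = transform_implicit_return(body)
print(rewritten[-1])
```

The input list is not modified, so the same body can still be inspected in its original form.
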
**Let-rebinding, not mutation.** El uses `let` for both initial binding and rebinding:

```el
let count = 0
let count = count + 1 // NOT mutation: creates a new binding in the same scope
```

The codegen tracks declared names per C scope. If `count` is already in `declared`, it emits a plain assignment, `count = count + 1;`. If the name is new, it emits a declaration, `el_val_t count = 0;`. This means **El does not have mutable variables in the traditional sense**: every `let` is a potential redeclaration, and shadowing and in-place update use identical syntax.

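
The declared-name rule is small enough to model in a few lines of Python. The emitted strings match the two forms quoted above; the `emit_let` helper and the use of a plain `set` for the per-scope `declared` table are assumptions made for illustration.

```python
def emit_let(name, c_expr, declared):
    """Emit C for `let name = ...`; `declared` holds names already declared in this C scope."""
    if name in declared:
        return f"{name} = {c_expr};"
    declared.add(name)
    return f"el_val_t {name} = {c_expr};"


declared = set()
first = emit_let("count", "0", declared)
second = emit_let("count", "count + 1", declared)
print(first)
print(second)
```

The first `let` produces a declaration and the second a plain assignment, so the generated C never redeclares a name within one scope.
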
**Bare reassignment.** The parser also handles `name = expr` (without `let`) when an `Ident` is immediately followed by `Eq`. This emits a plain C assignment.

**`target` is reserved.** The word `target` is lexed as the `Target` token kind, so it cannot be used as a variable or parameter name. Use `tgt` or another name instead. This is a live gotcha in `compiler.el` itself, which uses `tgt` for exactly this reason.

**`__no_block_expr` guard.** The parser uses process state key `__no_block_expr` to suppress Map-literal parsing when parsing the condition of `if`, `while`, `for`, and `match`. This prevents a stray `{` (the start of the then-block) from being parsed as a Map literal.

**Arena memory model.** The runtime includes an arena allocator that is activated in server/long-running contexts. In CLI mode (`elc`, `elb`) the arena is inactive. Memory is managed via ARC (reference counting): `el_retain()` and `el_release()` on Lists and Maps. Strings and ints are not refcounted; the retain/release functions are safe no-ops on non-tagged values.

---

## 3. The Runtime API

All runtime functions are declared in `el-compiler/runtime/el_runtime.h`. Every compiled El program links against `el-compiler/runtime/el_runtime.c`.

All values are `el_val_t` (`int64_t`). Strings are pointers cast through `int64_t` using `EL_STR(s)` / `EL_CSTR(v)` macros.

Canonical compile command:

```bash
cc -std=c11 -I el-compiler/runtime -lcurl -lpthread \
    -o <out> <prog>.c el-compiler/runtime/el_runtime.c
```

### I/O

| Function | Signature | Description |
|----------|-----------|-------------|
| `println` | `(s) -> Void` | print string + newline to stdout |
| `print` | `(s) -> Void` | print string without newline |
| `readline` | `() -> String` | read one line from stdin |

### String Operations

| Function | Signature | Description |
|----------|-----------|-------------|
| `el_str_concat` | `(a, b) -> String` | concatenate two strings |
| `str_concat` | `(a, b) -> String` | alias for `el_str_concat` |
| `str_eq` | `(a, b) -> Bool` | string equality comparison |
| `str_starts_with` | `(s, prefix) -> Bool` | prefix test |
| `str_ends_with` | `(s, suffix) -> Bool` | suffix test |
| `str_contains` | `(s, sub) -> Bool` | substring test |
| `str_len` | `(s) -> Int` | byte length |
| `str_slice` | `(s, start, end) -> String` | substring (byte offsets) |
| `str_replace` | `(s, from, to) -> String` | replace all occurrences |
| `str_to_upper` / `str_upper` | `(s) -> String` | uppercase |
| `str_to_lower` / `str_lower` | `(s) -> String` | lowercase |
| `str_trim` | `(s) -> String` | strip leading/trailing whitespace |
| `str_lstrip` / `str_rstrip` | `(s) -> String` | one-sided strip |
| `str_index_of` | `(s, sub) -> Int` | position of substring; `-1` if absent |
| `str_last_index_of` | `(s, sub) -> Int` | last position |
| `str_index_of_all` | `(s, sub) -> [Int]` | all byte offsets (non-overlapping) |
| `str_find_chars` | `(s, any_of) -> Int` | first index of any char in set |
| `str_split` | `(s, sep) -> [String]` | split on separator |
| `str_split_lines` | `(s) -> [String]` | split on newlines |
| `str_split_chars` | `(s) -> [String]` | split into individual characters |
| `str_split_n` | `(s, sep, n) -> [String]` | split at most `n` times |
| `str_join` | `(list, sep) -> String` | join list with separator |
| `str_char_at` | `(s, i) -> String` | character at byte index |
| `str_char_code` | `(s, i) -> Int` | Unicode code point at index |
| `str_pad_left` | `(s, width, pad) -> String` | left-pad to width |
| `str_pad_right` | `(s, width, pad) -> String` | right-pad to width |
| `str_format` | `(fmt, data) -> String` | `{key}` interpolation |
| `str_repeat` | `(s, n) -> String` | repeat string n times |
| `str_reverse` | `(s) -> String` | reverse by codepoint |
| `str_strip_prefix` | `(s, prefix) -> String` | remove prefix if present |
| `str_strip_suffix` | `(s, suffix) -> String` | remove suffix if present |
| `str_strip_chars` | `(s, chars) -> String` | strip characters from both ends |
| `str_count` | `(s, sub) -> Int` | count non-overlapping occurrences |
| `str_count_chars` | `(s) -> Int` | codepoint count |
| `str_count_bytes` | `(s) -> Int` | alias for `str_len` |
| `str_count_lines` | `(s) -> Int` | line count |
| `str_count_words` | `(s) -> Int` | word count |
| `str_count_letters` | `(s) -> Int` | ASCII letter count |
| `str_count_digits` | `(s) -> Int` | ASCII digit count |
| `is_letter` / `is_digit` / `is_alphanumeric` | `(s) -> Bool` | ASCII char classification |
| `is_whitespace` / `is_punctuation` | `(s) -> Bool` | |
| `is_uppercase` / `is_lowercase` | `(s) -> Bool` | |
| `int_to_str` | `(n) -> String` | format integer |
| `str_to_int` | `(s) -> Int` | parse integer |
| `str_to_float` | `(s) -> Float` | parse float |
| `parse_int` | `(s, default) -> Int` | parse with fallback |
| `bool_to_str` | `(b) -> String` | format bool |

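
Several rows in the string table distinguish bytes from codepoints (`str_len` against `str_count_chars`) and specify non-overlapping counting (`str_count`). The Python sketch below models those three descriptions for UTF-8 text. The function bodies are assumptions written to match the table wording, not the C implementations in `el_runtime.c`.

```python
def str_len(s):
    """Byte length of the UTF-8 encoding, as the table describes."""
    return len(s.encode("utf-8"))


def str_count_chars(s):
    """Codepoint count; a Python str is already a sequence of code points."""
    return len(s)


def str_count(s, sub):
    """Count non-overlapping occurrences, scanning left to right."""
    return s.count(sub)


word = "h\u00e9llo"  # the second character takes two bytes in UTF-8
print(str_len(word), str_count_chars(word))
print(str_count("aaaa", "aa"))
```

The gap between the two lengths matters for `str_slice` and `str_char_at`, which the table documents as taking byte offsets.
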
### Integer/Float Math

| Function | Description |
|----------|-------------|
| `el_abs(n)` | absolute value |
| `el_max(a, b)` | maximum |
| `el_min(a, b)` | minimum |
| `float_to_str(f)` | format float as string |
| `int_to_float(n)` | widen Int to Float |
| `float_to_int(f)` | truncate Float to Int |
| `format_float(f, decimals)` | format with N decimal places |
| `decimal_round(f, decimals)` | round to N decimals |
| `math_sqrt(f)` | square root |
| `math_log(f)` / `math_ln(f)` | logarithms |
| `math_sin(f)` / `math_cos(f)` / `math_pi()` | trigonometry |

### List Operations

| Function | Description |
|----------|-------------|
| `el_list_empty()` | create empty list |
| `el_list_new(count, …)` | create list from N values (varargs) |
| `el_list_len(list)` | length |
| `el_list_get(list, i)` | element at index; `0` on out-of-bounds |
| `el_list_append(list, e)` | append; returns updated list |
| `el_list_clone(list)` | shallow copy |
| `list_push(list, e)` | alias for `el_list_append` |
| `list_push_front(list, e)` | prepend |
| `list_join(list, sep)` | join to string |
| `list_range(start, end)` | integer range `[start, end)` |
| `native_list_empty()` | alias for `el_list_empty` (used in compiler source) |
| `native_list_append(l, v)` | alias for `el_list_append` |
| `native_list_get(l, idx)` | alias for `el_list_get` |
| `native_list_len(l)` | alias for `el_list_len` |
| `native_list_clone(l)` | alias for `el_list_clone` |
| `append(l, e)` | method-call alias: `list.append(e)` |
| `len(l)` | method-call alias: `list.len()` |
| `get(l, i)` | method-call alias: `list.get(i)` |

### Map Operations

| Function | Description |
|----------|-------------|
| `el_map_new(count, …)` | create map from key/value pairs (varargs) |
| `el_map_get(map, key)` | get value by key |
| `el_map_set(map, key, value)` | set key; returns map |
| `el_get_field(map, key)` | alias; emitted for `.field` access |
| `map_get(map, key)` | method-call alias |
| `map_set(map, key, value)` | method-call alias |

### ARC (Reference Counting)

| Function | Description |
|----------|-------------|
| `el_retain(v)` | increment refcount; no-op for non-heap values |
| `el_release(v)` | decrement refcount; free when zero |

### In-Process State

| Function | Description |
|----------|-------------|
| `state_set(key, value)` | store in process-global key/value table |
| `state_get(key)` | retrieve; `""` if absent |
| `state_del(key)` | delete key |
| `state_keys()` | all keys as `[String]` |

### Filesystem

| Function | Description |
|----------|-------------|
| `fs_read(path)` | read file to string; `""` on error |
| `fs_write(path, content)` | write string; returns `1` on success |
| `fs_write_bytes(path, bytes, length)` | write raw bytes of known length |
| `fs_list(path)` | list directory entries |
| `fs_exists(path)` | check if path exists |
| `fs_mkdir(path)` | mkdir -p |

### HTTP Client

| Function | Description |
|----------|-------------|
| `http_get(url)` | GET; returns body string |
| `http_post(url, body)` | POST; returns body string |
| `http_post_json(url, json_body)` | POST with Content-Type: application/json |
| `http_get_with_headers(url, headers_map)` | GET with custom headers |
| `http_post_with_headers(url, body, headers_map)` | POST with custom headers |
| `http_post_form_auth(url, form_body, auth_header)` | POST with auth |
| `http_delete(url)` | DELETE |
| `http_get_to_file(url, headers_map, output_path)` | stream response to file |
| `http_post_to_file(url, body, headers_map, output_path)` | stream POST response to file |
| `http_response(status, headers_json, body)` | build response envelope |
| `url_encode(s)` | RFC 3986 percent-encoding |
| `url_decode(s)` | URL decode |
| `el_html_sanitize(html, allowlist_json)` | allowlist HTML sanitizer |

### HTTP Server

| Function | Description |
|----------|-------------|
| `http_serve(port, handler)` | start server; handler: `(method, path, body) -> String` |
| `http_serve_v2(port, handler)` | start server; handler: `(method, path, headers_map, body) -> String` |
| `http_set_handler(name)` | set handler by symbol name |
| `http_set_handler_v2(name)` | v2 variant |

### JSON

| Function | Description |
|----------|-------------|
| `json_get(json, key)` | substring lookup of `"key": value` |
| `json_parse(s)` | parse JSON string to List/Map |
| `json_stringify(v)` | serialize Any to JSON string |
| `json_get_string(j, key)` | typed extract: String |
| `json_get_int(j, key)` | typed extract: Int |
| `json_get_float(j, key)` | typed extract: Float |
| `json_get_bool(j, key)` | typed extract: Bool |
| `json_get_raw(j, key)` | extract nested object/array as JSON string |
| `json_set(j, key, value)` | update field, return new JSON string |
| `json_array_len(j)` | length of JSON array string |
| `json_array_get(j, index)` | element at index |
| `json_array_get_string(j, index)` | string element at index |

### Time (Epoch-Based)

| Function | Description |
|----------|-------------|
| `time_now()` | Unix epoch milliseconds |
| `time_now_utc()` | same, explicit UTC |
| `time_format(ts, fmt)` | format timestamp |
| `time_to_parts(ts)` | decompose to Map of fields |
| `time_from_parts(secs, ns, tz)` | construct timestamp |
| `time_add(ts, n, unit)` | add duration |
| `time_diff(ts1, ts2, unit)` | difference |
| `unix_timestamp()` | Unix seconds as Int |
| `sleep_secs(secs)` | sleep N seconds |
| `sleep_ms(ms)` | sleep N milliseconds |

### Time (First-Class Instant/Duration)

| Function | Description |
|----------|-------------|
| `now()` / `el_now_instant()` | current time as Instant (nanoseconds) |
| `unix_seconds(n)` | construct Instant from Unix seconds |
| `unix_millis(n)` | construct Instant from Unix milliseconds |
| `instant_from_iso8601(s)` | parse ISO 8601 string |
| `instant_to_unix_seconds(i)` | extract Unix seconds |
| `instant_to_unix_millis(i)` | extract Unix milliseconds |
| `instant_to_iso8601(i)` | format as ISO 8601 |
| `el_duration_from_nanos(ns)` | construct Duration from nanoseconds |
| `duration_seconds(n)` | Duration from seconds |
| `duration_millis(n)` | Duration from milliseconds |
| `duration_nanos(n)` | Duration from nanoseconds |
| `duration_to_seconds(d)` | extract seconds |
| `duration_to_millis(d)` | extract milliseconds |
| `duration_to_nanos(d)` | extract nanoseconds |
| `el_instant_add_dur(inst, dur)` | Instant + Duration |
| `el_instant_sub_dur(inst, dur)` | Instant - Duration |
| `el_instant_diff(a, b)` | Instant - Instant = Duration |
| `el_duration_add/sub/scale/div` | Duration arithmetic |
| `el_instant_lt/le/gt/ge/eq/ne` | Instant comparison |
| `el_duration_lt/le/gt/ge/eq/ne` | Duration comparison |
| `el_sleep_duration(dur)` | sleep for a Duration |
| `ttl_cache_set(key, value)` | store with TTL |
| `ttl_cache_get(key, max_age)` | retrieve if within max_age |
| `ttl_cache_age(key)` | age of cached value as Duration |

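
The three `ttl_cache_*` rows imply that freshness is decided at read time, since `ttl_cache_set` takes no expiry argument. A Python sketch of that reading follows. It is a guess at the semantics, not the runtime's implementation: the `TtlCache` class, the injectable clock, and the choice to return `None` for a missing or stale entry are all assumptions.

```python
import time


class TtlCache:
    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock       # returns nanoseconds; injectable for testing
        self._entries = {}        # key -> (value, stored_at_ns)

    def set(self, key, value):
        """Store the value together with the time it was written."""
        self._entries[key] = (value, self._clock())

    def age(self, key):
        """Age in nanoseconds, or None if the key is absent (assumed behaviour)."""
        entry = self._entries.get(key)
        return None if entry is None else self._clock() - entry[1]

    def get(self, key, max_age_ns):
        """Return the value only if it is no older than max_age_ns."""
        age = self.age(key)
        if age is None or age > max_age_ns:
            return None
        return self._entries[key][0]


fake_now = [0]
cache = TtlCache(clock=lambda: fake_now[0])
cache.set("k", "v")
fake_now[0] = 5
fresh = cache.get("k", 10)
fake_now[0] = 20
stale = cache.get("k", 10)
print(fresh, stale)
```

Deciding freshness in `get` lets different callers apply different `max_age` values to the same stored entry.
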
### Calendar System

| Function | Description |
|----------|-------------|
| `zone(id)` | IANA zone or fixed offset |
| `zone_utc()` / `zone_local()` | UTC and local zone |
| `zone_offset(hours, minutes)` | fixed offset zone |
| `earth_calendar(z)` | Gregorian calendar in zone |
| `earth_calendar_default()` | system default |
| `mars_calendar()` / `cycle_calendar(period)` | non-Earth calendars |
| `no_cycle_calendar()` / `relative_calendar(epoch)` | abstract calendars |
| `now_in(cal)` | current time as CalendarTime |
| `in_calendar(inst, cal)` | project Instant into Calendar |
| `cal_format(ct, pattern)` | format CalendarTime |
| `cal_to_instant(ct)` | extract underlying Instant |
| `cal_cycle_phase(ct)` / `cal_in(ct, cal)` | calendar ops |
| `local_date(y, m, d)` | construct LocalDate |
| `local_time(h, m, s, ns)` | construct LocalTime |
| `local_datetime(date, time)` | construct LocalDateTime |
| `zoned(date, time, cal)` | zoned datetime |
| `local_date_year/month/day` | LocalDate accessors |
| `local_time_hour/minute/second/nanos` | LocalTime accessors |
| `el_local_date_add_dur` / `el_local_time_add_dur` | date/time arithmetic |
| `el_local_date_lt` / `el_local_date_eq` | date comparison |
| `rhythm_*` | recurrence patterns (cycle_start, weekday, weekly_at, next_after, matches, …) |

### Process / Execution

| Function | Description |
|----------|-------------|
| `args()` | command-line arguments as `[String]` (excludes argv[0]) |
| `env(key)` | read environment variable; `""` if unset |
| `exit(code)` | exit process with code |
| `exit_program(code)` | alias for `exit` |
| `getpid_now()` | current process ID |
| `exec_command(cmd)` | run shell command; return exit code |
| `exec_capture(cmd)` | run shell command; capture and return stdout |
| `uuid_new()` / `uuid_v4()` | generate UUID v4 |
| `native_int_to_str(n)` | format integer (alias, used in compiler source) |
| `native_string_chars(s)` | split string into `[String]` of single characters |

### Crypto

| Function | Description |
|----------|-------------|
| `sha256_hex(input)` | SHA-256, hex output |
| `sha256_bytes(input)` | SHA-256, raw bytes |
| `hmac_sha256_hex(key, msg)` | HMAC-SHA-256, hex |
| `hmac_sha256_bytes(key, msg)` | HMAC-SHA-256, raw bytes |
| `base64_encode(input)` / `base64_decode(input)` | standard base64 |
| `base64url_encode(input)` / `base64url_decode(input)` | URL-safe base64 |
| `sha3_256_hex(input)` | SHA3-256 (Keccak) |
| `pq_keygen_signature()` | Dilithium-3 key pair |
| `pq_sign(sk_hex, msg)` / `pq_verify(pk_hex, msg, sig_hex)` | PQ signatures |
| `pq_kem_keygen()` / `pq_kem_encaps(pk)` / `pq_kem_decaps(sk, ct)` | Kyber-768 KEM |
| `pq_hybrid_keygen()` / `pq_hybrid_handshake(remote_pub)` | X25519 + Kyber hybrid |
| `aead_encrypt(key_hex, plaintext)` | AES-256-GCM encrypt |
| `aead_decrypt(key_hex, nonce_hex, ct_hex)` | AES-256-GCM decrypt |

### DHARMA Network (CGI programs only)
|
||||
|
||||
| Function | Description |
|----------|-------------|
| `el_cgi_init(name, dharma_id, principal, network, engram)` | initialize CGI identity (called by generated `main()`) |
| `dharma_connect(cgi_id)` | open channel to peer |
| `dharma_send(channel, content)` | send message; blocks for response |
| `dharma_activate(query)` | spreading activation across DHARMA network |
| `dharma_emit(event_type, payload)` | emit network event (@manager only) |
| `dharma_field(event_type)` | wait for event (@manager only) |
| `dharma_strengthen(cgi_id, weight)` | Hebbian potentiation |
| `dharma_relationship(cgi_id)` | current relationship weight |
| `dharma_peers()` | all connected peers sorted by weight |
|
||||
|
||||
### Engram Knowledge Graph
|
||||
|
||||
| Function | Description |
|----------|-------------|
| `engram_node(content, type, salience)` | create node; returns ID |
| `engram_node_full(content, type, label, salience, importance, confidence, tier, tags)` | full node creation |
| `engram_node_layered(…, layer_id)` | create node in specific layer |
| `engram_get_node(id)` | retrieve node by ID |
| `engram_strengthen(node_id)` | Hebbian potentiation |
| `engram_forget(node_id)` | delete node and edges |
| `engram_node_count()` | total node count |
| `engram_edge_count()` | total edge count |
| `engram_search(query, limit)` | full-text search |
| `engram_scan_nodes(limit, offset)` | paginated node scan |
| `engram_connect(from, to, weight, relation)` | create directed edge |
| `engram_edge_between(from, to)` | get edge |
| `engram_neighbors(node_id)` | BFS neighbors |
| `engram_neighbors_filtered(node_id, max_depth, direction)` | filtered BFS |
| `engram_activate(query, depth)` | spreading activation |
| `engram_save(path)` / `engram_load(path)` | snapshot to/from disk |
| `engram_add_layer(name, priority, suppressible, transparent, injectable)` | add consciousness layer |
| `engram_remove_layer(layer_id)` / `engram_list_layers()` | layer management |
| `engram_*_json` variants | JSON-string versions of search/scan/activate |
| `engram_compile_layered_json(intent, depth)` | prompt-ready context block |
|
||||
|
||||
### LLM (Anthropic API)
|
||||
|
||||
| Function | Description |
|----------|-------------|
| `llm_call(model, prompt)` | single-turn call |
| `llm_call_system(model, system, user)` | call with system prompt |
| `llm_call_agentic(model, system, user, tools)` | agentic call with tools (CGI only) |
| `llm_vision(model, system, prompt, image)` | vision call |
| `llm_models()` | list available models |
| `llm_register_tool(name, handler_fn_name)` | register tool handler (CGI only) |
|
||||
|
||||
### Observability
|
||||
|
||||
| Function | Description |
|----------|-------------|
| `emit_log(level, msg, fields_json)` | emit OTLP log |
| `emit_metric(name, value, tags_json)` | emit OTLP metric |
| `trace_span_start(name)` | start trace span |
| `trace_span_end(span_handle)` | end trace span |
| `emit_event(name, duration_ms)` | emit event |
|
||||
|
||||
---
|
||||
|
||||
## 4. How to Re-Bootstrap from Zero
|
||||
|
||||
This section assumes the bootstrap binary is gone. Everything else (source files, runtime) is intact.
|
||||
|
||||
### What You Need to Implement
|
||||
|
||||
A minimal El compiler has three parts: lexer, parser, codegen. Each can be written in any language. The goal is to compile `elc-cli.el` into a working `elc` binary, after which El is self-hosting again.
|
||||
|
||||
### Step 1: Write a Minimal Lexer
|
||||
|
||||
The lexer must produce a list of `{ "kind": String, "value": String }` maps (or equivalent structures). Required token kinds: `Int`, `Float`, `Str`, `Bool`, `Ident`, `Eof`, and all keywords and operators listed in section 2.1.
|
||||
|
||||
The minimal subset needed to compile the compiler itself:
|
||||
- Keywords: `let`, `fn`, `return`, `if`, `else`, `while`, `for`, `in`, `import`, `from`, `true`, `false`, `extern`
|
||||
- Literals: `Int`, `Str`, `Bool`, `Ident`
|
||||
- Operators: `=`, `==`, `!=`, `!`, `<`, `>`, `<=`, `>=`, `&&`, `||`, `+`, `-`, `*`, `/`, `->`, `=>`, `:`, `,`, `.`, `(`, `)`, `{`, `}`, `[`, `]`, `@`, `?`
|
||||
- Special: `Eof`
|
||||
|
||||
The lexer in `lexer.el` walks a char array using `native_list_get` to avoid O(n²) string slicing. A Python implementation can use a simple index into a string. Escapes to handle: `\"`, `\n`, `\t`, `\r`, `\\`.
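The index-into-a-string approach can be sketched in Python as follows. This is a minimal illustration, not the lexer from `lexer.el`: the token kind names `Kw` and `Op` are hypothetical stand-ins for the kinds listed in section 2.1, floats and duration literals are left out, and `//` line comments are assumed from the compiler source.

```python
KEYWORDS = {"let", "fn", "return", "if", "else", "while", "for", "in",
            "import", "from", "extern"}
TWO_CHAR_OPS = {"==", "!=", "<=", ">=", "&&", "||", "->", "=>"}
ONE_CHAR_OPS = set("=!<>+-*/:,.(){}[]@?")
ESCAPES = {'"': '"', "n": "\n", "t": "\t", "r": "\r", "\\": "\\"}


def lex(source):
    """Return a list of {"kind", "value"} dicts ending with an Eof token."""
    tokens = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1
        elif source.startswith("//", i):      # line comment: skip to end of line
            while i < n and source[i] != "\n":
                i += 1
        elif ch.isdigit():
            start = i
            while i < n and source[i].isdigit():
                i += 1
            tokens.append({"kind": "Int", "value": source[start:i]})
        elif ch.isalpha() or ch == "_":
            start = i
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            word = source[start:i]
            if word in ("true", "false"):
                kind = "Bool"
            elif word in KEYWORDS:
                kind = "Kw"                    # hypothetical kind name
            else:
                kind = "Ident"
            tokens.append({"kind": kind, "value": word})
        elif ch == '"':
            i += 1
            chars = []
            while i < n and source[i] != '"':
                if source[i] == "\\" and i + 1 < n:
                    # Decode the escapes listed above; unknown ones pass through.
                    chars.append(ESCAPES.get(source[i + 1], source[i + 1]))
                    i += 2
                else:
                    chars.append(source[i])
                    i += 1
            i += 1                             # skip the closing quote
            tokens.append({"kind": "Str", "value": "".join(chars)})
        elif source[i:i + 2] in TWO_CHAR_OPS:
            tokens.append({"kind": "Op", "value": source[i:i + 2]})   # hypothetical kind
            i += 2
        elif ch in ONE_CHAR_OPS:
            tokens.append({"kind": "Op", "value": ch})
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    tokens.append({"kind": "Eof", "value": ""})
    return tokens
```

Two-character operators are checked before single-character ones so that `==` is not split into two `=` tokens.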
|
||||
|
||||
### Step 2: Write a Minimal Parser
|
||||
|
||||
The parser is a standard recursive descent parser. It produces AST maps as described in section 2.2.
|
||||
|
||||
The minimal statement forms needed to compile the compiler:
|
||||
- `let name [: Type] = expr`
|
||||
- `fn name(params) [-> Type] { body }`
|
||||
- `extern fn name(params) [-> Type]`
|
||||
- `return expr`
|
||||
- `while cond { body }`
|
||||
- `for item in list { body }`
|
||||
- `if cond { body } [else [if] { body }]`
|
||||
- `import "path"`
|
||||
- `from module import { … }`
|
||||
- `@decorator stmt`
|
||||
- `name = expr` (bare assignment)
|
||||
- bare expression statement
|
||||
|
||||
The minimal expression forms:
|
||||
- Integer, float, string, bool literals
|
||||
- Identifier
|
||||
- Binary operations with the precedence table from section 2.2
|
||||
- Unary `!` and `-`
|
||||
- Function call: `f(a, b, …)`
|
||||
- Method call: `obj.method(args)` (parsed as Call with Field func)
|
||||
- Field access: `obj.field`
|
||||
- Index access: `obj[i]`
|
||||
- Array literal: `[e1, e2, …]`
|
||||
- Map literal: `{ "key": value, … }`
|
||||
- `if` as expression
|
||||
- `match` expression
|
||||
- Postfix `?` (can be a no-op)
|
||||
- Duration literal: `N.unit`
|
||||
|
||||
The `__no_block_expr` guard (section 2.4) is important: without it, `if a || b { ... }` will incorrectly parse `{` as a Map literal.
|
||||
|
||||
### Step 3: Write a Minimal Codegen
|
||||
|
||||
The codegen emits C11 source. Required output structure:
|
||||
|
||||
```c
#include <stdint.h>
#include <stdlib.h>
#include "el_runtime.h"

// Forward declarations for all non-main functions
el_val_t fn_name(el_val_t p1, el_val_t p2);
...

// File-scope let bindings (if any)
el_val_t GLOBAL_NAME;

// Function bodies
el_val_t fn_name(el_val_t p1, el_val_t p2) {
    ...
    return 0;
}

// Entry point
int main(int _argc, char** _argv) {
    el_runtime_init_args(_argc, _argv);
    ...
    return 0;
}
```
|
||||
|
||||
Critical codegen rules:
|
||||
|
||||
1. **All values are `el_val_t`**. Every parameter, local variable, and return type is `el_val_t` unless the function has `ret_type == "Void"` (use `void`).
|
||||
|
||||
2. **Let-rebinding**: track declared names per C scope. Emit `el_val_t name = val;` on first occurrence; emit `name = val;` on subsequent occurrences of the same name in the same scope.
|
||||
|
||||
3. **`+` dispatch**: if either operand is a string literal → `el_str_concat(a, b)`. If both are provably integers → `(a + b)`. Default fallback → `el_str_concat`.
|
||||
|
||||
4. **`==` dispatch**: if both operands are integer literals or provably Int (for example, identifiers tracked in `__int_names`) → `(a == b)`. Otherwise, if either operand is a string or an identifier not known to be Int → `str_eq(a, b)`.
|
||||
|
||||
5. **String literals**: wrap in `EL_STR("…")` and escape: `\"` → `\\\"`, `\n` → `\\n`, `\r` → `\\r`, `\t` → `\\t`, `\\` → `\\\\`.
|
||||
|
||||
6. **Map literals**: `el_map_new(N, "k1", v1, "k2", v2, …)`. Empty map: `el_map_new(0)`.
|
||||
|
||||
7. **Array literals**: `el_list_new(N, e1, e2, …)`. Empty: `el_list_empty()`.
|
||||
|
||||
8. **Index access**: string-literal index → `el_get_field(obj, EL_STR("key"))`. Integer index → `el_list_get(obj, idx)`.
|
||||
|
||||
9. **Field access** `obj.field` → `el_get_field(obj, EL_STR("field"))`.
|
||||
|
||||
10. **Method call** `obj.method(args)` → `method(obj, args)`.
|
||||
|
||||
11. **`for item in list`** → emit:
|
||||
    ```c
    { el_val_t _el_lst = <list>; el_val_t _el_len = el_list_len(_el_lst);
      for (el_val_t _el_i = 0; _el_i < _el_len; _el_i++) {
        el_val_t item = el_list_get(_el_lst, _el_i);
        <body>
      }
    }
    ```
|
||||
|
||||
12. **`match`** → GCC/Clang statement expression with `goto`:
|
||||
    ```c
    ({ el_val_t _s = <subject>; el_val_t _r = 0;
       if (_s == 42) { _r = <arm_body>; goto _done; }
       if (str_eq(_s, EL_STR("str"))) { _r = <arm_body>; goto _done; }
       { _r = <wildcard_body>; goto _done; }
       _done:; _r; })
    ```
|
||||
|
||||
13. **`if` as expression** → similarly wrapped in a GCC/Clang statement expression.
|
||||
|
||||
14. **Implicit return**: if the last statement in a function body is a bare `Expr` (not `If` or `For`), emit it as `return <expr>;` instead of `<expr>;`.
|
||||
|
||||
15. **Float literals**: emit as `el_from_float(<value>)`.
|
||||
|
||||
16. **Bool literals**: `true` → `1`, `false` → `0`.
|
||||
|
||||
17. **`fn main()`**: do not emit as a regular `el_val_t` function. Instead, fold its body into C's `int main()` after any top-level statements.
|
||||
|
||||
18. **`extern fn`**: emit only a forward declaration (no body).
|
||||
|
||||
19. **Forward declarations**: scan for all `FnDef` nodes before emitting bodies. This enables mutual recursion.
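Rules 5, 6, 7, 15 and 16 (literals) can be sketched in Python as below. This is an illustration under assumptions, not the code in `codegen.el`: the node shape (`"expr"` as the kind tag, `"elems"`, `"pairs"`) follows what the JS backend reads, but the `"value"` field name on literal nodes is assumed, and only ASCII input is handled.

```python
C_ESCAPES = {'"': '\\"', "\\": "\\\\", "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def c_escape(text):
    """Escape text for use inside a C string literal (ASCII input assumed)."""
    return "".join(C_ESCAPES.get(ch, ch) for ch in text)


def cg_literal(expr):
    """Emit C for a literal node; the "value" field name is an assumption."""
    kind = expr["expr"]
    if kind == "Int":
        return str(expr["value"])
    if kind == "Bool":                       # rule 16: true -> 1, false -> 0
        return "1" if expr["value"] == "true" else "0"
    if kind == "Float":                      # rule 15
        return "el_from_float(" + str(expr["value"]) + ")"
    if kind == "Str":                        # rule 5
        return 'EL_STR("' + c_escape(expr["value"]) + '")'
    if kind == "Array":                      # rule 7
        elems = [cg_literal(e) for e in expr["elems"]]
        if not elems:
            return "el_list_empty()"
        return "el_list_new(" + ", ".join([str(len(elems))] + elems) + ")"
    if kind == "Map":                        # rule 6: keys are plain C strings
        args = []
        for pair in expr["pairs"]:
            args.append('"' + c_escape(pair["key"]) + '"')
            args.append(cg_literal(pair["value"]))
        return "el_map_new(" + ", ".join([str(len(expr["pairs"]))] + args) + ")"
    raise ValueError("not a literal: " + kind)
```

An empty `pairs` list falls through to `el_map_new(0)`, matching the empty-map case in rule 6.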
|
||||
|
||||
### Step 4: Compile the El Compiler
|
||||
|
||||
Using your minimal implementation, compile `elc-cli.el` (which imports the entire compiler chain):
|
||||
|
||||
```bash
# Your minimal compiler
python3 minimal_elc.py elc-cli.el > elc-new.c

# Build with the runtime (libraries are listed after the sources so that
# linkers which resolve symbols left to right can find them)
cc -std=c11 -I el-compiler/runtime \
   -o elc-new elc-new.c el-compiler/runtime/el_runtime.c -lcurl -lpthread
```
|
||||
|
||||
### Step 5: Verify Self-Hosting
|
||||
|
||||
```bash
# Compile elc-cli.el with the new compiler
./elc-new elc-cli.el elc-v2.c
cc -std=c11 -I el-compiler/runtime \
   -o elc-v2 elc-v2.c el-compiler/runtime/el_runtime.c -lcurl -lpthread

# Compile again with the second-generation compiler
./elc-v2 elc-cli.el elc-v3.c

# The outputs should be identical
diff elc-v2.c elc-v3.c
```
|
||||
|
||||
A clean diff confirms you have a stable fixed point: the compiler reproduces itself exactly.
|
||||
|
||||
### Step 6: Replace the Bootstrap Binary
|
||||
|
||||
```bash
cp elc-v2 dist/platform/elc
```
|
||||
|
||||
You are bootstrapped.
|
||||
|
||||
### Minimal El Subset for the Compiler Itself
|
||||
|
||||
The El compiler source (`lexer.el`, `parser.el`, `codegen.el`, `compiler.el`) uses:
|
||||
- `fn`, `let`, `while`, `if`/`else`, `return`, `for`/`in`, `import`
|
||||
- `extern fn` (for `.elh` headers)
|
||||
- `String`, `Int`, `Bool`, `Void`, `Any`, `Map<String, Any>`, `[String]`, `[Map<String, Any>]`
|
||||
- Map literals `{ "key": val }`
|
||||
- Array literals `[...]` (and `native_list_empty()`)
|
||||
- List operations: `native_list_empty()`, `native_list_append()`, `native_list_get()`, `native_list_len()`, `native_list_clone()`
|
||||
- String operations: `str_join()`, `str_eq()`, `str_contains()`, `str_starts_with()`, `str_slice()`, `str_trim()`, `str_split()`, `str_index_of()`, `str_len()`, `str_to_int()`, `native_string_chars()`, `native_int_to_str()`
|
||||
- `state_get()`, `state_set()`
|
||||
- `println()`, `fs_read()`, `fs_write()`, `exit()`
|
||||
- `el_release()` (ARC cleanup)
|
||||
|
||||
The compiler does not use: HTTP, engram, dharma, LLM, crypto, UUID, float arithmetic.
|
||||
|
||||
---
|
||||
|
||||
## 5. The Long-Term Solution: elvm
|
||||
|
||||
### Why a VM Makes Bootstrapping More Auditable
|
||||
|
||||
The current bootstrap chain relies on trusting a seed binary that cannot be fully audited by reading the source alone: nothing proves the binary was built from exactly that source. This is the classic "trusting trust" problem (Ken Thompson, 1984). A virtual machine breaks the chain:
|
||||
|
||||
- `elc` targets `elvm` bytecode (instead of C)
|
||||
- `elvm` is a minimal interpreter hand-written in ~500 lines of C
|
||||
- The hand-written C is small enough to audit completely
|
||||
- Anyone can compile `elvm.c` with any C compiler
|
||||
- From there: `elvm` interprets `elc.elvm` → `elc` compiles El → `cc` builds native binaries
|
||||
|
||||
The benefit: the trusted base shrinks from "a Mach-O binary" to "500 lines of straightforward C code that anyone can read in an afternoon."
|
||||
|
||||
### The elvm Design
|
||||
|
||||
A minimal elvm needs:
|
||||
- A stack or register machine (stack is simpler)
|
||||
- Instructions: push, pop, add, sub, mul, div, cmp, jump, call, return, load, store
|
||||
- A string table (El strings are mostly literals)
|
||||
- A heap for ElList and ElMap
|
||||
- An FFI table mapping El runtime builtins to C functions
|
||||
|
||||
The El compiler would gain a `--target=elvm` flag in `compile_dispatch()`. Codegen would emit bytecode instead of C text. The runtime interface stays the same — builtins map to FFI slots by name.
|
||||
|
||||
This is the planned path. It does not exist yet.
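To make the instruction list above concrete, here is a toy stack machine in Python. It is only a sketch of the idea: the planned `elvm` is meant to be hand-written C, and every detail here (tuple-encoded instructions, the opcode names `lt`, `jump_if_zero`, `ffi`, named variable slots) is a hypothetical choice for illustration.

```python
def run(program, ffi=None):
    """Execute a list of (opcode, *operands) tuples and return the final stack."""
    ffi = ffi or {}
    stack, slots, pc = [], {}, 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "pop":
            stack.pop()
        elif op in ("add", "sub", "mul", "div", "lt"):
            b = stack.pop()
            a = stack.pop()
            if op == "add":
                stack.append(a + b)
            elif op == "sub":
                stack.append(a - b)
            elif op == "mul":
                stack.append(a * b)
            elif op == "div":
                stack.append(a // b)
            else:                              # "lt": push 1 or 0
                stack.append(1 if a < b else 0)
        elif op == "load":
            stack.append(slots[args[0]])
        elif op == "store":
            slots[args[0]] = stack.pop()
        elif op == "jump":
            pc = args[0]
        elif op == "jump_if_zero":
            if stack.pop() == 0:
                pc = args[0]
        elif op == "ffi":                      # builtin looked up by name
            name, argc = args
            call_args = [stack.pop() for _ in range(argc)][::-1]
            stack.append(ffi[name](*call_args))
        elif op == "halt":
            break
        else:
            raise ValueError("unknown opcode: " + op)
    return stack


# total = 0; i = 1; while i < 5 { total = total + i; i = i + 1 }
SUM_TO_FOUR = [
    ("push", 0), ("store", "total"), ("push", 1), ("store", "i"),
    ("load", "i"), ("push", 5), ("lt",), ("jump_if_zero", 17),
    ("load", "total"), ("load", "i"), ("add",), ("store", "total"),
    ("load", "i"), ("push", 1), ("add",), ("store", "i"), ("jump", 4),
    ("load", "total"), ("halt",),
]
```

The `ffi` opcode mirrors the "builtins map to FFI slots by name" idea: the machine itself knows nothing about strings, lists, or maps beyond passing values to the named function.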
|
||||
|
||||
---
|
||||
|
||||
## 6. Compiler Source Map
|
||||
|
||||
| File | Role | Lines |
|------|------|-------|
| `elc-cli.el` | Entry point; imports compiler.el | 7 |
| `el-compiler/src/compiler.el` | Pipeline wiring: lex → parse → codegen. Import resolution, `--emit-header`, `fn main()`. Defines `compile()`, `compile_js()`, `compile_dispatch()`, `resolve_imports()` | 298 |
| `el-compiler/src/lexer.el` | Tokenizer. `lex(source)` → token list. Char helpers, keyword lookup, scan_digits, scan_ident, scan_string, strip_code_comments | 747 |
| `el-compiler/src/parser.el` | Recursive descent parser. `parse(tokens)` → AST. All statement and expression forms | 1071 |
| `el-compiler/src/codegen.el` | C code emitter. `codegen(stmts, source)` → (streams to stdout). Expression codegen, statement codegen, function codegen, type tracking, capability enforcement, temporal type dispatch | 2721 |
| `el-compiler/src/codegen-js.el` | JavaScript backend. `codegen_js(stmts, source)` → JS source | ~500 |
| `el-compiler/runtime/el_runtime.h` | Full runtime API declaration | 755 |
| `el-compiler/runtime/el_runtime.c` | Full runtime implementation | large |
| `el-compiler/runtime/el_runtime.js` | JS runtime | - |
| `elb.el` | Build coordinator. Reads `manifest.el`, walks import graph, compiles modules, links binary. Follows a `.NET`-style incremental build model | 367 |
| `elc-combined.el` | Pre-merged single-file bootstrap edition (for early bootstrap iterations) | large |
| `spec/language.md` | Language specification v1.2.0 | - |
| `dist/platform/elc` | Current bootstrap binary (Mach-O arm64) | - |
|
||||
|
||||
---
|
||||
|
||||
## 7. Key Decisions and Gotchas
|
||||
|
||||
### `target` is a Reserved Keyword
|
||||
|
||||
`target` is lexed as the `Target` token kind. It cannot be used as a variable or parameter name anywhere in El source. If you write `fn compile(target: String)`, the parameter name will be tokenized as `Target`, which the parser does not recognize as an `Ident` in parameter position.
|
||||
|
||||
**Workaround:** use `tgt`, `dest`, `backend`, or any other name. The compiler source uses `tgt` specifically for this reason. This comes up whenever writing code that handles compilation targets.
|
||||
|
||||
### `let x = x + 1` is Let-Rebinding, Not Mutation
|
||||
|
||||
El has no mutable variables. `let count = count + 1` re-introduces `count` into the current scope, shadowing the previous binding. At the C level, the codegen tracks declared names and emits plain assignment for subsequent bindings of the same name:
|
||||
|
||||
- First `let count = 0` → `el_val_t count = 0;`
|
||||
- Second `let count = count + 1` → `count = count + 1;`
|
||||
|
||||
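This means you cannot have two different values named `count` in the same C scope; the second binding overwrites the first. This is by design. Scoped shadowing works correctly because each block (if body, while body, for body) gets its own copy of the `declared` list: a name first bound inside a block is declared in that block's C scope, while a name already declared in an enclosing scope is assigned in place, which is how `let i = i + 1` inside a `while` body advances the outer counter.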
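The bookkeeping can be sketched in a few lines of Python. The helper names `emit_let` and `emit_block` are hypothetical, and the sketch assumes the `declared` list is a plain list of names that is copied on entry to each block.

```python
def emit_let(name, value_c, declared):
    """Emit a C declaration on first sight of a name, plain assignment after that."""
    if name in declared:
        return f"{name} = {value_c};"
    declared.append(name)
    return f"el_val_t {name} = {value_c};"


def emit_block(lets, declared):
    """Emit the lets of a nested block using a private copy of `declared`."""
    inner = list(declared)        # names declared here stay local to the block
    return [emit_let(name, value_c, inner) for name, value_c in lets]
```

Because the block works on a copy, a name introduced inside an `if` or `while` body never leaks into the enclosing scope's list, so a later `let` of the same name outside the block still gets a proper C declaration.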
|
||||
|
||||
### Arena is Inactive in CLI Mode
|
||||
|
||||
The runtime includes an arena allocator designed for long-running server processes. In CLI mode (`elc`, `elb`) the arena is not activated. Memory is managed by ARC (reference counting via `el_retain`/`el_release`). The compiler source explicitly calls `el_release(tokens)` after parsing and `el_release(stmt)` after codegen to prevent memory exhaustion on large source files.
|
||||
|
||||
If you are implementing a new runtime or embedding El, be aware that the ARC model expects callers to release values they are done with.
|
||||
|
||||
### The `extern fn` / `.elh` Separate Compilation Model
|
||||
|
||||
`elb` (the build coordinator) supports separate compilation. When a module changes:
|
||||
1. `elc --emit-header module.el module.c` compiles the module and writes `module.elh`
|
||||
2. `module.elh` contains `extern fn` declarations for all public functions
|
||||
3. Other modules that import `module.el` use the `.elh` header instead of re-parsing the source
|
||||
|
||||
The `resolve_imports` function in `compiler.el` checks for a `.elh` file before recursively inlining the `.el` source. If the header exists, it is used (and the `.el` is marked as seen to prevent double-inclusion).
|
||||
|
||||
This is important for bootstrap: if you have pre-compiled headers lying around from a broken build, they may shadow updated source. Delete `.elh` files (or use `elb --clean`) when debugging unexpected compilation behavior.
|
||||
|
||||
### Import Resolution: Depth-First with Deduplication
|
||||
|
||||
`resolve_imports` in `compiler.el`:
|
||||
|
||||
1. Walks imports depth-first (dependencies before dependents)
|
||||
2. Uses `state_set("__elc_imp__:" + path, "1")` to deduplicate: each file is included exactly once
|
||||
3. Builds the combined source string by concatenating import bodies ahead of the entry file's body
|
||||
4. If a `.elh` header exists for an import, uses that instead of recursing into the `.el`
|
||||
|
||||
The result is one large string that gets passed through `lex` → `parse` → `codegen` as a single unit. The codegen emits forward declarations for all functions before any body, so declaration order within the combined source does not matter.
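A rough Python model of this walk is shown below. It is a sketch under assumptions, not a port of `resolve_imports`: files live in an in-memory dict instead of on disk, only the `import "path"` form is recognised, paths are used verbatim with no relative resolution, and a Python set stands in for the `state_set` keys.

```python
import re

IMPORT_RE = re.compile(r'^\s*import\s+"([^"]+)"\s*$')


def resolve_imports(path, files, seen=None):
    """Return the combined source for `path`, dependencies first, each file once."""
    seen = set() if seen is None else seen
    seen.add(path)
    chunks, body = [], []
    for line in files[path].splitlines():
        match = IMPORT_RE.match(line)
        if not match:
            body.append(line)
            continue
        dep = match.group(1)
        if dep in seen:                        # already included: skip
            continue
        header = dep[:-3] + ".elh" if dep.endswith(".el") else dep + ".elh"
        if header in files:
            seen.add(dep)                      # header wins; source is not inlined
            chunks.append(files[header])
        else:
            chunks.append(resolve_imports(dep, files, seen))
    chunks.append("\n".join(body))             # entry body goes last
    return "\n".join(chunks)
```

The second test case below also shows the stale-header hazard: as soon as an `.elh` entry exists, the matching `.el` body is never read.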
|
||||
|
||||
### `+` Operator Dispatch is Heuristic
|
||||
|
||||
El's `+` operator serves double duty: integer addition and string concatenation. The codegen dispatches based on static analysis of the AST:
|
||||
|
||||
- If either operand is a `Str` literal → `el_str_concat`
|
||||
- If both operands are provably `Int` (via `is_int_expr`) → `(a + b)`
|
||||
- If either operand is a `Call` or `Ident` → `el_str_concat` (conservative fallback)
|
||||
|
||||
The `is_int_expr` predicate recurses through the AST: literal `Int`, names in `__int_names` (from `: Int` annotations), known Int-returning builtins, and arithmetic BinOps over Int operands all count as "provably Int."
|
||||
|
||||
If you write `let result = some_int_var + 1` and `some_int_var` is not annotated `: Int`, the codegen may emit `el_str_concat` instead of integer addition. Fix by adding `: Int` to the variable declaration.
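The dispatch can be modelled in Python as follows. This is a sketch, not the code in `codegen.el`: the `BinOp` field names (`"op"`, `"left"`, `"right"`) and the contents of `INT_BUILTINS` are assumptions, while `"expr"`, `"name"` and `"func"` follow the node shape the JS backend reads.

```python
INT_BUILTINS = {"str_len", "native_list_len", "str_to_int"}   # assumed subset


def is_int_expr(expr, int_names):
    """Return True only when the expression is provably an Int."""
    kind = expr["expr"]
    if kind == "Int":
        return True
    if kind == "Ident":
        return expr["name"] in int_names       # names annotated `: Int`
    if kind == "Call":
        func = expr["func"]
        return func["expr"] == "Ident" and func["name"] in INT_BUILTINS
    if kind == "BinOp" and expr["op"] in ("+", "-", "*", "/"):
        return (is_int_expr(expr["left"], int_names)
                and is_int_expr(expr["right"], int_names))
    return False


def cg_plus(left, right, left_c, right_c, int_names):
    """Pick integer addition or string concatenation for `left + right`."""
    if left["expr"] == "Str" or right["expr"] == "Str":
        return f"el_str_concat({left_c}, {right_c})"
    if is_int_expr(left, int_names) and is_int_expr(right, int_names):
        return f"({left_c} + {right_c})"
    return f"el_str_concat({left_c}, {right_c})"   # conservative fallback
```

With an empty `int_names`, `n + 1` falls through to `el_str_concat`; once `n` is recorded as Int, the same expression becomes `(n + 1)`.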
|
||||
|
||||
### `==` Operator Dispatch is Also Heuristic
|
||||
|
||||
Similarly, `==` dispatches between `str_eq(a, b)` (string comparison) and `(a == b)` (integer comparison) based on operand types. The codegen tracks Int-typed names in `__int_names`. Two `Ident` operands where both are known Int-typed use `==`; all other Ident-Ident comparisons use `str_eq`.
|
||||
|
||||
This means comparing two integer variables that were not annotated `: Int` can silently produce `str_eq` on what are actually integer values — and `str_eq` treats them as `const char*` pointers, producing incorrect results or segfaults.
|
||||
|
||||
**Rule:** always annotate variables `: Int` when they will participate in `==` comparisons or `+` arithmetic.
|
||||
|
||||
### Capability Kind Enforcement
|
||||
|
||||
The codegen classifies programs into three capability tiers based on top-level declarations:
|
||||
- `cgi` block present → full capability (all primitives allowed)
|
||||
- `service` block present → restricted (no `llm_call_agentic`, `llm_register_tool`, `dharma_emit`, `dharma_field`)
|
||||
- Neither → `utility` (no DHARMA, no LLM)
|
||||
|
||||
Violations are collected during codegen and emitted as `#error` directives at the bottom of the generated C. The downstream `cc` step then fails with a clear message naming the forbidden call.
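A simplified Python model of the tier check follows. The function names, the prefix test used for the `utility` tier, and the exact wording of the `#error` text are all assumptions made for this sketch; only the tier rules themselves come from the list above.

```python
SERVICE_FORBIDDEN = {"llm_call_agentic", "llm_register_tool",
                     "dharma_emit", "dharma_field"}


def capability_kind(top_level_kinds):
    """Classify a program from the kinds of its top-level declarations."""
    if "cgi" in top_level_kinds:
        return "cgi"
    if "service" in top_level_kinds:
        return "service"
    return "utility"


def is_forbidden(kind, fn_name):
    if kind == "cgi":
        return False
    if kind == "service":
        return fn_name in SERVICE_FORBIDDEN
    # utility: no DHARMA, no LLM (approximated here by name prefix)
    return fn_name.startswith(("dharma_", "llm_"))


def violation_lines(kind, called):
    """Return the #error lines to append to the generated C."""
    return ['#error "' + name + " is not allowed in a " + kind + ' program"'
            for name in called if is_forbidden(kind, name)]
```

Emitting `#error` rather than aborting codegen means every violation in a file is reported in one `cc` run.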
|
||||
|
||||
### The `__no_block_expr` Parse Guard
|
||||
|
||||
When parsing the condition of `if` and `while`, the iterable of `for`, and the subject of `match`, the parser sets `state_set("__no_block_expr", "1")`. This prevents `parse_primary` from treating a `{` as the start of a Map literal; instead it returns `{ "expr": "Nil" }`, and the caller sees the `{` and treats it as the block delimiter.
|
||||
|
||||
Without this guard, `if a || b { ... }` would recurse into `parse_expr` for `b`, hit `{`, try to parse it as a Map literal, fail to find string keys, loop in error-recovery mode, and hang.
|
||||
|
||||
### Codegen Streams Output via `println`
|
||||
|
||||
The codegen does not build the output as a string — it calls `println()` for each line as it is emitted. The `compile()` / `compile_js()` / `codegen()` functions return `""`. Output goes to stdout.
|
||||
|
||||
This design avoids O(n²) string concatenation for large programs. It also means you cannot capture the compiler's output in a variable within El itself — you must redirect stdout at the OS level (`elc source.el > output.c`).
|
||||
|
||||
When writing to a file, `elc` detects the output path argument, redirects C's `stdout` to the file (via `freopen` in the runtime), and the `println` calls go there instead.
|
||||
BIN
Binary file not shown.
Vendored
BIN
Binary file not shown.
@@ -2939,8 +2939,13 @@ static int looks_like_string(el_val_t v) {
|
||||
const unsigned char* s = (const unsigned char*)p;
|
||||
for (int i = 0; i < 16; i++) {
|
||||
unsigned char c = s[i];
|
||||
if (c == '\0') return i > 0; /* terminated string */
|
||||
if (c < 0x09 || (c > 0x0d && c < 0x20) || c >= 0x7f) return 0;
|
||||
if (c == '\0') return 1; /* terminated string (empty string is still a valid string) */
|
||||
/* Reject C0 control chars (non-whitespace), allow UTF-8 high bytes.
|
||||
* 0x09-0x0d = tab/newline/cr/vt/ff (whitespace, OK)
|
||||
* 0x20-0x7e = printable ASCII (OK)
|
||||
* 0x7f = DEL (reject)
|
||||
* 0x80-0xff = UTF-8 continuation/lead bytes (OK for multi-byte chars) */
|
||||
if (c < 0x09 || (c > 0x0d && c < 0x20) || c == 0x7f) return 0;
|
||||
}
|
||||
return 1; /* 16+ printable bytes — call it a string */
|
||||
}
|
||||
|
||||
@@ -23,26 +23,26 @@
|
||||
fn js_escape(s: String) -> String {
|
||||
let chars: [String] = native_string_chars(s)
|
||||
let total: Int = native_list_len(chars)
|
||||
let out = ""
|
||||
let parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
while i < total {
|
||||
let ch: String = native_list_get(chars, i)
|
||||
if ch == "\"" {
|
||||
let out = out + "\\\""
|
||||
let parts = native_list_append(parts, "\\\"")
|
||||
} else {
|
||||
if ch == "\\" {
|
||||
let out = out + "\\\\"
|
||||
let parts = native_list_append(parts, "\\\\")
|
||||
} else {
|
||||
if ch == "\n" {
|
||||
let out = out + "\\n"
|
||||
let parts = native_list_append(parts, "\\n")
|
||||
} else {
|
||||
if ch == "\r" {
|
||||
let out = out + "\\r"
|
||||
let parts = native_list_append(parts, "\\r")
|
||||
} else {
|
||||
if ch == "\t" {
|
||||
let out = out + "\\t"
|
||||
let parts = native_list_append(parts, "\\t")
|
||||
} else {
|
||||
let out = out + ch
|
||||
let parts = native_list_append(parts, ch)
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -50,7 +50,7 @@ fn js_escape(s: String) -> String {
|
||||
}
|
||||
let i = i + 1
|
||||
}
|
||||
out
|
||||
str_join(parts, "")
|
||||
}
|
||||
|
||||
fn js_str_lit(s: String) -> String {
|
||||
@@ -365,17 +365,15 @@ fn js_cg_expr(expr: Map<String, Any>) -> String {
|
||||
let arity: Int = native_list_len(args)
|
||||
let func_kind: String = func["expr"]
|
||||
|
||||
let args_c = ""
|
||||
let args_parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
while i < arity {
|
||||
let arg = native_list_get(args, i)
|
||||
let arg_c: String = js_cg_expr(arg)
|
||||
if i > 0 {
|
||||
let args_c = args_c + ", "
|
||||
}
|
||||
let args_c = args_c + arg_c
|
||||
let args_parts = native_list_append(args_parts, arg_c)
|
||||
let i = i + 1
|
||||
}
|
||||
let args_c: String = str_join(args_parts, ", ")
|
||||
|
||||
if func_kind == "Ident" {
|
||||
let fn_name: String = func["name"]
|
||||
@@ -426,38 +424,32 @@ fn js_cg_expr(expr: Map<String, Any>) -> String {
|
||||
let elems = expr["elems"]
|
||||
let n: Int = native_list_len(elems)
|
||||
if n == 0 { return "[]" }
|
||||
let items = ""
|
||||
let items_parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
while i < n {
|
||||
let elem = native_list_get(elems, i)
|
||||
let elem_c: String = js_cg_expr(elem)
|
||||
if i > 0 {
|
||||
let items = items + ", "
|
||||
}
|
||||
let items = items + elem_c
|
||||
let items_parts = native_list_append(items_parts, elem_c)
|
||||
let i = i + 1
|
||||
}
|
||||
return "[" + items + "]"
|
||||
return "[" + str_join(items_parts, ", ") + "]"
|
||||
}
|
||||
|
||||
if kind == "Map" {
|
||||
let pairs = expr["pairs"]
|
||||
let n: Int = native_list_len(pairs)
|
||||
if n == 0 { return "{}" }
|
||||
let items = ""
|
||||
let items_parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
while i < n {
|
||||
let pair = native_list_get(pairs, i)
|
||||
let key: String = pair["key"]
|
||||
let val = pair["value"]
|
||||
let val_c: String = js_cg_expr(val)
|
||||
if i > 0 {
|
||||
let items = items + ", "
|
||||
}
|
||||
let items = items + js_str_lit(key) + ": " + val_c
|
||||
let items_parts = native_list_append(items_parts, js_str_lit(key) + ": " + val_c)
|
||||
let i = i + 1
|
||||
}
|
||||
return "{" + items + "}"
|
||||
return "{" + str_join(items_parts, ", ") + "}"
|
||||
}
|
||||
|
||||
if kind == "Try" {
|
||||
@@ -505,7 +497,8 @@ fn js_cg_match(expr: Map<String, Any>) -> String {
|
||||
let subj_c: String = js_cg_expr(subject)
|
||||
let id: String = js_next_match_id()
|
||||
let subj_var: String = "_match_subj_" + id
|
||||
let out: String = "((" + subj_var + ") => { "
|
||||
let parts: [String] = native_list_empty()
|
||||
let parts = native_list_append(parts, "((" + subj_var + ") => { ")
|
||||
let n: Int = native_list_len(arms)
|
||||
let i = 0
|
||||
while i < n {
|
||||
@@ -515,28 +508,28 @@ fn js_cg_match(expr: Map<String, Any>) -> String {
|
||||
let pkind: String = pat["pattern"]
|
||||
let body_c: String = js_cg_expr(body)
|
||||
if str_eq(pkind, "Wildcard") {
|
||||
let out = out + "return (" + body_c + "); "
|
||||
let parts = native_list_append(parts, "return (" + body_c + "); ")
|
||||
} else {
|
||||
if str_eq(pkind, "Binding") {
|
||||
let bname: String = pat["name"]
|
||||
let out = out + "{ const " + bname + " = " + subj_var + "; return (" + body_c + "); } "
|
||||
let parts = native_list_append(parts, "{ const " + bname + " = " + subj_var + "; return (" + body_c + "); } ")
|
||||
} else {
|
||||
if str_eq(pkind, "LitInt") {
|
||||
let v: String = pat["value"]
|
||||
let out = out + "if (" + subj_var + " === " + v + ") return (" + body_c + "); "
|
||||
let parts = native_list_append(parts, "if (" + subj_var + " === " + v + ") return (" + body_c + "); ")
|
||||
} else {
|
||||
if str_eq(pkind, "LitStr") {
|
||||
let v: String = pat["value"]
|
||||
let out = out + "if (str_eq(" + subj_var + ", " + js_str_lit(v) + ")) return (" + body_c + "); "
|
||||
let parts = native_list_append(parts, "if (str_eq(" + subj_var + ", " + js_str_lit(v) + ")) return (" + body_c + "); ")
|
||||
} else {
|
||||
if str_eq(pkind, "LitBool") {
|
||||
let v: String = pat["value"]
|
||||
let bv = "false"
|
||||
if str_eq(v, "true") { let bv = "true" }
|
||||
let out = out + "if (" + subj_var + " === " + bv + ") return (" + body_c + "); "
|
||||
let parts = native_list_append(parts, "if (" + subj_var + " === " + bv + ") return (" + body_c + "); ")
|
||||
} else {
|
||||
// unknown pattern → wildcard
|
||||
let out = out + "return (" + body_c + "); "
|
||||
let parts = native_list_append(parts, "return (" + body_c + "); ")
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -544,8 +537,8 @@ fn js_cg_match(expr: Map<String, Any>) -> String {
|
||||
}
|
||||
let i = i + 1
|
||||
}
|
||||
let out = out + "return null; })(" + subj_c + ")"
|
||||
out
|
||||
let parts = native_list_append(parts, "return null; })(" + subj_c + ")")
|
||||
str_join(parts, "")
|
||||
}
|
||||
|
||||
// ── Variable scope tracking ───────────────────────────────────────────────────
|
||||
@@ -696,14 +689,7 @@ fn js_strip_outer_parens(s: String) -> String {
|
||||
let i = i + 1
|
||||
}
|
||||
if balanced {
|
||||
let inner = ""
|
||||
let j = 1
|
||||
while j < n - 1 {
|
||||
let ch: String = native_list_get(chars, j)
|
||||
let inner = inner + ch
|
||||
let j = j + 1
|
||||
}
|
||||
return inner
|
||||
return str_slice(s, 1, n - 1)
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -759,18 +745,15 @@ fn js_cg_stmts(stmts: [Map<String, Any>], indent: String, declared: [String]) ->
|
||||
fn js_params_str(params: [Map<String, Any>]) -> String {
|
||||
let n: Int = native_list_len(params)
|
||||
if n == 0 { return "" }
|
||||
let out = ""
|
||||
let parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
while i < n {
|
||||
let param = native_list_get(params, i)
|
||||
let name: String = param["name"]
|
||||
if i > 0 {
|
||||
let out = out + ", "
|
||||
}
|
||||
let out = out + name
|
||||
let parts = native_list_append(parts, name)
|
||||
let i = i + 1
|
||||
}
|
||||
out
|
||||
str_join(parts, ", ")
|
||||
}
|
||||
|
||||
// Same implicit-return transform as the C backend.
|
||||
|
||||
+200
-98
@@ -1,4 +1,4 @@
|
||||
// codegen.el — El compiler C source code generator
|
||||
// codegen.el - El compiler C source code generator
|
||||
//
|
||||
// Input: list of AST statement maps (from parser.el)
|
||||
// Output: C source printed to stdout (streamed, one line at a time)
|
||||
@@ -7,37 +7,90 @@
|
||||
// Functions map directly to C functions; top-level statements become main().
|
||||
//
|
||||
// Entry point: fn codegen(stmts: [Map<String, Any>], source: String) -> String
|
||||
// Returns "" — output goes to stdout via println().
|
||||
// Returns "" - output goes to stdout via println().
|
||||
//
|
||||
// Streaming output avoids O(n²) string concatenation: each emitted line is
|
||||
// Streaming output avoids O(n-) string concatenation: each emitted line is
|
||||
// printed immediately rather than appended to a growing string.
|
||||
|
||||
// ── String helpers ────────────────────────────────────────────────────────────
|
||||
// -- String helpers ------------------------------------------------------------
|
||||
|
||||
// Escape a C string literal (double-quotes and backslashes).
|
||||
// Hex-encode a single nibble (0-15) as a lowercase hex character.
|
||||
fn nibble_to_hex(n: Int) -> String {
|
||||
str_char_at("0123456789abcdef", n)
|
||||
}
|
||||
|
||||
// Encode a byte value (0-255) as a two-character hex string.
|
||||
fn byte_to_hex2(b: Int) -> String {
|
||||
let hi: Int = (b / 16)
|
||||
let lo: Int = (b - hi * 16)
|
||||
nibble_to_hex(hi) + nibble_to_hex(lo)
|
||||
}
|
||||
|
||||
// Return true if the byte value is a C hex digit (0-9, a-f, A-F).
|
||||
// Used to determine whether a \xNN escape needs a string-literal split
|
||||
// to prevent the C preprocessor from greedily consuming following hex chars.
|
||||
fn is_hex_digit_byte(b: Int) -> Bool {
|
||||
if b >= 48 { if b <= 57 { return true } } // 0-9
|
||||
if b >= 65 { if b <= 70 { return true } } // A-F
|
||||
if b >= 97 { if b <= 102 { return true } } // a-f
|
||||
false
|
||||
}
|
||||
|
||||
fn c_escape(s: String) -> String {
|
||||
let chars: [String] = native_string_chars(s)
|
||||
let total: Int = native_list_len(chars)
|
||||
// Use index-based byte scanning via str_char_code(s, i) and str_char_at(s, i).
|
||||
// This avoids native_string_chars + str_join, which corrupts high-byte (>= 0x80)
|
||||
// characters because list_join's looks_like_string heuristic rejects strings
|
||||
// whose first byte is >= 0x7F and emits them as decimal pointer values instead.
|
||||
//
|
||||
// IMPORTANT: after a \xNN hex escape, if the next byte is a hex digit
|
||||
// (0-9, a-f, A-F), we emit `""` to split the C string literal so the C
|
||||
// compiler does not greedily read extra hex digits as part of the escape.
|
||||
// E.g. "\xad" followed by "bamos" must become "\xad" "bamos" because 'b'
|
||||
// is a hex digit and C would otherwise read "\xadb" (= 0xADB, out of range).
|
||||
let total: Int = str_len(s)
|
||||
let parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
let i: Int = 0
|
||||
let prev_was_hex_escape: Bool = false
|
||||
while i < total {
|
||||
let ch: String = native_list_get(chars, i)
|
||||
if ch == "\"" {
|
||||
let bval: Int = str_char_code(s, i)
|
||||
// If the previous token was a \xNN escape and the current byte is a
|
||||
// hex digit, insert an empty string literal ("") to break the escape.
|
||||
if prev_was_hex_escape {
|
||||
if is_hex_digit_byte(bval) {
|
||||
let parts = native_list_append(parts, "\"\"")
|
||||
}
|
||||
}
|
||||
let prev_was_hex_escape = false
|
||||
if bval == 34 {
|
||||
// 34 = '"'
|
||||
let parts = native_list_append(parts, "\\\"")
|
||||
} else {
|
||||
if ch == "\\" {
|
||||
if bval == 92 {
|
||||
// 92 = '\\'
|
||||
let parts = native_list_append(parts, "\\\\")
|
||||
} else {
|
||||
if ch == "\n" {
|
||||
if bval == 10 {
|
||||
// 10 = '\n'
|
||||
let parts = native_list_append(parts, "\\n")
|
||||
} else {
|
||||
if ch == "\r" {
|
||||
if bval == 13 {
|
||||
// 13 = '\r'
|
||||
let parts = native_list_append(parts, "\\r")
|
||||
} else {
|
||||
if ch == "\t" {
|
||||
if bval == 9 {
|
||||
// 9 = '\t'
|
||||
let parts = native_list_append(parts, "\\t")
|
||||
} else {
|
||||
let parts = native_list_append(parts, ch)
|
||||
if bval >= 128 {
|
||||
// Escape non-ASCII bytes (>= 0x80) as \xNN so
|
||||
// Clang does not misinterpret multi-byte UTF-8
|
||||
// sequences in C string literals.
|
||||
let parts = native_list_append(parts, "\\x" + byte_to_hex2(bval))
|
||||
let prev_was_hex_escape = true
|
||||
} else {
|
||||
let parts = native_list_append(parts, str_char_at(s, i))
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -52,7 +105,7 @@ fn c_str_lit(s: String) -> String {
|
||||
"\"" + c_escape(s) + "\""
|
||||
}
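The hex-escape splitting rule described in the `c_escape` comments can be sketched outside El. The following is a minimal Python port, not the compiler's own code: it takes `bytes` rather than an El `String`, and the `simple` lookup table is an illustrative stand-in for the nested `if bval == ...` chain in the diff.

```python
def is_hex_digit_byte(b: int) -> bool:
    """True if byte value b is an ASCII hex digit (0-9, A-F, a-f)."""
    return 48 <= b <= 57 or 65 <= b <= 70 or 97 <= b <= 102


def c_escape(data: bytes) -> str:
    """Escape raw bytes for use inside a C string literal."""
    # Stand-in for the nested if/else chain: quote, backslash, \n, \r, \t.
    simple = {34: '\\"', 92: "\\\\", 10: "\\n", 13: "\\r", 9: "\\t"}
    parts = []
    prev_was_hex_escape = False
    for b in data:
        # A hex digit right after \xNN would be absorbed into the escape,
        # so close and reopen the literal with an adjacent "" pair.
        if prev_was_hex_escape and is_hex_digit_byte(b):
            parts.append('""')
        prev_was_hex_escape = False
        if b in simple:
            parts.append(simple[b])
        elif b >= 128:
            # Non-ASCII bytes become \xNN (two lowercase hex digits).
            parts.append("\\x%02x" % b)
            prev_was_hex_escape = True
        else:
            parts.append(chr(b))
    return "".join(parts)


def c_str_lit(data: bytes) -> str:
    """Wrap the escaped bytes in double quotes, like c_str_lit in the diff."""
    return '"' + c_escape(data) + '"'
```

For the example in the comments, `c_str_lit(b"\xadbamos")` yields `"\xad""bamos"`: two adjacent C literals that the C compiler concatenates, so the `b` is no longer read as part of the escape.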
|
||||
|
||||
// ── Type mapping ──────────────────────────────────────────────────────────────
|
||||
// -- Type mapping --------------------------------------------------------------
|
||||
|
||||
fn el_type_to_c(type_str: String) -> String {
|
||||
if type_str == "String" { return "const char*" }
|
||||
@@ -64,7 +117,7 @@ fn el_type_to_c(type_str: String) -> String {
|
||||
"void*"
|
||||
}
|
||||
|
||||
// ── Code emission ─────────────────────────────────────────────────────────────
|
||||
// -- Code emission -------------------------------------------------------------
|
||||
//
|
||||
// emit_line/emit_blank stream output directly via println.
|
||||
// This avoids building a large string in memory.
|
||||
@@ -77,7 +130,7 @@ fn emit_blank() -> Void {
|
||||
println("")
|
||||
}
|
||||
|
||||
// ── Operator helpers ──────────────────────────────────────────────────────────
|
||||
// -- Operator helpers ----------------------------------------------------------
|
||||
|
||||
fn binop_to_c(op: String) -> String {
|
||||
if op == "Plus" { return "+" }
|
||||
@@ -95,11 +148,11 @@ fn binop_to_c(op: String) -> String {
|
||||
op
|
||||
}
|
||||
|
||||
// ── Expression codegen ────────────────────────────────────────────────────────
|
||||
// -- Expression codegen --------------------------------------------------------
|
||||
//
|
||||
// cg_expr returns a C expression string (not a statement).
|
||||
|
||||
// duration_unit_nanos — multiplier from a postfix-literal unit name to
|
||||
// duration_unit_nanos - multiplier from a postfix-literal unit name to
|
||||
// nanoseconds. Singular and plural forms collapse to the same multiplier;
|
||||
// the parser already restricted `unit` to the set is_duration_unit accepts.
|
||||
// Returns the multiplier as a decimal string suitable for splicing into
|
||||
@@ -130,7 +183,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
return v
|
||||
}
|
||||
|
||||
// DurationLit — postfix-literal time value (e.g. 30.seconds, 1.hour).
|
||||
// DurationLit - postfix-literal time value (e.g. 30.seconds, 1.hour).
|
||||
// Lowered to a literal int64 nanosecond count, wrapped in the runtime
|
||||
// entry point so the intent is explicit at the C level. The arithmetic
|
||||
// is fully constant-folded by any optimising C compiler.
|
||||
@@ -144,7 +197,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
if kind == "Float" {
|
||||
// Wrap Float literals in el_from_float() so the bit pattern is
|
||||
// preserved through the el_val_t (int64) slot. Without this,
|
||||
// implicit double→int64 conversion in C truncates `0.8` to `0`
|
||||
// implicit double->int64 conversion in C truncates `0.8` to `0`
|
||||
// when passed to a builtin that expects el_val_t.
|
||||
let v: String = expr["value"]
|
||||
return "el_from_float(" + v + ")"
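The truncation the Float comment warns about can be illustrated with a short Python sketch. The runtime's `el_from_float` is not shown in this diff, so the sketch assumes it reinterprets the eight bytes of a double as an int64, which is what "the bit pattern is preserved" suggests; `truncating_store`, `bits_store` and `bits_load` are hypothetical names used only here.

```python
import struct


def truncating_store(x: float) -> int:
    """What an implicit double -> int64 conversion keeps: the integer part."""
    return int(x)


def bits_store(x: float) -> int:
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def bits_load(slot: int) -> float:
    """Inverse of bits_store: recover the double from the int64 slot."""
    return struct.unpack("<d", struct.pack("<q", slot))[0]
```

Under that assumption, `truncating_store(0.8)` loses the value entirely, while `bits_load(bits_store(0.8))` round-trips it exactly.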
|
||||
@@ -191,12 +244,12 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
let left_kind: String = left["expr"]
|
||||
let right_kind: String = right["expr"]
|
||||
|
||||
// ── String/equality fast-path: skip O(N²) temporal traversals ────────
|
||||
// -- String/equality fast-path: skip O(N^2) temporal traversals --------
|
||||
// The 10 temporal predicates below each recurse into the left subtree:
|
||||
// O(depth) state_get calls per predicate, O(N²) total for a chain of N
|
||||
// O(depth) state_get calls per predicate, O(N^2) total for a chain of N
|
||||
// string-concat BinOps (e.g. the 70-100-part HTML chains in soul.el).
|
||||
// When either operand is a bare Str literal the result is always concat
|
||||
// or str_eq — no temporal dispatch is possible. Exit immediately.
|
||||
// or str_eq - no temporal dispatch is possible. Exit immediately.
|
||||
if str_eq(op, "Plus") {
|
||||
if str_eq(left_kind, "Str") { return "el_str_concat(" + left_c + ", " + right_c + ")" }
|
||||
if str_eq(right_kind, "Str") { return "el_str_concat(" + left_c + ", " + right_c + ")" }
|
||||
@@ -210,7 +263,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
if str_eq(right_kind, "Str") { return "!str_eq(" + left_c + ", " + right_c + ")" }
|
||||
}
|
||||
|
||||
// ── Temporal-type dispatch (Instant + Duration first-class) ────────
|
||||
// -- Temporal-type dispatch (Instant + Duration first-class) --------
|
||||
// Run BEFORE the int / string / generic paths so typed temporal
|
||||
// operands route through the runtime wrappers and invalid combos
|
||||
// become #error directives rather than silently falling through to
|
||||
@@ -396,7 +449,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
if right_is_dur { return "el_duration_ne(" + left_c + ", " + right_c + ")" }
|
||||
}
|
||||
}
|
||||
// Fall through — let the existing path handle anything we
|
||||
// Fall through - let the existing path handle anything we
|
||||
// didn't explicitly cover (typically string-concat with a
|
||||
// typed temporal value, e.g. for debug prints, which works
|
||||
// because both share the int64 slot).
|
||||
@@ -415,7 +468,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
// builtin, or BinOp arithmetic over Ints) participates in
|
||||
// arithmetic, not string concat. Recursion into BinOp lets
|
||||
// `a + b + c` (chained Int adds) and `acc * 16 + d` route to
|
||||
// arithmetic instead of falling to el_str_concat — both sides
|
||||
// arithmetic instead of falling to el_str_concat - both sides
|
||||
// are Int so the outer `+` is too.
|
||||
if is_int_expr(left) {
|
||||
if is_int_expr(right) {
|
||||
@@ -436,7 +489,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
return "(" + left_c + " " + op_c + " " + right_c + ")"
|
||||
}
|
||||
// Otherwise: BinOp(+) with a Call/Ident side without int-typed
|
||||
// evidence — fall back to string concat (the historical default).
|
||||
// evidence - fall back to string concat (the historical default).
|
||||
if left_kind == "Call" {
|
||||
return "el_str_concat(" + left_c + ", " + right_c + ")"
|
||||
}
|
||||
@@ -468,7 +521,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
// identifiers tracked in __int_names (typed Int via `let x: Int = ...`).
|
||||
// Without the int-name check, `seen == idx` between two Int locals
|
||||
// miscompiles to str_eq(seen, idx), strcmp'ing what are integer values
|
||||
// dressed as char* — segfault on the first non-printable byte.
|
||||
// dressed as char* - segfault on the first non-printable byte.
|
||||
if op == "EqEq" {
|
||||
if left_kind == "Int" {
|
||||
return "(" + left_c + " == " + right_c + ")"
|
||||
@@ -602,17 +655,17 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
// violations to be emitted as #error directives at the
|
||||
// top of the generated C, so cc fails with a clear msg.
|
||||
cap_check_call(fn_name)
|
||||
// Arity check against the builtin table — refuse, with a clear
|
||||
// Arity check against the builtin table - refuse, with a clear
|
||||
// El-source message, when a known builtin gets the wrong arg
|
||||
// count (e.g. `http_serve(port)` instead of `http_serve(port,
|
||||
// handler)`). User-defined fns and variadic builtins pass
|
||||
// through (builtin_arity returns -1).
|
||||
arity_check_call(fn_name, arity)
|
||||
// sleep(Duration) — Phase 1 of the typed-time work. When the
|
||||
// sleep(Duration) - Phase 1 of the typed-time work. When the
|
||||
// single arg is provably a Duration we lower to el_sleep_duration
|
||||
// so the runtime sees nanos directly. Existing sleep() callers
|
||||
// that pass an Int still emit `sleep(<int>)`, which falls through
|
||||
// to the no-such-symbol path — those call sites must migrate to
|
||||
// to the no-such-symbol path - those call sites must migrate to
|
||||
// a typed Duration. Acceptable: the spec marks them out for an
|
||||
// audit pass during Phase 1.
|
||||
if str_eq(fn_name, "sleep") {
|
||||
@@ -623,6 +676,20 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
}
|
||||
}
|
||||
}
|
||||
// el_from_float takes a raw C double - do not wrap the float
|
||||
// argument in el_from_float() again. Without this, the float
|
||||
// literal codegen (which wraps every Float in el_from_float())
|
||||
// produces el_from_float(el_from_float(0.7)) - double-encoded.
|
||||
if str_eq(fn_name, "el_from_float") {
|
||||
if arity == 1 {
|
||||
let only_arg = native_list_get(args, 0)
|
||||
let arg_kind: String = only_arg["expr"]
|
||||
if str_eq(arg_kind, "Float") {
|
||||
let v: String = only_arg["value"]
|
||||
return "el_from_float(" + v + ")"
|
||||
}
|
||||
}
|
||||
}
|
||||
return fn_name + "(" + args_c + ")"
|
||||
}
|
||||
|
||||
@@ -656,8 +723,8 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
// El programs use `t["field"]` for map access and `arr[i]` for
|
||||
// list access. The parser emits the same Index node for both.
|
||||
// Dispatch at codegen time on the index expression kind: string-
|
||||
// literal index → map field access (`el_get_field`); anything
|
||||
// else → list element access (`el_list_get`).
|
||||
// literal index -> map field access (`el_get_field`); anything
|
||||
// else -> list element access (`el_list_get`).
|
||||
let obj = expr["object"]
|
||||
let idx = expr["index"]
|
||||
let obj_c: String = cg_expr(obj)
|
||||
@@ -691,7 +758,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
let n: Int = native_list_len(pairs)
|
||||
// Empty literal: `el_map_new(0, )` is malformed C (trailing comma in
|
||||
// a varargs call). Emit `el_map_new(0)` directly so empty-map
|
||||
// shadowing inside for/while/if bodies — `let acc: Map = {}` —
|
||||
// shadowing inside for/while/if bodies - `let acc: Map = {}` -
|
||||
// doesn't fail downstream cc with parse errors.
|
||||
if n == 0 { return "el_map_new(0)" }
|
||||
let items_parts: [String] = native_list_empty()
|
||||
@@ -723,7 +790,7 @@ fn cg_expr(expr: Map<String, Any>) -> String {
|
||||
"EL_NULL"
|
||||
}
|
||||
|
||||
// ── Match codegen ─────────────────────────────────────────────────────────────
|
||||
// -- Match codegen -------------------------------------------------------------
|
||||
//
|
||||
// Lower a match expression to a GCC/Clang statement-expression.
|
||||
// A unique label suffix is allocated per match via state_set("__match_counter").
|
||||
@@ -747,7 +814,7 @@ fn cg_match(expr: Map<String, Any>) -> String {
|
||||
let subj_var: String = "_match_subj_" + id
|
||||
let result_var: String = "_match_result_" + id
|
||||
let done_label: String = "_match_done_" + id
|
||||
// Accumulate arm fragments into a list to avoid O(n²) string growth.
|
||||
// Accumulate arm fragments into a list to avoid O(n^2) string growth.
|
||||
let parts: [String] = native_list_empty()
|
||||
let parts = native_list_append(parts, "({ el_val_t " + subj_var + " = " + subj_c + "; el_val_t " + result_var + " = 0; ")
|
||||
let n: Int = native_list_len(arms)
|
||||
@@ -781,7 +848,7 @@ fn cg_match(expr: Map<String, Any>) -> String {
|
||||
}
|
||||
let parts = native_list_append(parts, "if (" + subj_var + " == " + bv + ") { " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
|
||||
} else {
|
||||
// unknown pattern → wildcard
|
||||
// unknown pattern -> wildcard
|
||||
let parts = native_list_append(parts, "{ " + result_var + " = (" + body_c + "); goto " + done_label + "; } ")
|
||||
}
|
||||
}
|
||||
@@ -794,7 +861,7 @@ fn cg_match(expr: Map<String, Any>) -> String {
|
||||
str_join(parts, "")
|
||||
}
|
||||
|
||||
// ── If-as-expression codegen ─────────────────────────────────────────────────
|
||||
// -- If-as-expression codegen -------------------------------------------------
|
||||
//
|
||||
// Lower `if cond { thenBody } else { elseBody }` used in expression position
|
||||
// (e.g. `let x = if a { b } else { c }`) to a GCC/Clang statement-expression
|
||||
@@ -822,7 +889,7 @@ fn next_if_id() -> String {
|
||||
// result var stays at its initial 0.
|
||||
fn cg_if_expr_arm(stmts: [Map<String, Any>], result_var: String) -> String {
|
||||
let n: Int = native_list_len(stmts)
|
||||
// Collect statement fragments into a list to avoid O(n²) string growth.
|
||||
// Collect statement fragments into a list to avoid O(n^2) string growth.
|
||||
let parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
while i < n {
|
||||
@@ -851,7 +918,7 @@ fn cg_if_expr_arm(stmts: [Map<String, Any>], result_var: String) -> String {
|
||||
}
|
||||
} else {
|
||||
if str_eq(sk, "Assign") {
|
||||
// Real reassignment in an expression-position arm —
|
||||
// Real reassignment in an expression-position arm -
|
||||
// emit the store; the arm's "value" stays whatever
|
||||
// result_var was last set to, which is the El
|
||||
// semantics (assignment is a statement, not a value).
|
||||
@@ -889,7 +956,7 @@ fn cg_if_expr(expr: Map<String, Any>) -> String {
|
||||
out
|
||||
}
|
||||
|
||||
// ── Variable scope tracking ───────────────────────────────────────────────────
|
||||
// -- Variable scope tracking ---------------------------------------------------
|
||||
//
|
||||
// El allows `let x = expr` to both declare and reassign x in the same scope.
|
||||
// C doesn't allow redeclaring the same name in the same block.
|
||||
@@ -908,7 +975,7 @@ fn list_contains(lst: [String], s: String) -> Bool {
|
||||
false
|
||||
}
|
||||
|
||||
// ── Statement codegen ─────────────────────────────────────────────────────────
|
||||
// -- Statement codegen ---------------------------------------------------------
|
||||
//
|
||||
// cg_stmt emits C lines via println. declared is a list of already-declared
|
||||
// variable names in the current C scope; returns updated declared list.
|
||||
@@ -957,7 +1024,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
|
||||
if str_eq(ltype, "Zone") {
|
||||
add_zone_name(name)
|
||||
}
|
||||
// Inference from RHS — duration literals and known-typed calls
|
||||
// Inference from RHS - duration literals and known-typed calls
|
||||
// propagate even when the let is unannotated.
|
||||
if is_instant_expr(val) {
|
||||
add_instant_name(name)
|
||||
@@ -1012,7 +1079,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
|
||||
}
|
||||
|
||||
// Bare reassignment: `name = expr`. Always emits a plain C assignment
|
||||
// (no `el_val_t` prefix) — by construction the parser only produces
|
||||
// (no `el_val_t` prefix) - by construction the parser only produces
|
||||
// Assign for an existing identifier. If the name happens NOT to be in
|
||||
// `declared` for the current C scope (it was let-bound by an enclosing
|
||||
// block) the emit still resolves at C level because the variable lives
|
||||
@@ -1047,7 +1114,7 @@ fn cg_stmt(stmt: Map<String, Any>, indent: String, declared: [String]) -> [Strin
|
||||
let cond_c: String = cg_expr(cond)
|
||||
let cond_c = strip_outer_parens(cond_c)
|
||||
emit_line(indent + "while (" + cond_c + ") {")
|
||||
// Body lives in its own C block — clone so let-bindings inside the
|
||||
// Body lives in its own C block - clone so let-bindings inside the
|
||||
// loop don't leak into the parent's `declared` list (which would make
|
||||
// a sibling scope's `let x` emit assignment on an undeclared name).
|
||||
cg_stmts(body, indent + " ", native_list_clone(declared))
|
||||
@@ -1114,7 +1181,7 @@ fn cg_if_stmt(expr: Map<String, Any>, indent: String, declared: [String]) -> Voi
|
||||
let cond_c: String = cg_expr(cond)
|
||||
let cond_c = strip_outer_parens(cond_c)
|
||||
emit_line(indent + "if (" + cond_c + ") {")
|
||||
// Each branch gets its own clone of `declared` — variables let-bound
|
||||
// Each branch gets its own clone of `declared` - variables let-bound
|
||||
// inside the then/else block live only in that C scope, and must not
|
||||
// leak back to the parent (or to the sibling branch) through shared
|
||||
// list mutation. Cheap shallow copy; the entries (variable name strings)
|
||||
@@ -1166,7 +1233,7 @@ fn cg_stmts(stmts: [Map<String, Any>], indent: String, declared: [String]) -> [S
|
||||
decl
|
||||
}
|
||||
|
||||
// ── Function declaration codegen ───────────────────────────────────────────────
|
||||
// -- Function declaration codegen -----------------------------------------------
|
||||
|
||||
fn param_decl(param: Map<String, Any>, idx: Int) -> String {
|
||||
let name: String = param["name"]
|
||||
@@ -1235,7 +1302,7 @@ fn is_int_name(name: String) -> Bool {
|
||||
|
||||
// Same shape as is_int_name, for Instant- and Duration-typed bindings.
|
||||
// Used by the BinOp/comparison codegen to dispatch arithmetic through the
|
||||
// typed runtime wrappers (el_instant_add_dur, el_duration_lt, …) and to
|
||||
// typed runtime wrappers (el_instant_add_dur, el_duration_lt, ...) and to
|
||||
// surface mismatches (Instant + Instant, Duration + Int) as #error
|
||||
// directives at the top of the generated C.
|
||||
fn is_instant_name(name: String) -> Bool {
|
||||
@@ -1297,7 +1364,7 @@ fn is_int_call(call_expr: Map<String, Any>) -> Bool {
|
||||
}
|
||||
|
||||
// Builtins that return an Instant. Used by is_instant_expr and the BinOp
|
||||
// dispatch — `now() + 5.seconds` types as Instant only because we can see
|
||||
// dispatch - `now() + 5.seconds` types as Instant only because we can see
|
||||
// that now() is an Instant-returning Call.
|
||||
fn is_instant_call(call_expr: Map<String, Any>) -> Bool {
|
||||
let func = call_expr["func"]
|
||||
@@ -1333,7 +1400,7 @@ fn is_duration_call(call_expr: Map<String, Any>) -> Bool {
|
||||
return false
|
||||
}
|
||||
|
||||
// Phase 1.5 — Calendar / CalendarTime / Rhythm / LocalDate / LocalTime /
|
||||
// Phase 1.5 - Calendar / CalendarTime / Rhythm / LocalDate / LocalTime /
|
||||
// LocalDateTime / Zone are first-class boxed types. Each has its own name
|
||||
// set in process state, populated from typed `let` bindings and parameter
|
||||
// annotations. The BinOp dispatcher consults these to forbid mismatched
|
||||
@@ -1521,7 +1588,7 @@ fn is_zone_expr(expr: Map<String, Any>) -> Bool {
|
||||
// Recursive type predicates for Instant / Duration. Mirror is_int_expr.
|
||||
// is_instant_expr / is_duration_expr return true only when the expression
|
||||
// is provably of that type at codegen time. Anything ambiguous returns
|
||||
// false — the BinOp dispatcher then leaves the expression on the
|
||||
// false - the BinOp dispatcher then leaves the expression on the
|
||||
// untyped-int path, which is the safest fallback because at the runtime
|
||||
// level all three types share the int64 slot.
|
||||
fn is_instant_expr(expr: Map<String, Any>) -> Bool {
|
||||
@@ -1536,8 +1603,8 @@ fn is_instant_expr(expr: Map<String, Any>) -> Bool {
|
||||
if str_eq(k, "BinOp") {
|
||||
let op: String = expr["op"]
|
||||
if str_eq(op, "Plus") {
|
||||
// Instant + Duration → Instant
|
||||
// Duration + Instant → Instant
|
||||
// Instant + Duration -> Instant
|
||||
// Duration + Instant -> Instant
|
||||
if is_instant_expr(expr["left"]) {
|
||||
if is_duration_expr(expr["right"]) { return true }
|
||||
}
|
||||
@@ -1547,7 +1614,7 @@ fn is_instant_expr(expr: Map<String, Any>) -> Bool {
|
||||
return false
|
||||
}
|
||||
if str_eq(op, "Minus") {
|
||||
// Instant - Duration → Instant
|
||||
// Instant - Duration -> Instant
|
||||
if is_instant_expr(expr["left"]) {
|
||||
if is_duration_expr(expr["right"]) { return true }
|
||||
}
|
||||
@@ -1574,15 +1641,15 @@ fn is_duration_expr(expr: Map<String, Any>) -> Bool {
|
||||
if str_eq(k, "BinOp") {
|
||||
let op: String = expr["op"]
|
||||
if str_eq(op, "Plus") {
|
||||
// Duration + Duration → Duration
|
||||
// Duration + Duration -> Duration
|
||||
if is_duration_expr(expr["left"]) {
|
||||
if is_duration_expr(expr["right"]) { return true }
|
||||
}
|
||||
return false
|
||||
}
|
||||
if str_eq(op, "Minus") {
|
||||
// Duration - Duration → Duration
|
||||
// Instant - Instant → Duration (caught here, not in is_instant_expr)
|
||||
// Duration - Duration -> Duration
|
||||
// Instant - Instant -> Duration (caught here, not in is_instant_expr)
|
||||
if is_duration_expr(expr["left"]) {
|
||||
if is_duration_expr(expr["right"]) { return true }
|
||||
}
|
||||
@@ -1592,8 +1659,8 @@ fn is_duration_expr(expr: Map<String, Any>) -> Bool {
|
||||
return false
|
||||
}
|
||||
if str_eq(op, "Star") {
|
||||
// Duration * Int → Duration
|
||||
// Int * Duration → Duration
|
||||
// Duration * Int -> Duration
|
||||
// Int * Duration -> Duration
|
||||
if is_duration_expr(expr["left"]) {
|
||||
if is_int_expr(expr["right"]) { return true }
|
||||
}
|
||||
@@ -1603,7 +1670,7 @@ fn is_duration_expr(expr: Map<String, Any>) -> Bool {
|
||||
return false
|
||||
}
|
||||
if str_eq(op, "Slash") {
|
||||
// Duration / Int → Duration
|
||||
// Duration / Int -> Duration
|
||||
if is_duration_expr(expr["left"]) {
|
||||
if is_int_expr(expr["right"]) { return true }
|
||||
}
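The typing rules spelled out in the `is_instant_expr` and `is_duration_expr` comments amount to a small lookup table. The Python sketch below restates only the combinations named in those comments; the table form and the `temporal_result` helper are illustrative and not how the El code is structured (it uses recursive predicates).

```python
# (left type, operator, right type) -> result type, taken from the comments.
RESULT_TYPE = {
    ("Instant", "Plus", "Duration"): "Instant",
    ("Duration", "Plus", "Instant"): "Instant",
    ("Instant", "Minus", "Duration"): "Instant",
    ("Duration", "Plus", "Duration"): "Duration",
    ("Duration", "Minus", "Duration"): "Duration",
    ("Instant", "Minus", "Instant"): "Duration",
    ("Duration", "Star", "Int"): "Duration",
    ("Int", "Star", "Duration"): "Duration",
    ("Duration", "Slash", "Int"): "Duration",
}


def temporal_result(left: str, op: str, right: str):
    """Return the result type, or None when the combination is not listed."""
    return RESULT_TYPE.get((left, op, right))
```

A `None` result corresponds to the cases the dispatcher either rejects (for example `Instant + Instant`) or leaves on the untyped path.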
|
||||
@@ -1634,13 +1701,13 @@ fn time_record_violation(kind: String, detail: String) -> Bool {
|
||||
// the outer dispatch only checks the immediate kind, not the inner.
|
||||
//
|
||||
// Rules:
|
||||
// Int literal → Int
|
||||
// Ident in __int_names → Int
|
||||
// Call to known-Int builtin → Int
|
||||
// Neg of Int → Int
|
||||
// BinOp arithmetic of two Ints → Int (Plus, Minus, Star, Slash, Percent)
|
||||
// BinOp comparison/logical → Int (yields 0/1; safe to treat as Int)
|
||||
// anything else → not provably Int
|
||||
// Int literal -> Int
|
||||
// Ident in __int_names -> Int
|
||||
// Call to known-Int builtin -> Int
|
||||
// Neg of Int -> Int
|
||||
// BinOp arithmetic of two Ints -> Int (Plus, Minus, Star, Slash, Percent)
|
||||
// BinOp comparison/logical -> Int (yields 0/1; safe to treat as Int)
|
||||
// anything else -> not provably Int
|
||||
fn is_int_expr(expr: Map<String, Any>) -> Bool {
|
||||
let k: String = expr["expr"]
|
||||
if str_eq(k, "Int") { return true }
|
||||
@@ -1659,7 +1726,7 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
|
||||
}
|
||||
if str_eq(k, "BinOp") {
|
||||
let op: String = expr["op"]
|
||||
// Comparisons and logicals always yield 0/1 — safe Int.
|
||||
// Comparisons and logicals always yield 0/1 - safe Int.
|
||||
if str_eq(op, "EqEq") { return true }
|
||||
if str_eq(op, "NotEq") { return true }
|
||||
if str_eq(op, "Lt") { return true }
|
||||
@@ -1668,7 +1735,7 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
|
||||
if str_eq(op, "GtEq") { return true }
|
||||
if str_eq(op, "And") { return true }
|
||||
if str_eq(op, "Or") { return true }
|
||||
// Arithmetic propagates: Int op Int → Int.
|
||||
// Arithmetic propagates: Int op Int -> Int.
|
||||
if str_eq(op, "Plus") {
|
||||
if is_int_expr(expr["left"]) {
|
||||
if is_int_expr(expr["right"]) { return true }
|
||||
@@ -1698,7 +1765,7 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
|
||||
return false
|
||||
}
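The `is_int_expr` rules listed in the comment can be sketched in Python over dict-shaped AST nodes. The `"expr"`, `"op"`, `"left"` and `"right"` keys appear in the diff; the `"name"` and `"operand"` keys, the flat `"func"` string, the `Gt`/`LtEq` operator names, and passing the name sets as arguments (instead of reading process state) are assumptions made for this sketch.

```python
# Operators that always yield 0/1. Gt and LtEq are assumed; the diff elides them.
COMPARISON_OPS = {"EqEq", "NotEq", "Lt", "Gt", "LtEq", "GtEq", "And", "Or"}
# Arithmetic operators named in the rules comment.
ARITH_OPS = {"Plus", "Minus", "Star", "Slash", "Percent"}


def is_int_expr(expr, int_names, int_builtins):
    """Return True only when expr is provably Int under the listed rules."""
    kind = expr["expr"]
    if kind == "Int":
        return True
    if kind == "Ident":
        return expr["name"] in int_names  # "name" key is an assumption
    if kind == "Call":
        return expr["func"] in int_builtins  # flat name is an assumption
    if kind == "Neg":
        return is_int_expr(expr["operand"], int_names, int_builtins)
    if kind == "BinOp":
        op = expr["op"]
        if op in COMPARISON_OPS:
            return True
        if op in ARITH_OPS:
            return (is_int_expr(expr["left"], int_names, int_builtins)
                    and is_int_expr(expr["right"], int_names, int_builtins))
    return False
```

With `acc` and `d` tracked as Int names, the node for `acc * 16 + d` is classified as Int, which is the case the recursion into `BinOp` exists to handle.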
|
||||
|
||||
// ── Capability-kind enforcement ──────────────────────────────────────────────
|
||||
// -- Capability-kind enforcement ----------------------------------------------
|
||||
//
|
||||
// A program's top-level block (cgi / service / none) determines which
|
||||
// runtime primitives it may call. The compiler records violations in
|
||||
@@ -1707,11 +1774,11 @@ fn is_int_expr(expr: Map<String, Any>) -> Bool {
|
||||
// downstream cc step fails with a clear message.
|
||||
//
|
||||
// Capability tiers:
|
||||
// "cgi" — full self-formation. All primitives.
|
||||
// "service" — bounded. Cannot call self-formation primitives:
|
||||
// "cgi" - full self-formation. All primitives.
|
||||
// "service" - bounded. Cannot call self-formation primitives:
|
||||
// llm_call_agentic, llm_register_tool, dharma_emit,
|
||||
// dharma_field. Single-turn LLM calls are allowed.
|
||||
// "utility" — default. No DHARMA, no LLM. Pure compute + I/O.
|
||||
// "utility" - default. No DHARMA, no LLM. Pure compute + I/O.
|
||||
//
|
||||
// The compiler-level rule is structural: the binary either CAN or CANNOT
|
||||
// emit the call. There is no runtime check, no opt-in, no override.
|
||||
@@ -1726,7 +1793,7 @@ fn cap_record_violation(kind: String, fn_name: String) -> Bool {
|
||||
return true
|
||||
}
|
||||
|
||||
// Self-formation primitives — the cut between CGI and service. A program
|
||||
// Self-formation primitives - the cut between CGI and service. A program
|
||||
// that emits these calls IS structurally a CGI; we forbid them everywhere
|
||||
// else.
|
||||
fn is_self_formation_call(fn_name: String) -> Bool {
|
||||
@@ -1737,7 +1804,7 @@ fn is_self_formation_call(fn_name: String) -> Bool {
|
||||
return false
|
||||
}
|
||||
|
||||
// Any DHARMA primitive — utilities have zero network presence.
|
||||
// Any DHARMA primitive - utilities have zero network presence.
|
||||
fn is_dharma_call(fn_name: String) -> Bool {
|
||||
if str_eq(fn_name, "dharma_connect") { return true }
|
||||
if str_eq(fn_name, "dharma_send") { return true }
|
||||
@@ -1750,7 +1817,7 @@ fn is_dharma_call(fn_name: String) -> Bool {
|
||||
return false
|
||||
}
|
||||
|
||||
// Any LLM primitive — utilities have no LLM access at all.
|
||||
// Any LLM primitive - utilities have no LLM access at all.
|
||||
fn is_llm_call(fn_name: String) -> Bool {
|
||||
if str_eq(fn_name, "llm_call") { return true }
|
||||
if str_eq(fn_name, "llm_call_system") { return true }
|
||||
@@ -1800,14 +1867,14 @@ fn emit_cap_violations() -> Void {
|
||||
if colon > 0 {
|
||||
let kind: String = str_slice(entry, 0, colon)
|
||||
let fn_name: String = str_slice(entry, colon + 1, str_len(entry))
|
||||
emit_line("#error \"capability violation: '" + kind + "' programs may not call '" + fn_name + "' (self-formation primitive — only 'cgi' programs may use it)\"")
|
||||
emit_line("#error \"capability violation: '" + kind + "' programs may not call '" + fn_name + "' (self-formation primitive - only 'cgi' programs may use it)\"")
|
||||
}
|
||||
let i = i + next_comma + 1
|
||||
}
|
||||
}
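The record-then-emit flow for capability violations can be sketched in Python. The four self-formation primitive names and the `#error` message text come from the diff; using a Python list for the violation store (the El code keeps a CSV string in process state) and the simplified `cap_check_call` signature are assumptions. The sketch covers only the self-formation check, not the additional DHARMA and LLM restrictions on utility programs.

```python
# Self-formation primitives named in the capability-tier comments.
SELF_FORMATION = {
    "llm_call_agentic", "llm_register_tool", "dharma_emit", "dharma_field",
}


def cap_check_call(kind, fn_name, violations):
    """Record "kind:fn_name" when a non-cgi program calls a self-formation primitive."""
    if kind != "cgi" and fn_name in SELF_FORMATION:
        violations.append(kind + ":" + fn_name)


def cap_violation_errors(csv):
    """Turn a "kind:fn,kind:fn" CSV string into one #error line per entry."""
    lines = []
    for entry in csv.split(","):
        colon = entry.find(":")
        if colon > 0:  # skip empty or malformed entries, as the El loop does
            kind = entry[:colon]
            fn_name = entry[colon + 1:]
            lines.append(
                "#error \"capability violation: '" + kind
                + "' programs may not call '" + fn_name
                + "' (self-formation primitive - only 'cgi' programs may use it)\""
            )
    return lines
```

Because the result is an `#error` directive in the generated C, the build fails at the downstream `cc` step rather than at run time.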
|
||||
|
||||
// Surface temporal-type violations as #error directives. The cg_expr BinOp
|
||||
// dispatcher records each violation (Instant + Instant, Duration + Int, …)
|
||||
// dispatcher records each violation (Instant + Instant, Duration + Int, ...)
|
||||
// as a CSV entry "kind:detail" via time_record_violation. Each entry maps
|
||||
// to a single #error so downstream cc fails the build with a clear El-
|
||||
// source-level message before the bogus C even links.
|
||||
@@ -1830,7 +1897,7 @@ fn emit_time_violations() -> Void {
|
||||
}
|
||||
}
|
||||
|
||||
// ── Builtin arity table ───────────────────────────────────────────────────────
|
||||
// -- Builtin arity table -------------------------------------------------------
|
||||
//
|
||||
// El programs sometimes call runtime builtins with the wrong number of
|
||||
// arguments (e.g. `http_serve(port)` instead of `http_serve(port, handler)`).
|
||||
@@ -1840,7 +1907,7 @@ fn emit_time_violations() -> Void {
|
||||
//
|
||||
// Strategy: a small static table mirrors el_runtime.h. Variadic builtins
|
||||
// (el_list_new, el_map_new, args) and unknown identifiers (user fns,
|
||||
// dynamic dispatch) return -1 → no check. A mismatch records a violation
|
||||
// dynamic dispatch) return -1 -> no check. A mismatch records a violation
|
||||
// in process state, which emit_arity_violations() turns into #error
|
||||
// directives at the top of the generated C.
|
||||
fn builtin_arity(name: String) -> Int {
|
||||
@@ -2044,7 +2111,7 @@ fn builtin_arity(name: String) -> Int {
|
||||
if str_eq(name, "get") { return 2 }
|
||||
if str_eq(name, "map_get") { return 2 }
|
||||
if str_eq(name, "map_set") { return 3 }
|
||||
// -1 sentinel: variadic / unknown / user-defined → no check.
|
||||
// -1 sentinel: variadic / unknown / user-defined -> no check.
|
||||
return -1
|
||||
}
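The arity table and its `-1` sentinel can be sketched in Python as follows. Only the `get`, `map_get`, `map_set` and `http_serve` arities are taken from the diff; the dict-based table, the tuple-shaped violation records and the boolean return value are illustrative assumptions, since the real violation format is not shown here.

```python
# A few entries visible in the diff; the real table mirrors el_runtime.h.
BUILTIN_ARITY = {"http_serve": 2, "get": 2, "map_get": 2, "map_set": 3}


def builtin_arity(name):
    """Return the expected argument count, or -1 for variadic/unknown/user fns."""
    return BUILTIN_ARITY.get(name, -1)


def arity_check_call(name, argc, violations):
    """Record a mismatch; return True when the call passes or is not checked."""
    expected = builtin_arity(name)
    if expected == -1:
        return True  # sentinel: no check
    if expected != argc:
        violations.append((name, expected, argc))
        return False
    return True
```

A call such as `http_serve(port)` is recorded as `("http_serve", 2, 1)`, while a user-defined function passes through unchecked.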
|
||||
|
||||
@@ -2242,7 +2309,7 @@ fn build_int_names_for_params(params: [Map<String, Any>]) -> Bool {
|
||||
|
||||
fn cg_fn(stmt: Map<String, Any>) -> Void {
|
||||
let fn_name: String = stmt["name"]
|
||||
// Skip El's `fn main()` — C provides its own main() for top-level stmts
|
||||
// Skip El's `fn main()` - C provides its own main() for top-level stmts
|
||||
// and a duplicate `el_val_t main(void)` would collide with it.
|
||||
if fn_name == "main" { return }
|
||||
let params = stmt["params"]
|
||||
@@ -2274,8 +2341,8 @@ fn cg_fn(stmt: Map<String, Any>) -> Void {
|
||||
}
|
||||
// Lift the final bare expression into an explicit return so implicit
|
||||
// returns ("fn lex(s) { ... tokens }") actually return their value.
|
||||
// Void-returning functions skip this — wrapping `println(x)` in
|
||||
// `return …` is a C type error.
|
||||
// Void-returning functions skip this - wrapping `println(x)` in
|
||||
// `return ...` is a C type error.
|
||||
let body_xformed = body
|
||||
if !str_eq(ret_type, "Void") {
|
||||
let body_xformed = transform_implicit_return(body)
|
||||
@@ -2286,7 +2353,7 @@ fn cg_fn(stmt: Map<String, Any>) -> Void {
|
||||
emit_blank()
|
||||
}
|
||||
|
||||
// ── Top-level codegen ─────────────────────────────────────────────────────────
|
||||
// -- Top-level codegen ---------------------------------------------------------
|
||||
|
||||
fn is_fndef(stmt: Map<String, Any>) -> Bool {
|
||||
let kind: String = stmt["stmt"]
|
||||
@@ -2312,7 +2379,7 @@ fn cgi_arg(value: String, has_value: Bool) -> String {
|
||||
return "EL_NULL"
|
||||
}
|
||||
|
||||
// ── VBD role enforcement ──────────────────────────────────────────────────────
|
||||
// -- VBD role enforcement ------------------------------------------------------
|
||||
//
|
||||
// Scan a function body for direct calls to DHARMA-restricted builtins
|
||||
// (dharma_emit, dharma_field). These may only appear inside @manager fns.
|
||||
@@ -2445,16 +2512,16 @@ fn vbd_has_restricted_call(stmts: [Map<String, Any>]) -> Bool {
|
||||
false
|
||||
}
|
||||
|
||||
// ── Entry point ────────────────────────────────────────────────────────────────
|
||||
// -- Entry point ----------------------------------------------------------------
|
||||
|
||||
fn codegen(stmts: [Map<String, Any>], source: String) -> String {
|
||||
// Detect cgi/service blocks: at most one declarative top-level block.
|
||||
// The block determines the program's CAPABILITY KIND:
|
||||
// "cgi" — full self-formation. Calls all primitives.
|
||||
// "service" — bounded. Cannot call self-formation primitives
|
||||
// "cgi" - full self-formation. Calls all primitives.
|
||||
// "service" - bounded. Cannot call self-formation primitives
|
||||
// (llm_call_agentic, llm_register_tool, dharma_emit,
|
||||
// dharma_field, mindlink-creation).
|
||||
// "utility" — default; no DHARMA membership, no LLM, no agentic.
|
||||
// "utility" - default; no DHARMA membership, no LLM, no agentic.
|
||||
// Codegen enforces this with #error directives at every restricted
|
||||
// call site. The capability boundary is structural: a binary either
|
||||
// CAN or CANNOT do a thing, and the compiler decides at emission time.
|
||||
@@ -2489,7 +2556,7 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
|
||||
}
|
||||
if cgi_count >= 1 {
|
||||
if svc_count >= 1 {
|
||||
emit_line("#error \"El: program declares both cgi and service blocks (mutually exclusive — pick one)\"")
|
||||
emit_line("#error \"El: program declares both cgi and service blocks (mutually exclusive - pick one)\"")
|
||||
}
|
||||
}
|
||||
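The capability-kind rule in the comment above (one declarative block decides the kind, `cgi` and `service` are mutually exclusive, `utility` is the default) might be sketched like this in Python. The statement kinds `"Cgi"` and `"Service"` and the function name `program_kind` are assumptions made for illustration; the diff does not show how the counters are filled, nor which kind wins when both blocks are present.

```python
def program_kind(stmts):
    """Classify a program as 'cgi', 'service', or 'utility'.

    Returns (kind, errors), where errors holds #error lines to emit.
    """
    # Assumed AST tags for the two declarative blocks.
    cgi_count = sum(1 for s in stmts if s["stmt"] == "Cgi")
    svc_count = sum(1 for s in stmts if s["stmt"] == "Service")

    errors = []
    if cgi_count >= 1 and svc_count >= 1:
        errors.append(
            '#error "El: program declares both cgi and service blocks '
            '(mutually exclusive - pick one)"'
        )

    if cgi_count >= 1:
        kind = "cgi"  # assumption: cgi takes precedence in the error case
    elif svc_count >= 1:
        kind = "service"
    else:
        kind = "utility"
    return kind, errors
```

Because the error is emitted as a preprocessor directive, an invalid program still produces C output, but that output cannot be compiled.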
// Stash the program kind so cg_expr's Call branch can enforce
|
||||
@@ -2509,9 +2576,44 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
|
||||
emit_line("#include <stdint.h>")
|
||||
emit_line("#include <stdlib.h>")
|
||||
emit_line("#include \"el_runtime.h\"")
|
||||
|
||||
// Cross-module forward declarations: for each imported module, emit
|
||||
// #include "module.elh" so Clang sees the function signatures from
|
||||
// that module without needing the full source inlined. The .elh files
|
||||
// are generated by `elc --emit-header` and live in the same dist/
|
||||
// directory as the generated .c files. We use basename only (strip
|
||||
// the directory prefix and .el extension) so the include resolves
|
||||
// correctly regardless of the source tree layout.
|
||||
let imp_n: Int = native_list_len(stmts)
|
||||
let imp_i = 0
|
||||
while imp_i < imp_n {
|
||||
let imp_stmt = native_list_get(stmts, imp_i)
|
||||
let imp_kind: String = imp_stmt["stmt"]
|
||||
if str_eq(imp_kind, "Import") {
|
||||
let imp_path: String = imp_stmt["path"]
|
||||
// Extract basename: find last '/' and strip from there.
|
||||
let imp_path_len: Int = str_len(imp_path)
|
||||
let imp_last_slash: Int = -1
|
||||
let imp_j: Int = 0
|
||||
while imp_j < imp_path_len {
|
||||
let imp_c: String = str_slice(imp_path, imp_j, imp_j + 1)
|
||||
if str_eq(imp_c, "/") { let imp_last_slash = imp_j }
|
||||
let imp_j = imp_j + 1
|
||||
}
|
||||
let imp_base: String = str_slice(imp_path, imp_last_slash + 1, imp_path_len)
|
||||
// Strip .el extension if present.
|
||||
let imp_base_len: Int = str_len(imp_base)
|
||||
let imp_bname: String = imp_base
|
||||
if str_ends_with(imp_base, ".el") {
|
||||
let imp_bname = str_slice(imp_base, 0, imp_base_len - 3)
|
||||
}
|
||||
emit_line("#include \"" + imp_bname + ".elh\"")
|
||||
}
|
||||
let imp_i = imp_i + 1
|
||||
}
|
||||
emit_blank()
|
||||
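The basename logic in the import loop above reduces to three steps: drop everything up to the last `/`, drop a trailing `.el`, and wrap the result in an `#include` line. A minimal Python sketch, where `elh_include` is a hypothetical helper name and `str.rfind` replaces the character-by-character scan:

```python
def elh_include(import_path):
    """Build the #include line for an imported module's .elh header."""
    # rfind returns -1 when there is no '/', so +1 yields 0, which matches
    # the imp_last_slash = -1 starting value in the El code.
    base = import_path[import_path.rfind("/") + 1:]
    if base.endswith(".el"):
        base = base[:-3]
    return '#include "' + base + '.elh"'
```

For example, `elh_include("el-compiler/src/lexer.el")` yields `#include "lexer.elh"`. Using the basename only means two modules with the same file name in different directories would map to the same header.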
|
||||
// Forward declarations (skip `main` — C provides its own)
|
||||
// Forward declarations (skip `main` - C provides its own)
|
||||
let n: Int = native_list_len(stmts)
|
||||
let i = 0
|
||||
while i < n {
|
||||
@@ -2535,7 +2637,7 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
|
||||
}
|
||||
emit_blank()
|
||||
|
||||
// Top-level `let` bindings → file-scope storage. El programs use
|
||||
// Top-level `let` bindings -> file-scope storage. El programs use
|
||||
// top-level `let GREETING = "..."` as module constants that any
|
||||
// function below should be able to read. Without this pass, a top-
|
||||
// level Let only declares the name inside main()'s scope and any
|
||||
@@ -2683,7 +2785,7 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
|
||||
let main_decl = cg_stmt(stmt, " ", main_decl)
|
||||
}
|
||||
}
|
||||
// Release AST node after final use — each stmt is fully processed
|
||||
// Release AST node after final use - each stmt is fully processed
|
||||
// by this point (forward decls, fn defs, top-level lets, and now
|
||||
// the main-body pass are all done). Releasing here prevents the
|
||||
// accumulated AST from exhausting memory on large source files.
|
||||
@@ -2706,16 +2808,16 @@ fn codegen(stmts: [Map<String, Any>], source: String) -> String {
|
||||
|
||||
// Emit any accumulated capability-violation #error directives. cc
|
||||
// will fail on the first one and surface the message; placement at
|
||||
// the bottom is fine — preprocessor errors halt the build wherever
|
||||
// the bottom is fine - preprocessor errors halt the build wherever
|
||||
// they appear.
|
||||
emit_cap_violations()
|
||||
// Same for builtin-arity violations: cc halts on the first #error,
|
||||
// so a misuse of a known builtin (wrong arg count) fails the build
|
||||
// with a clear message naming the builtin and its expected arity.
|
||||
emit_arity_violations()
|
||||
// Temporal-type violations (Instant + Instant, Duration + Int, …).
|
||||
// Temporal-type violations (Instant + Instant, Duration + Int, ...).
|
||||
emit_time_violations()
|
||||
|
||||
// Return empty string — output was streamed via println
|
||||
// Return empty string - output was streamed via println
|
||||
""
|
||||
}
|
||||
|
||||
+27
-26
@@ -146,6 +146,7 @@ fn keyword_kind(word: String) -> String {
|
||||
if word == "engine" { return "Engine" }
|
||||
if word == "accessor" { return "Accessor" }
|
||||
if word == "vessel" { return "Vessel" }
|
||||
if word == "extern" { return "Extern" }
|
||||
""
|
||||
}
|
||||
|
||||
@@ -156,7 +157,7 @@ fn keyword_kind(word: String) -> String {
|
||||
// Returns { "text": ..., "pos": i }
|
||||
fn scan_digits(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
let i = start
|
||||
let text = ""
|
||||
let parts: [String] = native_list_empty()
|
||||
let running = true
|
||||
while running {
|
||||
if i >= total {
|
||||
@@ -164,20 +165,20 @@ fn scan_digits(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
} else {
|
||||
let ch: String = native_list_get(chars, i)
|
||||
if lex_is_digit(ch) {
|
||||
let text = text + ch
|
||||
let parts = native_list_append(parts, ch)
|
||||
let i = i + 1
|
||||
} else {
|
||||
let running = false
|
||||
}
|
||||
}
|
||||
}
|
||||
{ "text": text, "pos": i }
|
||||
{ "text": str_join(parts, ""), "pos": i }
|
||||
}
|
||||
|
||||
// scan_ident — advance i while chars[i] is alphanumeric or underscore
|
||||
fn scan_ident(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
let i = start
|
||||
let text = ""
|
||||
let parts: [String] = native_list_empty()
|
||||
let running = true
|
||||
while running {
|
||||
if i >= total {
|
||||
@@ -185,14 +186,14 @@ fn scan_ident(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
} else {
|
||||
let ch: String = native_list_get(chars, i)
|
||||
if is_alnum_or_underscore(ch) {
|
||||
let text = text + ch
|
||||
let parts = native_list_append(parts, ch)
|
||||
let i = i + 1
|
||||
} else {
|
||||
let running = false
|
||||
}
|
||||
}
|
||||
}
|
||||
{ "text": text, "pos": i }
|
||||
{ "text": str_join(parts, ""), "pos": i }
|
||||
}
|
||||
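Both scanners now collect characters in a list and join once at the end, instead of growing a string one character at a time. The diff does not state the motivation; avoiding repeated string copies is the likely one. A minimal Python sketch of the pattern, assuming `lex_is_digit` accepts only ASCII `0`-`9` (its definition is not shown in this diff):

```python
def scan_digits(chars, start, total):
    """Consume consecutive digits from chars[start:], joining once at the end."""
    parts = []
    i = start
    while i < total and "0" <= chars[i] <= "9":
        parts.append(chars[i])
        i += 1
    # Same result shape as the El version: the text plus the next position.
    return {"text": "".join(parts), "pos": i}
```

`scan_ident` follows the same shape with a different character predicate.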
|
||||
// ── Code-bearing string detection + comment strip ────────────────────────────
|
||||
@@ -253,7 +254,7 @@ fn looks_like_code(s: String) -> Bool {
|
||||
fn strip_code_comments(s: String) -> String {
|
||||
let chars: [String] = native_string_chars(s)
|
||||
let total: Int = native_list_len(chars)
|
||||
let out = ""
|
||||
let out_parts: [String] = native_list_empty()
|
||||
let i = 0
|
||||
let in_squote = false
|
||||
let in_dquote = false
|
||||
@@ -269,11 +270,11 @@ fn strip_code_comments(s: String) -> String {
|
||||
if in_js_string {
|
||||
// Backslash escape: consume next char verbatim regardless of which.
|
||||
if ch == "\\" {
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let next_i = i + 1
|
||||
if next_i < total {
|
||||
let nc: String = native_list_get(chars, next_i)
|
||||
let out = out + nc
|
||||
let out_parts = native_list_append(out_parts, nc)
|
||||
let prev = nc
|
||||
let i = next_i + 1
|
||||
} else {
|
||||
@@ -292,7 +293,7 @@ fn strip_code_comments(s: String) -> String {
|
||||
}
|
||||
}
|
||||
}
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
}
|
||||
@@ -308,7 +309,7 @@ fn strip_code_comments(s: String) -> String {
|
||||
if next_ch == "/" {
|
||||
// URL guard: prev char ':' means this is "://", not a comment.
|
||||
if prev == ":" {
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
} else {
|
||||
@@ -360,7 +361,7 @@ fn strip_code_comments(s: String) -> String {
|
||||
}
|
||||
let prev = ""
|
||||
} else {
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
}
|
||||
@@ -369,23 +370,23 @@ fn strip_code_comments(s: String) -> String {
|
||||
// Open a JS string?
|
||||
if ch == "'" {
|
||||
let in_squote = true
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
} else {
|
||||
if ch == "\"" {
|
||||
let in_dquote = true
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
} else {
|
||||
if ch == "`" {
|
||||
let in_btick = true
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
} else {
|
||||
let out = out + ch
|
||||
let out_parts = native_list_append(out_parts, ch)
|
||||
let prev = ch
|
||||
let i = i + 1
|
||||
}
|
||||
@@ -394,14 +395,14 @@ fn strip_code_comments(s: String) -> String {
|
||||
}
|
||||
}
|
||||
}
|
||||
out
|
||||
str_join(out_parts, "")
|
||||
}
|
||||
|
||||
// scan_string — scan a quoted string literal, handling \" escapes.
|
||||
// Starts AFTER the opening quote. Returns { "text": content, "pos": i_after_close }
|
||||
fn scan_string(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
let i = start
|
||||
let text = ""
|
||||
let parts: [String] = native_list_empty()
|
||||
let running = true
|
||||
while running {
|
||||
if i >= total {
|
||||
@@ -414,26 +415,26 @@ fn scan_string(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
if next_i < total {
|
||||
let next_ch: String = native_list_get(chars, next_i)
|
||||
if next_ch == "\"" {
|
||||
let text = text + "\""
|
||||
let parts = native_list_append(parts, "\"")
|
||||
let i = next_i + 1
|
||||
} else {
|
||||
if next_ch == "n" {
|
||||
let text = text + "\n"
|
||||
let parts = native_list_append(parts, "\n")
|
||||
let i = next_i + 1
|
||||
} else {
|
||||
if next_ch == "t" {
|
||||
let text = text + "\t"
|
||||
let parts = native_list_append(parts, "\t")
|
||||
let i = next_i + 1
|
||||
} else {
|
||||
if next_ch == "r" {
|
||||
let text = text + "\r"
|
||||
let parts = native_list_append(parts, "\r")
|
||||
let i = next_i + 1
|
||||
} else {
|
||||
if next_ch == "\\" {
|
||||
let text = text + "\\"
|
||||
let parts = native_list_append(parts, "\\")
|
||||
let i = next_i + 1
|
||||
} else {
|
||||
let text = text + next_ch
|
||||
let parts = native_list_append(parts, next_ch)
|
||||
let i = next_i + 1
|
||||
}
|
||||
}
|
||||
@@ -448,13 +449,13 @@ fn scan_string(chars: [String], start: Int, total: Int) -> Map<String, Any> {
|
||||
let i = i + 1
|
||||
let running = false
|
||||
} else {
|
||||
let text = text + ch
|
||||
let parts = native_list_append(parts, ch)
|
||||
let i = i + 1
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
{ "text": text, "pos": i }
|
||||
{ "text": str_join(parts, ""), "pos": i }
|
||||
}
|
||||
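The escape handling in `scan_string` can be summarised with a lookup table. The sketch below is a Python model of the visible branches only; how the El code treats a backslash at the very end of input lies outside the shown hunks, so that branch is an assumption.

```python
# Escapes handled explicitly in the hunk above. Any other escaped character
# is kept as-is, with the backslash dropped.
ESCAPES = {'"': '"', "n": "\n", "t": "\t", "r": "\r", "\\": "\\"}


def scan_string(chars, start, total):
    """Scan a string literal body, starting just after the opening quote."""
    parts = []
    i = start
    while i < total:
        ch = chars[i]
        if ch == "\\":
            if i + 1 < total:
                nxt = chars[i + 1]
                parts.append(ESCAPES.get(nxt, nxt))
                i += 2
            else:
                # Assumption: a trailing backslash is skipped.
                i += 1
        elif ch == '"':
            i += 1  # step past the closing quote
            break
        else:
            parts.append(ch)
            i += 1
    return {"text": "".join(parts), "pos": i}
```

The returned `pos` points just past the closing quote, matching the `i_after_close` contract in the function's comment.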
|
||||
// ── Main lexer ────────────────────────────────────────────────────────────────
|
||||
|
||||
@@ -687,6 +687,29 @@ fn parse_stmt(tokens: [Map<String, Any>], pos: Int) -> Map<String, Any> {
|
||||
return make_result({ "stmt": "Return", "value": val }, p)
|
||||
}
|
||||
|
||||
// extern fn declaration (no body — forward declaration for separate compilation)
|
||||
if k == "Extern" {
|
||||
let p = pos + 1
|
||||
let k2: String = tok_kind(tokens, p)
|
||||
if str_eq(k2, "Fn") {
|
||||
let p = p + 1
|
||||
let name: String = tok_value(tokens, p)
|
||||
let p = p + 1
|
||||
let r = parse_params(tokens, p)
|
||||
let params = r["params"]
|
||||
let p = r["pos"]
|
||||
let ret_type = ""
|
||||
let k3: String = tok_kind(tokens, p)
|
||||
if str_eq(k3, "Arrow") {
|
||||
let p = p + 1
|
||||
let kt: String = tok_kind(tokens, p)
|
||||
if str_eq(kt, "Ident") { let ret_type = tok_value(tokens, p) }
|
||||
let p = skip_type(tokens, p)
|
||||
}
|
||||
return make_result({ "stmt": "ExternFn", "name": name, "params": params, "ret_type": ret_type }, p)
|
||||
}
|
||||
}
|
||||
|
||||
// fn definition
|
||||
if k == "Fn" {
|
||||
let p = pos + 1
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
// elb.el — El Build Coordinator
|
||||
// elb.el - El Build Coordinator
|
||||
//
|
||||
// The build system for El programs. Written in El. Builds El.
|
||||
//
|
||||
@@ -16,11 +16,11 @@
|
||||
// 3. For each file: if .el is newer than .elh/.c, compile with elc --emit-header
|
||||
// 4. Link all .c files + el_runtime.c into the final binary
|
||||
//
|
||||
// Each module compiles independently — no 128K-line blobs.
|
||||
// Each module compiles independently - no 128K-line blobs.
|
||||
// Downstream compilations read .elh headers (function signatures only),
|
||||
// not source. Incremental: only recompile what changed.
|
||||
|
||||
// ── Flags ─────────────────────────────────────────────────────────────────────
|
||||
// -- Flags ---------------------------------------------------------------------
|
||||
|
||||
fn flag_bool(argv: [String], name: String) -> Bool {
|
||||
let n: Int = native_list_len(argv)
|
||||
@@ -47,7 +47,7 @@ fn flag_val(argv: [String], name: String, default_val: String) -> String {
|
||||
return default_val
|
||||
}
|
||||
|
||||
// ── Manifest parsing ──────────────────────────────────────────────────────────
|
||||
// -- Manifest parsing ----------------------------------------------------------
|
||||
//
|
||||
// Read the entry file from manifest.el:
|
||||
// build { entry "soul.el" }
|
||||
@@ -100,7 +100,7 @@ fn parse_manifest_name(src: String) -> String {
|
||||
return "out"
|
||||
}
|
||||
|
||||
// ── Path helpers ───────────────────────────────────────────────────────────────
|
||||
// -- Path helpers ---------------------------------------------------------------
|
||||
|
||||
fn dirname_of(path: String) -> String {
|
||||
let n: Int = str_len(path)
|
||||
@@ -148,14 +148,14 @@ fn file_is_newer(a: String, b: String) -> Bool {
|
||||
let cmd: String = "test -f " + b + " && test " + a + " -nt " + b + " && echo yes || echo no"
|
||||
let result: String = str_trim(exec_capture(cmd))
|
||||
if str_eq(result, "yes") { return true }
|
||||
// b doesn't exist — check with test -f
|
||||
// b doesn't exist - check with test -f
|
||||
let exist_cmd: String = "test -f " + b + " && echo exists || echo missing"
|
||||
let exist: String = str_trim(exec_capture(exist_cmd))
|
||||
if str_eq(exist, "missing") { return true }
|
||||
return false
|
||||
}
|
||||
|
||||
// ── Import graph walker ────────────────────────────────────────────────────────
|
||||
// -- Import graph walker --------------------------------------------------------
|
||||
//
|
||||
// Walk import statements in each .el file to build the dependency graph.
|
||||
// Returns a list of absolute paths in topological order (deps before dependents).
|
||||
@@ -219,7 +219,7 @@ fn walk_imports(src_path: String, visited: [String], order: [String]) -> Map<Str
|
||||
return { "visited": visited, "order": order }
|
||||
}
|
||||
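The "deps before dependents" ordering promised by the import graph walker is a post-order depth-first traversal. A minimal Python sketch, in which `imports_of` is a hypothetical dictionary standing in for reading the import statements out of each file, and the lists are mutated in place rather than returned in a map as the El version does:

```python
def walk_imports(path, imports_of, visited, order):
    """Depth-first walk that appends each module after its dependencies."""
    if path in visited:
        return
    visited.add(path)  # mark first so import cycles cannot recurse forever
    for dep in imports_of.get(path, []):
        walk_imports(dep, imports_of, visited, order)
    order.append(path)  # post-order: every dependency is already listed
```

Compiling modules in `order` guarantees that each module's `.elh` header exists before anything that includes it is compiled.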
|
||||
// ── Build ──────────────────────────────────────────────────────────────────────
|
||||
// -- Build ----------------------------------------------------------------------
|
||||
|
||||
fn compile_module(src_path: String, out_dir: String, elc_bin: String, dry_run: Bool, verbose: Bool) -> Bool {
|
||||
let bname: String = basename_noext(src_path)
|
||||
@@ -234,7 +234,9 @@ fn compile_module(src_path: String, out_dir: String, elc_bin: String, dry_run: B
|
||||
return true
|
||||
}
|
||||
|
||||
let cmd: String = elc_bin + " --emit-header " + src_path + " " + c_out
|
||||
// elc streams C to stdout (collect mode not yet implemented); use
|
||||
// shell redirection so the output lands in the file, not the terminal.
|
||||
let cmd: String = elc_bin + " --emit-header " + src_path + " > " + c_out + " 2>&1"
|
||||
println(" compile " + src_path)
|
||||
|
||||
if dry_run { return true }
|
||||
@@ -244,13 +246,23 @@ fn compile_module(src_path: String, out_dir: String, elc_bin: String, dry_run: B
|
||||
println("elb: compile failed: " + src_path)
|
||||
return false
|
||||
}
|
||||
|
||||
// Copy the generated .elh (written next to the source by elc) into
|
||||
// out_dir so that #include "module.elh" lines in the generated .c
|
||||
// files resolve correctly when cc is invoked with -I <out_dir>.
|
||||
let src_elh: String = path_with_ext(src_path, ".elh")
|
||||
let mv_cmd: String = "cp " + src_elh + " " + elh_out + " 2>/dev/null || true"
|
||||
exec_command(mv_cmd)
|
||||
|
||||
return true
|
||||
}
|
||||
|
||||
fn link_binary(c_files: [String], out_bin: String, runtime_path: String, dry_run: Bool) -> Bool {
|
||||
fn link_binary(c_files: [String], out_bin: String, runtime_path: String, out_dir: String, dry_run: Bool) -> Bool {
|
||||
let n: Int = native_list_len(c_files)
|
||||
let parts: [String] = native_list_empty()
|
||||
let parts = native_list_append(parts, "cc -O2 -I " + dirname_of(runtime_path))
|
||||
// Include both the runtime dir (for el_runtime.h) and the output dir
|
||||
// (for module.elh cross-module forward declarations).
|
||||
let parts = native_list_append(parts, "cc -O2 -I " + dirname_of(runtime_path) + " -I " + out_dir)
|
||||
let i = 0
|
||||
while i < n {
|
||||
let f: String = native_list_get(c_files, i)
|
||||
@@ -271,7 +283,7 @@ fn link_binary(c_files: [String], out_bin: String, runtime_path: String, dry_run
|
||||
return true
|
||||
}
|
||||
|
||||
// ── Main ───────────────────────────────────────────────────────────────────────
|
||||
// -- Main -----------------------------------------------------------------------
|
||||
|
||||
fn main() -> Void {
|
||||
let argv: [String] = args()
|
||||
@@ -309,7 +321,7 @@ fn main() -> Void {
|
||||
}
|
||||
}
|
||||
if str_eq(runtime_path, "") {
|
||||
println("elb: cannot locate el_runtime.c — use --runtime=PATH")
|
||||
println("elb: cannot locate el_runtime.c - use --runtime=PATH")
|
||||
exit(1)
|
||||
}
|
||||
|
||||
@@ -357,11 +369,11 @@ fn main() -> Void {
|
||||
|
||||
// Link
|
||||
let out_bin: String = out_dir + "/" + pkg_name
|
||||
let linked: Bool = link_binary(c_files, out_bin, runtime_path, dry_run)
|
||||
let linked: Bool = link_binary(c_files, out_bin, runtime_path, out_dir, dry_run)
|
||||
if !linked {
|
||||
println("elb: link failed")
|
||||
exit(1)
|
||||
}
|
||||
|
||||
println("elb: done → " + out_bin)
|
||||
println("elb: done -> " + out_bin)
|
||||
}
|
||||
|
||||