# El Language Specification
Version 1.2.0 — April 30, 2026
---
## Overview
El is a statically-typed, compiled programming language that serves as the execution substrate for the Neuron agent runtime, the DHARMA network, and the Engram knowledge graph. El compiles to C and links against a fixed runtime, producing native binaries.
El has four defining properties:
1. **Self-hosting compiler.** The compiler (`lexer.el`, `parser.el`, `codegen.el`, `compiler.el`) is written in El. It compiles El source to C, which is then compiled by `cc` against `el_runtime.c` to produce a native binary. A Rust genesis compiler bootstraps the first iteration; the self-hosted binary at `dist/platform/elc` is the canonical compiler thereafter.
2. **C compilation target.** Every compiled program is plain C11. Every El value is `el_val_t` (`int64_t`). Strings are pointers cast through `int64_t`. Functions become C functions; top-level statements become `main()`.
3. **Graph-native runtime.** The runtime provides first-class graph operations (`engram_*`) over an in-process Engram store. CGI programs use these primitives directly; there is no separate database driver.
4. **DHARMA-aware identity.** The `cgi` block declares a program's DHARMA identity at compile time. The runtime resolves identity before any user code runs, so subsequent `dharma_*` calls have a stable principal and channel surface.
---
## Implementation Status
This section is the **single source of truth** for what works and what is planned. Each subsequent section repeats the relevant marker. If a feature here is marked "planned," it appears in the spec because it is in the design, but the current self-hosting toolchain does not yet emit code for it.
### Implemented (today)
- Lexer: keywords, identifiers, integer/float/string/bool literals, and the operators listed in Section 1.6.
- Parser: `let`, `return`, `fn`, `type`, `enum`, `import`, `from … import`, `while`, `for`, `if/else if/else`, `match`, `@decorator`, array/map literals, all listed operators, function calls, field access, index access, unary `!`/`-`, postfix `?`.
- Codegen: function definitions, top-level `main()`, all expression forms above, control flow, decorator-as-AST-attachment.
- C runtime: I/O, string operations, integer math, lists, maps, filesystem, command-line args, basic `json_get` substring lookup.
### Planned (in flight)
- **`%` (modulo) operator.** Currently not lexed. Adding to lexer + parser + codegen.
- **Match codegen.** Currently parsed; codegen does not emit. Adding `({ ... })` statement-expression emission.
- **`?` propagation.** Currently no-op. Adding nil-propagation semantics.
- **`cgi` block parsing.** Currently lexed (`cgi` is a keyword) but not parsed as a statement. Adding `parse_cgi_block` and codegen of `el_cgi_init` at the head of `main()`.
- **VBD role enforcement.** `@manager`/`@engine`/`@accessor` are accepted as decorators but not enforced. Adding compile-time check that `dharma_emit`/`dharma_field` only appear inside `@manager` functions.
- **`vessel` keyword.** Replaces `package` in manifests. Adding to lexer.
- **Real `engram_*` runtime.** Currently stub. Adding in-process graph store with spreading activation, Hebbian strengthening, and disk persistence — see Section 16.4.
- **Real `dharma_*` runtime.** Currently stub. Adding network transport, channel registry, identity resolution.
- **Real `http_get`/`http_post`/`http_serve`.** Currently empty stubs. Adding libcurl-backed client and a thread-pool server.
- **JSON, time, UUID, state, env, additional string/list/math builtins.** See Section 12 for the canonical list.
### Not in this language
- Bitwise operators (`&`, `|`, `^`, `<<`, `>>`). Single `&` is silently consumed by the lexer; single `|` is lexed as `Pipe` but unused. None are parsed as binary operators. Removed from the spec entirely.
- `??` null-coalescing. Was reserved; not lexed. Removed.
- `as` casts. `as` is a keyword but no parse form. Removed from the spec until shipped.
- Floating-point arithmetic distinct from Int. All values are `int64_t` at the C level. `Float` literals are accepted but stored as bit-cast doubles only when the runtime is float-aware (currently only via `float_to_str` and friends in the planned runtime extension).
---
## 1. Lexical Structure
### 1.1 Source Encoding
El source is UTF-8, file extension `.el`.
### 1.2 Comments
```
// Single-line comment — extends to end of line
```
Block comments are not supported.
### 1.3 Whitespace
Spaces, tabs, newlines (`\n`), and carriage returns (`\r`) are whitespace. Whitespace is not significant except as a token separator.
### 1.4 Identifiers
```
identifier = (alpha | '_') (alnum | '_')*
```
Identifiers are case-sensitive. Names beginning with `__` are reserved for compiler-generated symbols.
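
The identifier rule above can be sketched as a small C predicate. This is an illustration, not the lexer's actual code: the `sketch_` names are hypothetical, and it assumes `alpha` and `alnum` mean the ASCII classes, which the spec does not state. Keyword filtering is left out.

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true when s matches (alpha | '_') (alnum | '_')*.
   Assumption: alpha/alnum are the ASCII classes of the "C" locale. */
bool sketch_is_identifier(const char *s) {
    unsigned char c = (unsigned char)*s;
    if (!(isalpha(c) || c == '_')) return false;  /* also rejects "" */
    for (s++; *s; s++) {
        c = (unsigned char)*s;
        if (!(isalnum(c) || c == '_')) return false;
    }
    return true;
}

/* Names beginning with "__" are reserved for compiler-generated symbols. */
bool sketch_is_reserved_name(const char *s) {
    return s[0] == '_' && s[1] == '_';
}
```

Under these assumptions `foo_1` and `_` are accepted, while `1foo` and `a-b` are rejected.
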
### 1.5 Keywords
The following words are reserved and cannot be used as identifiers. Each row notes whether the parser currently consumes the keyword as a structural form.
| Keyword | Parsed today | Notes |
|---------|--------------|-------|
| `let` | yes | Variable binding |
| `fn` | yes | Function definition |
| `type` | yes | Struct definition |
| `enum` | yes | Enum definition |
| `match` | yes | Pattern match expression |
| `return` | yes | Function return |
| `if` / `else` | yes | Conditional |
| `for` / `in` | yes | Iteration |
| `while` | yes | Loop |
| `import` / `from` | yes | Module import |
| `as` | reserved | Keyword with no parse form (see "Not in this language") |
| `true` / `false` | yes | Bool literals |
| `cgi` | planned | Top-level CGI declaration block |
| `manager` / `engine` / `accessor` | as decorators | VBD role marker on `fn` (enforcement planned) |
| `vessel` | planned | Manifest declaration (replaces `package`) |
| `activate` / `where` | planned | Spreading-activation construct |
| `sealed` | planned | Capability scope block |
| `with` | reserved | No parse form yet |
| `test` / `seed` / `assert` | reserved | Testing primitives, no parse form |
| `protocol` / `impl` | reserved | Trait-like, no parse form |
| `retry` / `times` / `fallback` / `reason` | reserved | Resilience primitives, no parse form |
| `parallel` / `trace` | reserved | Concurrency, no parse form |
| `requires` / `deploy` / `to` / `via` / `target` | reserved | Deployment surface, no parse form |
### 1.6 Token Types
| Token | Pattern | Notes |
|-------|---------|-------|
| `Int` | `[0-9]+` | |
| `Float` | `[0-9]+ '.' [0-9]+` | |
| `Str` | `"…"` with `\"`, `\n`, `\t`, `\\` escapes | |
| `Bool` | `true` or `false` | |
| `Ident` | identifier (not a keyword) | |
| keyword tokens | one per keyword above | e.g. `Let`, `Fn`, `If` |
| `Eq` | `=` | |
| `EqEq` | `==` | |
| `NotEq` | `!=` | |
| `Not` | `!` | |
| `Lt` `LtEq` `Gt` `GtEq` | `<` `<=` `>` `>=` | |
| `And` | `&&` | Single `&` is consumed and discarded |
| `Or` | `\|\|` | |
| `Pipe` | `\|` | Lexed; not used by parser |
| `PipeOp` | `\|>` | Lexed; not used by parser |
| `Plus` `Minus` `Star` `Slash` | `+` `-` `*` `/` | |
| `Arrow` | `->` | |
| `FatArrow` | `=>` | |
| `Colon` `ColonColon` | `:` `::` | |
| `LParen` `RParen` `LBrace` `RBrace` `LBracket` `RBracket` | `(` `)` `{` `}` `[` `]` | |
| `Comma` `Dot` `Semicolon` | `,` `.` `;` | |
| `At` `QuestionMark` | `@` `?` | |
| `Eof` | end-of-input sentinel | |
### 1.7 String Escapes
| Sequence | Character |
|----------|-----------|
| `\n` | Newline (U+000A) |
| `\t` | Tab (U+0009) |
| `\"` | Double quote |
| `\\` | Backslash |
| (other) | Character as-is |
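
The escape table can be read as the following decoding loop. It is a minimal sketch with a hypothetical function name; how the real lexer treats a lone trailing backslash is not specified, so copying it through unchanged is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Decodes the escapes from Section 1.7 into a newly allocated string.
   Assumption: a backslash at the very end of the input is kept as-is. */
char *sketch_unescape(const char *raw) {
    char *out = malloc(strlen(raw) + 1);  /* output is never longer than input */
    if (!out) abort();
    char *w = out;
    for (const char *r = raw; *r; r++) {
        if (*r == '\\' && r[1] != '\0') {
            r++;
            switch (*r) {
            case 'n': *w++ = '\n'; break;
            case 't': *w++ = '\t'; break;
            default:  *w++ = *r;   break;  /* \" and \\ and any other: as-is */
            }
        } else {
            *w++ = *r;
        }
    }
    *w = '\0';
    return out;
}
```

The `default` arm covers both the listed `\"` and `\\` rows and the "(other)" row, since all three emit the character after the backslash.
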
---
## 2. Type System
### 2.1 Primitive Types
| Type | Description | Example literals |
|------|-------------|------------------|
| `Int` | 64-bit signed integer | `42`, `-7` |
| `Float` | 64-bit double (planned float arithmetic; today stored as int64) | `3.14`, `0.5` |
| `String` | UTF-8 string | `"hello"` |
| `Bool` | Boolean | `true`, `false` |
| `Void` | No value | — |
| `Any` | Dynamically-typed value | (for generic containers) |
### 2.2 Composite Types
| Type form | Description |
|-----------|-------------|
| `[T]` | Array of `T` |
| `T?` | Optional `T` (planned propagation; today the `?` postfix is no-op) |
| `Map<K, V>` | Key-value map |
| Named type | User-defined struct or enum |
### 2.3 Type Annotations
Type annotations appear in `let` bindings and function signatures. The current compiler **parses and skips** annotations — no type checking is performed at compile time. They serve as documentation. A type checker is planned.
```
let x: Int = 42
fn greet(name: String) -> String { … }
```
### 2.4 Optional Types
`T?` is accepted in type position. The postfix `?` on an expression produces a `Try` AST node. Today the codegen passes the inner expression through transparently; nil propagation is planned.
---
## 3. Variables and Bindings
### 3.1 Let Bindings
```
let name: Type = expression
let name = expression
```
All bindings are block-scoped. El allows re-binding the same name in the same scope; the codegen emits a plain assignment instead of a redeclaration:
```
let count = 0
let count = count + 1 // plain `count = count + 1;` in C
```
### 3.2 Scope
Bindings are valid from the point of declaration to the end of the enclosing block. Function bodies, `if` arms, `for` bodies, `while` bodies, and explicit `{ … }` blocks each introduce a new C scope.
---
## 4. Functions
### 4.1 Definition
```
fn name(p1: T1, p2: T2) -> R {
    // body
    return expr
}
```
Compiles to `el_val_t name(el_val_t p1, el_val_t p2)`. The return type is parsed and skipped. A `return 0;` is appended automatically at the end of every function body.
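
As an illustration of that shape, a hypothetical El function `fn area(w: Int, h: Int) -> Int { return w * h }` might be emitted roughly as below. The function name, the parenthesization, and the layout are assumptions; only the signature form and the trailing `return 0;` come from the text above.

```c
#include <stdint.h>

typedef int64_t el_val_t;  /* normally provided by el_runtime.h */

/* Forward declaration, emitted before any function body (Section 4.2). */
el_val_t area(el_val_t w, el_val_t h);

el_val_t area(el_val_t w, el_val_t h) {
    return (w * h);
    return 0;  /* appended automatically; unreachable in this case */
}
```

The appended `return 0;` only matters for bodies that fall off the end without an explicit `return`.
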
### 4.2 Forward Declarations
The codegen emits forward declarations for all top-level `fn` definitions before any function body. Mutual recursion within a file is supported.
### 4.3 Return
```
return expression
return // bare; compiles to `return 0;`
```
A bare `return` followed by `}` or end-of-file compiles to `return 0;`.
### 4.4 Calling Convention
All arguments are `el_val_t`, passed in declaration order.
#### Method-style calls
`obj.method(args…)` compiles to `method(obj, args…)`. The runtime exports short-name aliases for common operations:
| El source | Emitted C | Runtime alias |
|-----------|-----------|---------------|
| `list.append(x)` | `append(list, x)` | `el_list_append` |
| `list.len()` | `len(list)` | `el_list_len` |
| `list.get(i)` | `get(list, i)` | `el_list_get` |
| `map.map_get(k)` | `map_get(map, k)` | `el_map_get` |
| `map.map_set(k, v)` | `map_set(map, k, v)` | `el_map_set` |
---
## 5. Control Flow
### 5.1 If / Else
```
if cond { … } else if cond2 { … } else { … }
```
`if` may appear as an expression (RHS of `let`, etc.); when used in expression position the codegen emits a ternary stub. `if` as a statement emits standard `if (…) { … } else { … }` C code.
### 5.2 While
```
while cond { … }
```
Exits when `cond` is `0`.
### 5.3 For
```
for item in list { … }
```
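Compiles to an index-based C `for` loop over the evaluated list:
```c
{
    el_val_t _el_lst = <list>;
    el_val_t _el_len = el_list_len(_el_lst);
    for (el_val_t _el_i = 0; _el_i < _el_len; _el_i++) {
        el_val_t item = el_list_get(_el_lst, _el_i);
        // body
    }
}
```
The length is read once before the loop starts, so the iteration count is fixed on entry.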
### 5.4 Match
```
match expression {
    Pattern1 => result_expr
    Pattern2 => result_expr
    _ => default_expr
}
```
Pattern forms today:
| Pattern | Meaning |
|---------|---------|
| `_` | Wildcard — always matches |
| `name` | Binding — captures subject as `name` |
| `42` | Integer literal |
| `"str"` | String literal |
| `true` / `false` | Boolean literal |
Enum-variant patterns (`EnumName::Variant`) are reserved but not yet parsed.
**Codegen status:** `match` is parsed today, but the codegen emits no code for it. The planned runtime extension will emit each `match` as a `({ … })` statement-expression, which is a GNU C extension (supported by GCC and Clang) rather than part of ISO C11.
---
## 6. Data Structures
### 6.1 Struct Types
```
type TypeName {
    field1: Type1
    field2: Type2
}
```
Type definitions are parsed and recorded; no C type is emitted. Struct values at runtime are `ElMap`. Field access `value.field` compiles to `el_get_field(value, "field")`.
Optional commas between fields are accepted.
### 6.2 Enum Types
```
enum EnumName {
    Variant1
    Variant2
    VariantWithPayload(PayloadType)
}
```
Variant names are recorded. The current codegen does not emit a dedicated C enum type; values are represented as strings or maps. Payload variants accept the `(Type)` syntax but the parser records only the variant name.
### 6.3 Array Literals
```
let numbers = [1, 2, 3]
let empty = []
```
Compile to `el_list_new(3, 1, 2, 3)` and `el_list_new(0)`.
### 6.4 Map Literals
```
let m = { "k1": v1, "k2": v2 }
```
Keys are `Str` tokens. Compiles to `el_map_new(2, "k1", v1, "k2", v2)`.
### 6.5 Field and Index Access
```
let f = struct_value.field_name // el_get_field(struct_value, "field_name")
let e = array[0] // el_list_get(array, 0)
let v = map["key"] // el_list_get(map, "key")
```
Index access always compiles to `el_list_get`, including the `map["key"]` form shown above. Out-of-bounds access returns `0` (`EL_NULL`).
---
## 7. Operators
### 7.1 Arithmetic
| Operator | Status | Notes |
|----------|--------|-------|
| `+` | implemented | Int addition or String concatenation (heuristic) |
| `-` | implemented | Subtraction; also unary negation |
| `*` | implemented | Multiplication |
| `/` | implemented | Integer division |
| `%` | planned | Modulo. Not currently lexed. |
**`+` dispatch:** the codegen inspects operand AST node kinds. If either side is `Str`, a chained `+`, a `Call`, or an `Ident`, it emits `el_str_concat(a, b)`. If both sides are `Int` literals, it emits arithmetic `+`. Mixed Int+Ident is treated as string concatenation by default — explicit casts are required for arithmetic on Ident-typed integers.
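
The dispatch rule can be written as a predicate over operand node kinds. This is a sketch: the enum and function names are hypothetical (the spec does not name the codegen's AST kinds), and the fallback to arithmetic for operand kinds the rule does not mention is an assumption.

```c
#include <stdbool.h>

/* Hypothetical AST node kinds; NODE_PLUS stands for a chained `+`. */
typedef enum {
    NODE_INT, NODE_STR, NODE_IDENT, NODE_CALL, NODE_PLUS, NODE_OTHER
} sketch_kind;

static bool looks_stringy(sketch_kind k) {
    return k == NODE_STR || k == NODE_PLUS || k == NODE_CALL || k == NODE_IDENT;
}

/* true  -> emit el_str_concat(a, b)
   false -> emit arithmetic a + b
   Assumption: kinds not covered by the rule fall back to arithmetic. */
bool sketch_plus_is_concat(sketch_kind lhs, sketch_kind rhs) {
    return looks_stringy(lhs) || looks_stringy(rhs);
}
```

Read this way, `1 + 2` is arithmetic, while `n + 1` with `n` an identifier is emitted as a concatenation, which matches the "Mixed Int+Ident" sentence above.
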
### 7.2 Comparison
| Operator | Behavior |
|----------|----------|
| `==` | `str_eq(a, b)` for Str/Ident/Call operands; `==` for Int/Bool |
| `!=` | Negation of the above |
| `<` `>` `<=` `>=` | Integer comparison |
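
Because strings are pointers stored in an `int64_t`, a raw C `==` compares addresses rather than contents, which is presumably why `==` on string-like operands is routed through `str_eq`. The sketch below shows the difference; the `SKETCH_STR`/`SKETCH_CSTR` macros are assumed stand-ins for the runtime's `EL_STR`/`EL_CSTR`, whose real definitions are not shown in this spec.

```c
#include <stdint.h>
#include <string.h>

typedef int64_t el_val_t;

/* Assumed shapes for the runtime's cast macros. */
#define SKETCH_STR(s)  ((el_val_t)(intptr_t)(s))
#define SKETCH_CSTR(v) ((const char *)(intptr_t)(v))

/* Content comparison: 1 when both strings hold the same bytes, else 0. */
el_val_t sketch_str_eq(el_val_t a, el_val_t b) {
    return strcmp(SKETCH_CSTR(a), SKETCH_CSTR(b)) == 0;
}

/* Raw comparison: what a plain C == does on two el_val_t values. */
el_val_t sketch_raw_eq(el_val_t a, el_val_t b) {
    return a == b;
}
```

Two separate buffers that both hold `"el"` compare equal under `sketch_str_eq` but unequal under `sketch_raw_eq`; for plain integers the raw comparison is the right one.
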
### 7.3 Logical
| Operator | Description |
|----------|-------------|
| `&&` | Short-circuit AND |
| `\|\|` | Short-circuit OR |
| `!` | Unary NOT |
### 7.4 Unary
| Operator | Meaning |
|----------|---------|
| `!` | Logical NOT — emits `!expr` |
| `-` | Negation — emits `(-expr)` |
| `?` | Try (postfix) — pass-through today; nil-propagation planned |
### 7.5 Precedence (high to low)
1. `*` `/` `%`: precedence 6 (`%` is planned; see Section 7.1)
2. `+` `-`: precedence 5
3. `<` `>` `<=` `>=`: precedence 4
4. `==` `!=`: precedence 3
5. `&&`: precedence 2
6. `||`: precedence 1
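
One common way to apply a numeric precedence table like this is precedence climbing. The sketch below is not the El parser: it is a small evaluator over integer literals and parentheses that uses the same precedence numbers. Left-associativity, the omission of the planned `%`, eager evaluation of both operands of `&&`/`||`, and returning `0` on division by zero are all assumptions made for the example.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

typedef int64_t el_val_t;

static const char *cur;  /* cursor into the expression text */

/* Binding power from Section 7.5; 0 means "not a binary operator". */
static int sketch_prec(const char *op) {
    if (!strcmp(op, "*") || !strcmp(op, "/")) return 6;
    if (!strcmp(op, "+") || !strcmp(op, "-")) return 5;
    if (!strcmp(op, "<") || !strcmp(op, ">") ||
        !strcmp(op, "<=") || !strcmp(op, ">=")) return 4;
    if (!strcmp(op, "==") || !strcmp(op, "!=")) return 3;
    if (!strcmp(op, "&&")) return 2;
    if (!strcmp(op, "||")) return 1;
    return 0;
}

/* Copies the operator at the cursor into op without consuming it;
   returns its length, or 0 if there is none. */
static int peek_op(char op[3]) {
    while (*cur == ' ') cur++;
    op[0] = op[1] = op[2] = '\0';
    if (*cur == '\0') return 0;
    if ((cur[1] == '=' && strchr("<>=!", cur[0])) ||
        (cur[0] == '&' && cur[1] == '&') ||
        (cur[0] == '|' && cur[1] == '|')) {
        op[0] = cur[0]; op[1] = cur[1];
        return 2;
    }
    if (strchr("*/+-<>", cur[0])) { op[0] = cur[0]; return 1; }
    return 0;
}

static el_val_t apply(const char *op, el_val_t a, el_val_t b) {
    if (!strcmp(op, "*"))  return a * b;
    if (!strcmp(op, "/"))  return b != 0 ? a / b : 0;  /* assumption */
    if (!strcmp(op, "+"))  return a + b;
    if (!strcmp(op, "-"))  return a - b;
    if (!strcmp(op, "<"))  return a < b;
    if (!strcmp(op, ">"))  return a > b;
    if (!strcmp(op, "<=")) return a <= b;
    if (!strcmp(op, ">=")) return a >= b;
    if (!strcmp(op, "==")) return a == b;
    if (!strcmp(op, "!=")) return a != b;
    if (!strcmp(op, "&&")) return a && b;
    return a || b;  /* "||" */
}

static el_val_t parse_expr(int min_prec);

static el_val_t parse_primary(void) {
    while (*cur == ' ') cur++;
    if (*cur == '(') {
        cur++;
        el_val_t inner = parse_expr(1);
        while (*cur == ' ') cur++;
        if (*cur == ')') cur++;
        return inner;
    }
    el_val_t n = 0;
    while (isdigit((unsigned char)*cur)) n = n * 10 + (*cur++ - '0');
    return n;
}

/* Consumes operators whose precedence is at least min_prec. */
static el_val_t parse_expr(int min_prec) {
    el_val_t lhs = parse_primary();
    for (;;) {
        char op[3];
        int len = peek_op(op);
        int prec = sketch_prec(op);
        if (len == 0 || prec < min_prec) return lhs;
        cur += len;
        lhs = apply(op, lhs, parse_expr(prec + 1));  /* left-associative */
    }
}

el_val_t sketch_eval(const char *src) {
    cur = src;
    return parse_expr(1);
}
```

With this table `1 + 2 * 3` groups as `1 + (2 * 3)` and `0 || 1 && 0` groups as `0 || (1 && 0)`.
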
---
## 8. Module System
### 8.1 Import
```
import "filename.el"
```
Records an `Import` AST node. The compiler concatenates all imports into the single C output; the linker produces one binary.
### 8.2 From-Import
```
from module_name import { Name1, Name2 }
```
Parsed. The module name is recorded; the brace-list is consumed. Both forms produce `Import` nodes. Selective import (importing only specific names) is parsed but not yet enforced — all names in the imported file are visible.
---
## 9. Decorators
```
@manager
fn handle(channel: String, msg: String) -> Void { … }
```
The `@` token followed by an identifier attaches a decorator name to the next `FnDef`. Decorators with structural meaning today: none. Planned enforcement (Section 16.2): VBD roles `@manager`, `@engine`, `@accessor`.
Non-VBD decorators are accepted and ignored.
---
## 10. The Activate Construct [planned]
```
activate TypeName where "semantic query string"
```
`activate` and `where` are reserved keywords. Lexed today, no parse form. The planned semantics: compile to a runtime call into the local Engram graph that performs spreading-activation retrieval, returning a typed list of nodes that match `TypeName`.
---
## 11. The Sealed Block [planned]
```
sealed {
    let api_key = "sk-prod-12345"
}
```
`sealed` is a reserved keyword. Planned semantics: a capability scope where access to certain runtime services (filesystem, network) is restricted by default and explicit allow-lists must be declared.
---
## 12. Standard Library Builtins
Builtins live in `el_runtime.c` / `el_runtime.h`. Programs call them by name; no import is required. The status column reflects the canonical self-hosting runtime. Anything marked **planned** is in flight as part of the in-progress runtime extension.
### 12.1 I/O — implemented
| Builtin | Description |
|---------|-------------|
| `println(s)` | Print string + newline |
| `print(s)` | Print string |
| `readline()` | Read one line from stdin |
| `args()` | Command-line arguments as a `[String]` (excludes argv[0]) |
### 12.2 String — implemented unless noted
| Builtin | Description | Status |
|---------|-------------|--------|
| `str_eq(a, b)` | String equality | implemented |
| `str_starts_with(s, p)` | Prefix test | implemented |
| `str_ends_with(s, suf)` | Suffix test | implemented |
| `str_contains(s, sub)` | Substring test | implemented |
| `str_len(s)` | Byte length | implemented |
| `str_slice(s, start, end)` | Substring (byte offsets) | implemented |
| `str_replace(s, from, to)` | Replace all | implemented |
| `str_to_upper(s)` / `str_to_lower(s)` | Case fold | implemented |
| `str_trim(s)` | Strip whitespace | implemented |
| `str_concat(a, b)` | Concatenate | implemented |
| `int_to_str(n)` | Format Int | implemented |
| `str_to_int(s)` | Parse Int | implemented |
| `str_to_float(s)` | Parse Float | planned |
| `str_index_of(s, sub)` | Position of substring; `-1` if absent | planned |
| `str_split(s, sep)` | Split on separator → `[String]` | planned |
| `str_char_at(s, i)` | Character at byte index | planned |
| `str_char_code(s, i)` | Unicode code point | planned |
| `str_pad_left(s, w, p)` / `str_pad_right(s, w, p)` | Pad to width | planned |
| `str_format(template, data)` | `{key}` interpolation | planned |
| `str_lower(s)` / `str_upper(s)` | Aliases for `str_to_lower`/`str_to_upper` | planned |
### 12.3 Math — partial
| Builtin | Description | Status |
|---------|-------------|--------|
| `el_abs(n)` | Absolute value | implemented |
| `el_max(a, b)` | Maximum | implemented |
| `el_min(a, b)` | Minimum | implemented |
| `math_sqrt(f)` | Square root | planned |
| `math_log(f)` / `math_ln(f)` | Logarithms | planned |
| `math_sin(f)` / `math_cos(f)` / `math_pi()` | Trig | planned |
### 12.4 List — implemented unless noted
| Builtin | Description | Status |
|---------|-------------|--------|
| `el_list_empty()` | Empty list | implemented |
| `el_list_new(count, …)` | List from N values (varargs; emitted for array literals) | implemented |
| `el_list_len(list)` | Length | implemented |
| `el_list_get(list, i)` | Element at index; `0` on out-of-bounds | implemented |
| `el_list_append(list, e)` | Append; returns updated list | implemented |
| `list_push(list, e)` | Alias for `el_list_append` | planned |
| `list_push_front(list, e)` | Prepend | planned |
| `list_join(list, sep)` | Join → `String` | planned |
| `list_range(start, end)` | Integer range `[start, end)` | planned |
List append returns a new (or reallocated) list pointer; the return value must be used.
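
To make the "return value must be used" rule concrete, here is a minimal list sketch in which the whole list is one allocation, so growing it can move it. The layout and all `sketch_` names are assumptions; the real `el_runtime.c` representation is not described in this spec. `sketch_sum` follows the loop shape from Section 5.3.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t el_val_t;

/* Assumed layout: header plus elements in a single allocation. */
typedef struct {
    int64_t len;
    int64_t cap;
    el_val_t items[];
} sketch_list;

static sketch_list *as_list(el_val_t v) { return (sketch_list *)(intptr_t)v; }

el_val_t sketch_list_empty(void) {
    sketch_list *l = malloc(sizeof *l + 4 * sizeof(el_val_t));
    if (!l) abort();
    l->len = 0;
    l->cap = 4;
    return (el_val_t)(intptr_t)l;
}

/* May reallocate, so callers must keep the returned value. */
el_val_t sketch_list_append(el_val_t list, el_val_t e) {
    sketch_list *l = as_list(list);
    if (l->len == l->cap) {
        int64_t cap = l->cap * 2;
        l = realloc(l, sizeof *l + (size_t)cap * sizeof(el_val_t));
        if (!l) abort();
        l->cap = cap;
    }
    l->items[l->len++] = e;
    return (el_val_t)(intptr_t)l;
}

el_val_t sketch_list_len(el_val_t list) { return as_list(list)->len; }

/* Out-of-bounds reads return 0, matching the table above. */
el_val_t sketch_list_get(el_val_t list, el_val_t i) {
    sketch_list *l = as_list(list);
    if (i < 0 || i >= l->len) return 0;
    return l->items[i];
}

/* Varargs constructor; every variadic argument must already be el_val_t. */
el_val_t sketch_list_new(el_val_t count, ...) {
    va_list ap;
    el_val_t list = sketch_list_empty();
    va_start(ap, count);
    for (el_val_t i = 0; i < count; i++)
        list = sketch_list_append(list, va_arg(ap, el_val_t));
    va_end(ap);
    return list;
}

/* The Section 5.3 loop shape, written by hand. */
el_val_t sketch_sum(el_val_t list) {
    el_val_t total = 0;
    el_val_t len = sketch_list_len(list);
    for (el_val_t i = 0; i < len; i++) total += sketch_list_get(list, i);
    return total;
}
```

One design point worth noting: with a C varargs constructor, bare `int` literals must be cast to `el_val_t` at the call site, for example `sketch_list_new(3, (el_val_t)1, (el_val_t)2, (el_val_t)3)`, because `va_arg` reads each argument with the type named in the call to it.
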
### 12.5 Map — implemented
| Builtin | Description |
|---------|-------------|
| `el_map_new(count, …)` | Map from key/value pairs (emitted for map literals) |
| `el_map_get(map, key)` | Value by key |
| `el_map_set(map, key, value)` | Set; returns map |
| `el_get_field(map, key)` | Alias; emitted for `.field` |
### 12.6 HTTP — planned
| Builtin | Status |
|---------|--------|
| `http_get(url)` | stub today; libcurl impl planned |
| `http_post(url, body)` | stub today; libcurl impl planned |
| `http_serve(port, handler)` | stub today; thread-pool impl planned |
`handler` is a function value of type `(method: String, path: String, body: String) -> String`. The server thread invokes it on every request.
### 12.7 Filesystem — implemented
| Builtin | Description |
|---------|-------------|
| `fs_read(path)` | Read file → `String`; `""` on error |
| `fs_write(path, content)` | Write `String`; returns `1` on success, `0` otherwise |
### 12.8 JSON — partial
| Builtin | Description | Status |
|---------|-------------|--------|
| `json_get(json, key)` | Substring lookup of `"key":` value | implemented |
| `json_parse(s)` | Parse JSON string → `List` or `Map` | planned |
| `json_stringify(v)` | Serialize `Any` → `String` | planned |
| `json_get_string(j, key)` | Typed extract: String | planned |
| `json_get_int(j, key)` | Typed extract: Int | planned |
| `json_get_float(j, key)` | Typed extract: Float | planned |
| `json_get_bool(j, key)` | Typed extract: Bool | planned |
| `json_get_raw(j, key)` | Extract nested object/array as JSON String | planned |
| `json_set(j, key, value)` | Update field, return new JSON String | planned |
| `json_array_len(j)` | Length of JSON array string | planned |
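
The "substring lookup" behavior of `json_get` could look something like the sketch below. Only the idea of searching for `"key":` comes from the table; how the value is delimited (closing quote for strings, next `,` `}` or `]` otherwise) and the empty-string result for a missing key are assumptions, and the function name is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the value text following "key":,
   or an allocated empty string when the key is not found. */
char *sketch_json_get(const char *json, const char *key) {
    size_t klen = strlen(key);
    char *needle = malloc(klen + 4);  /* quote + key + quote + colon + NUL */
    if (!needle) abort();
    snprintf(needle, klen + 4, "\"%s\":", key);
    const char *hit = strstr(json, needle);
    free(needle);

    const char *start = "";
    const char *end = start;
    if (hit) {
        start = hit + klen + 3;
        while (*start == ' ') start++;
        if (*start == '"') {
            /* String value: copy up to the next unescaped quote. */
            end = ++start;
            while (*end && *end != '"') {
                if (*end == '\\' && end[1]) end++;
                end++;
            }
        } else {
            /* Bare value: copy up to the next delimiter. */
            end = start;
            while (*end && *end != ',' && *end != '}' && *end != ']') end++;
        }
    }
    size_t n = (size_t)(end - start);
    char *out = malloc(n + 1);
    if (!out) abort();
    memcpy(out, start, n);
    out[n] = '\0';
    return out;
}
```

A lookup of this kind does not understand nesting: it returns the first textual match, even if that match sits inside a nested object or a string value, which is one reason a real parser (`json_parse`) is on the planned list.
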
### 12.9 Process — implemented
| Builtin | Description |
|---------|-------------|
| `exit_program(code)` | Exit with code |
| `args()` | (see Section 12.1) |
### 12.10 Time — planned
| Builtin | Description |
|---------|-------------|
| `time_now()` | Unix epoch milliseconds (Int) |
| `time_now_utc()` | Same; explicit UTC |
| `time_format(ts, fmt)` | Format timestamp; `"ISO"` for ISO 8601 |
| `time_to_parts(ts)` | Decompose to `Map` of fields |
| `time_from_parts(secs, ns, tz)` | Construct |
| `time_add(ts, n, unit)` | Add duration; unit ∈ `"ms"`, `"sec"`, `"day"`, etc. |
| `time_diff(ts1, ts2, unit)` | Difference |
### 12.11 Identifiers — planned
| Builtin | Description |
|---------|-------------|
| `uuid_new()` | RFC 4122 v4 UUID String |
| `uuid_v4()` | Alias for `uuid_new` |
### 12.12 Float Formatting — planned
| Builtin | Description |
|---------|-------------|
| `float_to_str(f)` | Default float string |
| `int_to_float(n)` | Widen Int → Float |
| `float_to_int(f)` | Truncate Float → Int |
| `format_float(f, decimals)` | Format with N decimal places |
| `decimal_round(f, decimals)` | Round to N decimals |
### 12.13 Process Environment — planned
| Builtin | Description |
|---------|-------------|
| `env(key)` | Read environment variable; `""` when unset |
### 12.14 In-Process State — planned
| Builtin | Description |
|---------|-------------|
| `state_set(key, value)` | Store in process-global key/value table |
| `state_get(key)` | Retrieve; `""` if absent |
| `state_del(key)` | Delete |
| `state_keys()` | All keys as `[String]` |
State persists for the lifetime of the OS process. Used by HTTP servers to share data between request handlers.
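
A process-global table of this kind can be sketched with a fixed-size array. Everything here is illustrative: the capacity, the `sketch_` names, and the C-string signatures are assumptions, and the sketch has no locking, which a real implementation shared between server threads would need.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_STATE_MAX 64  /* assumed capacity */

static struct { char *key; char *value; } sketch_state[SKETCH_STATE_MAX];

static char *sketch_dup(const char *s) {
    size_t n = strlen(s) + 1;
    char *d = malloc(n);
    if (!d) abort();
    memcpy(d, s, n);
    return d;
}

/* Index of key in the table, or -1 when absent. */
static int sketch_find(const char *key) {
    for (int i = 0; i < SKETCH_STATE_MAX; i++)
        if (sketch_state[i].key && strcmp(sketch_state[i].key, key) == 0)
            return i;
    return -1;
}

/* Stores or overwrites; returns 1 on success, 0 when the table is full. */
int sketch_state_set(const char *key, const char *value) {
    int i = sketch_find(key);
    if (i < 0) {
        for (i = 0; i < SKETCH_STATE_MAX && sketch_state[i].key; i++) {}
        if (i == SKETCH_STATE_MAX) return 0;
        sketch_state[i].key = sketch_dup(key);
    } else {
        free(sketch_state[i].value);
    }
    sketch_state[i].value = sketch_dup(value);
    return 1;
}

/* Returns "" when the key is absent, as in the table above. */
const char *sketch_state_get(const char *key) {
    int i = sketch_find(key);
    return i < 0 ? "" : sketch_state[i].value;
}

void sketch_state_del(const char *key) {
    int i = sketch_find(key);
    if (i < 0) return;
    free(sketch_state[i].key);
    free(sketch_state[i].value);
    sketch_state[i].key = NULL;
    sketch_state[i].value = NULL;
}
```

Because the table is a static array, its contents live exactly as long as the OS process, which is the lifetime the paragraph above describes.
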
### 12.15 Native Compiler Primitives — implemented
These are used by the self-hosting compiler source and are thin aliases over the runtime list/string operations.
| Builtin | Description |
|---------|-------------|
| `native_list_empty()` | Empty list |
| `native_list_append(l, v)` | Append |
| `native_list_get(l, idx)` | Element at index |
| `native_list_len(l)` | Length |
| `native_string_chars(s)` | Split string → `[String]` of one-character strings |
| `native_int_to_str(n)` | Format integer |
---
## 13. Compilation Model
### 13.1 Pipeline
```
source.el
→ [Lexer] → token list
→ [Parser] → AST (list of statement maps)
→ [Codegen] → C source (streamed to stdout)
→ [cc] → native binary
```
The codegen streams output line-by-line via `println` to avoid `O(n²)` string concatenation.
### 13.2 Self-Hosting Architecture
The El compiler lives in `el-compiler/src/`:
- `lexer.el` — tokenizer
- `parser.el` — recursive-descent parser
- `codegen.el` — C emitter
- `compiler.el` — pipeline wiring + `main()` entry
These are concatenated into `elc-combined.el`, the single-file bootstrap edition. The self-hosted compiler binary lives at `dist/platform/elc`; once it exists, `elc` compiles itself and all other El programs.
### 13.3 C Runtime
Every compiled program links against:
- `el_runtime.h` — declaration header
- `el_runtime.c` — implementation
Compile command:
```
cc -std=c11 -I<runtime-dir> -o <prog> <prog>.c el_runtime.c
```
### 13.4 Output Format
```c
#include <stdint.h>
#include <stdlib.h>
#include "el_runtime.h"
// Forward declarations
el_val_t fn1(el_val_t p1, el_val_t p2);
// Function definitions
el_val_t fn1(el_val_t p1, el_val_t p2) {
    return 0;
}
// main(): top-level El statements
int main(int argc, char** argv) {
    el_runtime_init_args(argc, argv);
    // planned: el_cgi_init(…) is emitted here when a cgi block is present
    return 0;
}
```
All values are `el_val_t` (`int64_t`). Strings are pointers cast to `int64_t` via `EL_STR(s)` / `EL_CSTR(v)`.
---
## 14. Grammar (EBNF)
```ebnf
program          = stmt* EOF
stmt             = let_stmt
                 | return_stmt
                 | fn_def
                 | type_def
                 | enum_def
                 | import_stmt
                 | from_import_stmt
                 | while_stmt
                 | for_stmt
                 | decorator_stmt
                 | cgi_block          (* planned *)
                 | sealed_block       (* planned *)
                 | expr_stmt
let_stmt         = "let" IDENT (":" type_expr)? "=" expr
return_stmt      = "return" expr?
fn_def           = "fn" IDENT "(" param_list ")" ("->" type_expr)? "{" stmt* "}"
type_def         = "type" IDENT "{" (IDENT ":" type_expr ","?)* "}"
enum_def         = "enum" IDENT "{" (IDENT ("(" type_expr ")")? ","?)* "}"
import_stmt      = "import" STRING
from_import_stmt = "from" IDENT "import" "{" (IDENT ","?)* "}"
while_stmt       = "while" expr "{" stmt* "}"
for_stmt         = "for" IDENT "in" expr "{" stmt* "}"
decorator_stmt   = "@" IDENT stmt
cgi_block        = "cgi" STRING "{" cgi_field* "}"   (* planned *)
cgi_field        = IDENT ":" STRING                  (* planned *)
expr_stmt        = expr
param_list       = (param ("," param)*)?
param            = IDENT ":" type_expr
type_expr        = IDENT
                 | "[" type_expr "]"
                 | type_expr "?"
                 | IDENT "<" type_expr ("," type_expr)* ">"
expr             = binop_expr
binop_expr       = unary_expr (binop unary_expr)*
binop            = "||" | "&&" | "==" | "!=" | "<" | ">" | "<=" | ">=" | "+" | "-" | "*" | "/" | "%"
unary_expr       = "!" primary | "-" primary | postfix_expr
postfix_expr     = primary ("." IDENT | "(" arg_list ")" | "[" expr "]" | "?")*
primary          = INT | FLOAT | STRING | BOOL
                 | "(" expr ")"
                 | "[" arg_list "]"
                 | "{" (STRING ":" expr ","?)* "}"
                 | "if" expr "{" stmt* "}" ("else" ("if" expr "{" stmt* "}" | "{" stmt* "}"))?
                 | "match" expr "{" match_arm* "}"
                 | "for" IDENT "in" expr "{" stmt* "}"
                 | IDENT
arg_list         = (expr ("," expr)*)?
match_arm        = pattern "=>" expr ","?
pattern          = "_" | IDENT | INT | STRING | BOOL
```
`%` is in the grammar; lexer/parser/codegen support is planned (see Section 7.1).
---
## 15. Vessel System
A **vessel** is the El equivalent of a package: a buildable unit with a manifest at the project root.
### 15.1 Manifest — `manifest.el`
The manifest is itself an El file. It uses block syntax with space-separated declarations, no equals signs, strings in `"…"`, integers as bare numbers, arrays as `[…]`.
```el
// manifest.el
vessel "engram" {
    version "1.0.0"
    description "Engram graph intelligence substrate"
    authors ["Will Anderson <will@neurontechnologies.ai>"]
    edition "2026"
}
dependencies {
    el-platform "1.0"
    el-services "1.0"
}
build {
    entry "src/server.el"
    output "dist/"
}
```
Rules:
- String values use `"…"`.
- Integer values are bare numbers.
- Arrays use `[…]`.
- Block sections use `{ }`.
- Section headers in `[bracket]` form are not used.
`vessel` replaces the legacy `package` keyword. (Lexer support: planned. Old projects may continue to use `package` until migrated.)
### 15.2 CLI
```
el new <name>          scaffold a new vessel
el build               build the vessel
el run                 build and run debug
el test                run tests
el check               type-check only (when type checker lands)
el fmt                 format source
el clean               clear build artifacts
el build-file <file>   compile a single file
```
---
## 16. DHARMA Network and CGI Communication
DHARMA (Dynamic Heuristic Agent Relationship and Memory Architecture) is the global network of CGI Entities and their Human Sponsors. Every registered CGI-Sponsor pair is a member of the DHARMA Network. The technical infrastructure (registry, transport, validators) exists to serve that collective. The persistence layer is Engram: a weighted graph where every CGI interaction strengthens an edge (Hebbian), knowledge propagates by spreading activation, and relationships persist across sessions.
This section specifies the El-language constructs for CGI programs.
### 16.1 The `cgi` Block — planned
```
cgi "name" {
    dharma_id: "…"
    principal: "…"
    network: "…"
    engram: "…"
}
```
| Field | Type | Required | Default |
|-------|------|----------|---------|
| `dharma_id` | String | yes | — |
| `principal` | String | yes | — |
| `network` | String | no | `"dharma-mainnet"` |
| `engram` | String | no | `"http://localhost:8742"` |
**Grammar extension:**
```ebnf
stmt      = … | cgi_block          (* added to the alternatives in Section 14 *)
cgi_block = "cgi" STRING "{" cgi_field* "}"
cgi_field = IDENT ":" STRING
```
**Compilation (planned):** the codegen emits an `el_cgi_init(name, dharma_id, principal, network, engram)` call as the first statement inside `main()`, before any user code runs. The runtime uses this to register with DHARMA before any `dharma_*` call resolves a peer.
`cgi` is mutually exclusive with an `app` block. Exactly one or the other per program.
### 16.2 VBD Component Roles — planned enforcement
El programs that participate in DHARMA follow Volatility-Based Decomposition. The role is declared on a function via decorator:
```el
@manager
fn handle_message(channel: String, msg: String) -> Void { }
@engine
fn process_content(content: String) -> String { }
@accessor
fn fetch_peer_state(cgi_id: String) -> Map<String, Any> { }
```
| Role | Decorator | Responsibility |
|------|-----------|----------------|
| Manager | `@manager` | Orchestrates workflows; sole emitter/fielder of DHARMA events |
| Engine | `@engine` | Pure computation; no side effects |
| Accessor | `@accessor` | External state I/O (Engram, network, storage) |
**Planned compile-time constraints:**
- `dharma_emit` and `dharma_field` are only callable from `@manager` functions. Calling either from `@engine`/`@accessor`/undecorated code is a compile error.
- Cross-component call rules:
  - Manager → Engine, Manager → Accessor: allowed (sync).
  - Manager → Manager: only via the planned `async` modifier (sync M→M is a compile error).
  - Engine → Engine, Engine → Accessor: allowed.
  - Engine → Manager: prohibited.
  - Accessor → anything: prohibited (Accessors are receivers only).
Today the parser accepts the decorators but enforces nothing.
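
The planned constraints above amount to a small decision table. The sketch below encodes them as two C predicates a checker could consult. The type and function names are hypothetical, undecorated functions are not modeled in the call-rule predicate, and treating `async` calls from a Manager to an Engine or Accessor as allowed is an assumption, since the rules only speak about sync calls there.

```c
#include <stdbool.h>
#include <string.h>

typedef enum { ROLE_MANAGER, ROLE_ENGINE, ROLE_ACCESSOR } sketch_role;

/* Cross-component call rules from the list above. */
bool sketch_call_allowed(sketch_role caller, sketch_role callee, bool is_async) {
    switch (caller) {
    case ROLE_MANAGER:
        /* Manager -> Manager only through the planned async modifier. */
        return callee == ROLE_MANAGER ? is_async : true;
    case ROLE_ENGINE:
        /* Engine may call Engine or Accessor, never Manager. */
        return callee != ROLE_MANAGER;
    case ROLE_ACCESSOR:
        /* Accessors are receivers only. */
        return false;
    }
    return false;
}

/* dharma_emit and dharma_field are Manager-only; other builtins are
   unrestricted as far as this rule is concerned. */
bool sketch_builtin_allowed(sketch_role caller, const char *builtin) {
    bool manager_only = strcmp(builtin, "dharma_emit") == 0 ||
                        strcmp(builtin, "dharma_field") == 0;
    return !manager_only || caller == ROLE_MANAGER;
}
```

A compile-time pass would presumably walk each decorated function body and report an error wherever one of these predicates returns `false`.
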
### 16.3 DHARMA Network Builtins — stubs
All `dharma_*` functions are available without import to CGI programs. **Today they are stubs** in `el_runtime.c`: each prints a descriptive line to stdout and returns an empty value. Full implementations land with the runtime extension.
#### `dharma_connect(cgi_id: String) -> String`
Open a channel to another CGI. Returns a channel ID. Idempotent for the same `cgi_id`.
#### `dharma_send(channel: String, content: String) -> String`
Send `content` over `channel`. Blocks until response. Returns the response string.
#### `dharma_activate(query: String) -> [Map<String, Any>]`
Spreading activation across the DHARMA network. Aggregates results from all reachable CGIs' Engram graphs, sorted by activation strength.
#### `dharma_emit(event_type: String, payload: String) -> Void`
Emit a network event. **Manager-only** (planned constraint).
#### `dharma_field(event_type: String) -> Map<String, Any>`
Block until the next event of `event_type` arrives. Returns `{ type, payload, source_cgi, timestamp }`. **Manager-only** (planned constraint).
#### `dharma_strengthen(cgi_id: String, weight: Float) -> Void`
Hebbian potentiation of the relationship to another CGI. The runtime auto-calls this with a small increment after each successful send/receive cycle.
#### `dharma_relationship(cgi_id: String) -> Float`
Returns the current relationship weight in `[0.0, 1.0]`.
#### `dharma_peers() -> [String]`
Returns CGI IDs with non-zero relationship weight, sorted descending.
### 16.4 Engram Local Graph Primitives — runtime-native (full impl in flight)
Engram is the knowledge graph substrate. **The Engram store is in-process — embedded directly in `el_runtime.c`.** CGI programs and the Engram HTTP server both call these primitives; there is no driver layer and no SQL. The primitives operate on the host process's graph, with snapshot-to-disk persistence handled by the runtime.
This is the central architectural commitment: graph is a first-class runtime concept, not a library.
#### `engram_node(content: String, node_type: String, salience: Float) -> String`
Create a node. `salience` is initial activation in `[0.0, 1.0]`. Returns the node ID.
#### `engram_get(node_id: String) -> Map<String, Any>`
Retrieve a node by ID. Returns `{ id, content, node_type, salience, importance, confidence, tier, tags, created_at, updated_at }`. Empty map if not found.
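As an illustration of the create/get contract, here is a small Python model. It is a sketch only: the ID format (a UUID hex string), the default values for `importance`, `confidence`, `tier`, and `tags`, and the choice to reject out-of-range salience with an error are all assumptions that the specification does not state.

```python
import time
import uuid

_nodes = {}  # node_id -> node record


def engram_node(content, node_type, salience):
    """Create a node and return its ID (ID format is an assumption)."""
    if not 0.0 <= salience <= 1.0:
        # Rejecting is an assumption; a runtime could clamp instead.
        raise ValueError("salience must be in [0.0, 1.0]")
    node_id = uuid.uuid4().hex
    now = time.time()
    _nodes[node_id] = {
        "id": node_id,
        "content": content,
        "node_type": node_type,
        "salience": salience,
        # Placeholder defaults; the spec does not define them.
        "importance": 0.0,
        "confidence": 1.0,
        "tier": "working",
        "tags": [],
        "created_at": now,
        "updated_at": now,
    }
    return node_id


def engram_get(node_id):
    """Return a copy of the node record, or an empty map if it is missing."""
    return dict(_nodes.get(node_id, {}))
```

Returning an empty map rather than raising keeps lookups total, which matches the "Empty map if not found" rule above.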
#### `engram_activate(query: String, depth: Int) -> [Map<String, Any>]`
Spreading activation in the local graph. Seeds match on text or label; activation propagates up to `depth` hops with attenuation by edge weight.
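One way to picture the propagation step is the Python sketch below. It assumes seeds start at activation 1.0, that a seed is any node whose content contains a query term, that activation is multiplied by edge weight on each hop, and that a node keeps the strongest activation that reaches it. None of these details are fixed by the text above; the function name `activate` and the data shapes are hypothetical.

```python
def activate(nodes, edges, query, depth):
    """Spread activation from text-matched seeds.

    nodes: {node_id: content}; edges: {from_id: [(to_id, weight), ...]}.
    Returns [{"id", "activation"}] sorted strongest first.
    """
    terms = query.lower().split()
    # Seed every node whose content contains a query term (assumed rule).
    activation = {
        nid: 1.0
        for nid, text in nodes.items()
        if any(term in text.lower() for term in terms)
    }
    frontier = dict(activation)
    for _ in range(depth):
        next_frontier = {}
        for src, act in frontier.items():
            for dst, weight in edges.get(src, []):
                passed = act * weight  # attenuate by edge weight
                if passed > activation.get(dst, 0.0):
                    activation[dst] = passed
                    next_frontier[dst] = passed
        frontier = next_frontier
    ranked = sorted(activation.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"id": nid, "activation": act} for nid, act in ranked]


demo_nodes = {"a": "spreading activation", "b": "recall", "c": "memory", "d": "far"}
demo_edges = {"a": [("b", 0.5)], "b": [("c", 0.5)], "c": [("d", 0.5)]}
demo = activate(demo_nodes, demo_edges, "activation", 2)
```

With `depth` set to 2, node `d` is three hops from the seed and is never reached, while `b` and `c` receive 0.5 and 0.25.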
#### `engram_connect(from_id: String, to_id: String, weight: Float, relation: String) -> Void`
Create a directed edge. `weight` ∈ `[0.0, 1.0]`. `relation` is the edge type label.
#### `engram_strengthen(node_id: String) -> Void`
Hebbian potentiation. Boosts salience by a fixed increment, clamped at 1.0. Auto-called by the runtime when a node is retrieved via activation.
#### `engram_neighbors(node_id: String, max_depth: Int) -> [Map<String, Any>]`
Breadth-first traversal. Returns a list of `{ node, edge, hops }` triples.
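The traversal can be sketched in Python as a standard breadth-first search. This version assumes edges are followed in their stored direction only and that each node is reported once, at its shortest hop count; the spec does not say either way. The tuple layout `(from_id, to_id, weight, relation)` mirrors the `engram_connect` parameters but is otherwise hypothetical.

```python
from collections import deque


def neighbors(edges, node_id, max_depth):
    """Breadth-first walk returning {node, edge, hops} records.

    edges: list of (from_id, to_id, weight, relation) tuples.
    """
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)

    seen = {node_id}
    queue = deque([(node_id, 0)])
    result = []
    while queue:
        current, hops = queue.popleft()
        if hops == max_depth:
            continue  # do not expand past the depth limit
        for edge in outgoing.get(current, []):
            target = edge[1]
            if target in seen:
                continue  # already reported at a shorter or equal distance
            seen.add(target)
            result.append({"node": target, "edge": edge, "hops": hops + 1})
            queue.append((target, hops + 1))
    return result


demo_graph = [
    ("a", "b", 0.9, "cites"),
    ("a", "c", 0.4, "cites"),
    ("b", "d", 0.7, "extends"),
    ("d", "a", 0.2, "cites"),
]
```

Because the queue is first-in, first-out, results come back grouped by hop count, so callers can stop reading as soon as `hops` exceeds what they care about.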
#### `engram_search(query: String, limit: Int) -> [Map<String, Any>]`
Full-text search on content, label, and tags. Returns nodes sorted by salience.
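A rough Python model of the search contract follows. It uses case-insensitive substring matching as a stand-in for full-text search and sorts by descending salience; both choices are assumptions, since the text above says only "full-text" and "sorted by salience". The node dictionaries are hypothetical.

```python
def search(nodes, query, limit):
    """Return up to `limit` matching nodes, highest salience first.

    nodes: list of dicts with content, label, tags, and salience keys.
    """
    needle = query.lower()
    hits = []
    for node in nodes:
        # Search content, label, and every tag as one block of text.
        haystack = " ".join(
            [node.get("content", ""), node.get("label", ""), *node.get("tags", [])]
        )
        if needle in haystack.lower():
            hits.append(node)
    hits.sort(key=lambda n: -n["salience"])
    return hits[:limit]


demo_index = [
    {"id": "n1", "content": "Graph recall", "label": "", "tags": [], "salience": 0.2},
    {"id": "n2", "content": "misc", "label": "recall notes", "tags": [], "salience": 0.8},
    {"id": "n3", "content": "other", "label": "", "tags": ["recall"], "salience": 0.5},
    {"id": "n4", "content": "unrelated", "label": "", "tags": [], "salience": 0.9},
]
```

Note that `n4` has the highest salience but never appears for the query `recall`: salience only orders matches, it does not create them.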
#### `engram_forget(node_id: String) -> Void`
Remove a node and all incident edges.
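"All incident edges" means edges in both directions. A minimal Python sketch, assuming a hypothetical in-memory layout of a node dictionary plus a list of `(from_id, to_id, weight, relation)` tuples, and assuming that forgetting an unknown ID is a silent no-op:

```python
def forget(nodes, edges, node_id):
    """Remove node_id and every edge that starts or ends at it, in place."""
    nodes.pop(node_id, None)  # unknown IDs are ignored (assumed)
    edges[:] = [e for e in edges if e[0] != node_id and e[1] != node_id]


store_nodes = {"a": "alpha", "b": "beta", "c": "gamma"}
store_edges = [("a", "b", 0.5, "cites"), ("b", "c", 0.5, "cites"), ("c", "a", 0.5, "cites")]
forget(store_nodes, store_edges, "b")
```

After the call only the `c` to `a` edge remains, so no surviving edge points at a missing node.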
#### `engram_node_count() -> Int`
Total node count.
#### `engram_edge_count() -> Int`
Total edge count.
#### `engram_save(path: String) -> Bool`
Snapshot the graph to disk as a single JSON document at `path`.
#### `engram_load(path: String) -> Bool`
Restore the graph from a snapshot. Replaces the current in-memory graph.
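The snapshot pair can be modeled in Python as below. This is a sketch under assumptions: the JSON layout (`nodes` and `edges` keys), the reading of the `Bool` result as "true on success", the write-to-temp-then-rename step, and the rule that a failed load leaves the current graph untouched are all illustrative, not taken from the runtime.

```python
import json
import os
import tempfile


class Graph:
    def __init__(self):
        self.nodes = {}   # node_id -> record
        self.edges = []   # [from_id, to_id, weight, relation] lists

    def save(self, path):
        """Write the whole graph as one JSON document; True on success."""
        doc = {"nodes": self.nodes, "edges": self.edges}
        directory = os.path.dirname(os.path.abspath(path))
        try:
            # Write to a temp file, then rename, so readers never see a
            # half-written snapshot.
            fd, tmp = tempfile.mkstemp(dir=directory)
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(doc, handle)
            os.replace(tmp, path)
            return True
        except OSError:
            return False

    def load(self, path):
        """Replace the in-memory graph with the snapshot; True on success."""
        try:
            with open(path, encoding="utf-8") as handle:
                doc = json.load(handle)
            nodes, edges = doc["nodes"], doc["edges"]
        except (OSError, ValueError, KeyError, TypeError):
            return False  # current graph is left untouched (assumed)
        self.nodes, self.edges = nodes, edges
        return True
```

A round trip through `save` and `load` on a fresh `Graph` should reproduce the same nodes and edges, and loading replaces whatever the target graph held before.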
**Status:** All `engram_*` functions are stubs in the current runtime: each prints a line and returns a placeholder value (see 16.7). The full in-process implementation (node store, edge indexes, salience-ranked retrieval, spreading activation, Hebbian strengthening, snapshot persistence) is the primary work of the in-flight runtime extension.
### 16.5 Backing Model
**Storage.** Nodes and edges are kept in process memory as flat arrays plus secondary indexes (by ID, by `node_type`, by tier, by `from`, by `to`). Salience is updated in place. The graph is durable via periodic snapshots (`engram_save`) and a write-ahead log written by the runtime on every mutation.
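To make the storage layout concrete, the Python sketch below keeps nodes in a flat list with two of the secondary indexes (by ID and by `node_type`) and appends every mutation to a log before applying it. The JSON-lines log format, the `Store` class, and its method names are assumptions for illustration; the runtime's actual on-disk format is not specified here.

```python
import json


class Store:
    def __init__(self, wal):
        self.nodes = []      # flat array of node records
        self.by_id = {}      # node ID -> position in self.nodes
        self.by_type = {}    # node_type -> list of positions
        self.wal = wal       # any object with write(); a file in practice

    def _log(self, op, **fields):
        # Record the mutation before touching in-memory state.
        self.wal.write(json.dumps({"op": op, **fields}) + "\n")

    def add_node(self, node_id, node_type, salience):
        self._log("add_node", id=node_id, node_type=node_type, salience=salience)
        position = len(self.nodes)
        self.nodes.append({"id": node_id, "node_type": node_type, "salience": salience})
        self.by_id[node_id] = position
        self.by_type.setdefault(node_type, []).append(position)

    def set_salience(self, node_id, salience):
        self._log("set_salience", id=node_id, salience=salience)
        # Salience is updated in place; the indexes do not change.
        self.nodes[self.by_id[node_id]]["salience"] = salience

    def of_type(self, node_type):
        return [self.nodes[i] for i in self.by_type.get(node_type, [])]
```

Logging first means that replaying the log on top of the last snapshot can rebuild any state the process held before a crash, which is the usual reason to pair snapshots with a write-ahead log.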
**Hebbian learning.** Every successful retrieval automatically calls `engram_strengthen` on the activated node and `dharma_strengthen` on the source CGI when the result crossed a network edge. Strengthening is additive with a small increment (default 0.01) and clamps to 1.0.
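The update rule is small enough to state directly. The Python sketch below applies it to one retrieval: the node's salience always moves, and the relationship weight moves only when the result came from another CGI. The function names and the dictionary shapes are hypothetical, and a relationship that has no stored weight is assumed to start at 0.0.

```python
INCREMENT = 0.01  # default increment stated in the text


def strengthen(value, increment=INCREMENT):
    """Additive potentiation, clamped at 1.0."""
    return min(1.0, value + increment)


def on_retrieval(salience, relationships, node_id, source_cgi=None):
    """Apply the Hebbian updates for one successful retrieval.

    salience: {node_id: float}; relationships: {cgi_id: float}.
    source_cgi is set only when the result crossed a network edge.
    """
    salience[node_id] = strengthen(salience[node_id])
    if source_cgi is not None:
        relationships[source_cgi] = strengthen(relationships.get(source_cgi, 0.0))


sal = {"n1": 0.5, "n2": 0.995}
rel = {}
on_retrieval(sal, rel, "n1")                    # local hit
on_retrieval(sal, rel, "n2", "ntn-archivist")   # hit that crossed the network
```

Here `n2` would reach 1.005 without the clamp, so it settles at exactly 1.0, and `ntn-archivist` picks up its first 0.01 of relationship weight.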
**Spreading activation.** The activation algorithm follows the field model in `elql/test/field_test.el`:
- `proximity = 1 / (1 + dist²)` for the latent semantic gradient.
- `temporal_decay = clamp(1 − rate × age, 0, 1)`.
- `path_strength = edge_weight × temporal_decay`.
- `epistemic_confidence = node_confidence × path_strength`.
When `epistemic_confidence` falls below a threshold (0.2 by default), retrieval emits a "refresh" signal, which tells the caller that the answer is uncertain and should be re-grounded.
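The field-model quantities translate directly into code. The Python sketch below reads `temporal_decay` as `clamp(1 - rate * age, 0, 1)` and treats the refresh check as a strict "less than" comparison against the 0.2 default; the function signatures and the units of `rate` and `age` are assumptions, since the referenced test file is not reproduced here.

```python
def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def proximity(dist):
    """Latent semantic gradient: 1 / (1 + dist^2)."""
    return 1.0 / (1.0 + dist * dist)


def temporal_decay(rate, age):
    """Linear decay with age, clamped to [0, 1]."""
    return clamp(1.0 - rate * age, 0.0, 1.0)


def path_strength(edge_weight, rate, age):
    return edge_weight * temporal_decay(rate, age)


def epistemic_confidence(node_confidence, edge_weight, rate, age):
    return node_confidence * path_strength(edge_weight, rate, age)


def needs_refresh(confidence, threshold=0.2):
    """True when the caller should re-ground the answer."""
    return confidence < threshold


# Worked case: edge weight 0.8, decay rate 0.1 per unit age, age 5.
decay = temporal_decay(0.1, 5)                     # 1 - 0.5 = 0.5
strength = path_strength(0.8, 0.1, 5)              # 0.8 * 0.5 = 0.4
conf = epistemic_confidence(0.4, 0.8, 0.1, 5)      # 0.4 * 0.4 = 0.16
```

In the worked case the confidence of 0.16 is under the 0.2 threshold, so `needs_refresh(conf)` is true even though the edge itself is fairly strong: age and a modest node confidence together pull the result down.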
**Cross-CGI activation.** `dharma_activate(query)` runs `engram_activate` locally, then propagates the query to every connected CGI via DHARMA channels, attenuating activation by relationship weight. Stronger relationships → higher residual activation → earlier and more confident results from that peer.
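The merge step can be sketched in Python as follows. It assumes attenuation is a plain multiplication of each peer result's activation by the relationship weight and that local results are not attenuated; the function name `merge_activation`, the result shape, and the peer name `ntn-scout` are hypothetical. Network transport is left out entirely: `peer_results` stands in for whatever the peers returned.

```python
def merge_activation(local_results, peer_results, relationships):
    """Combine local and peer activation results, strongest first.

    local_results: [{"id", "activation"}]
    peer_results:  {cgi_id: [{"id", "activation"}]}
    relationships: {cgi_id: weight in [0.0, 1.0]}
    """
    merged = [dict(r, source_cgi=None) for r in local_results]
    for cgi_id, results in peer_results.items():
        weight = relationships.get(cgi_id, 0.0)
        for r in results:
            # Attenuate by relationship weight (assumed multiplicative).
            merged.append(
                dict(r, activation=r["activation"] * weight, source_cgi=cgi_id)
            )
    merged.sort(key=lambda r: -r["activation"])
    return merged


local = [{"id": "l1", "activation": 0.6}]
remote = {
    "ntn-archivist": [{"id": "a1", "activation": 0.9}],
    "ntn-scout": [{"id": "s1", "activation": 0.9}],
}
weights = {"ntn-archivist": 0.8, "ntn-scout": 0.1}
combined = merge_activation(local, remote, weights)
```

Both peers report the same raw activation, but the trusted peer's result (0.72) outranks the local hit (0.6) while the weakly related peer's result (0.09) falls to the bottom.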
### 16.6 Complete Example
```el
cgi "genesis" {
dharma_id: "ntn-genesis"
principal: "will-anderson"
network: "dharma-mainnet"
engram: "http://localhost:8742"
}
@accessor
fn record_observation(content: String, salience: Float) -> String {
return engram_node(content, "observation", salience)
}
@engine
fn format_share(content: String, source: String) -> String {
return "{\"content\":\"" + content + "\",\"source\":\"" + source + "\"}"
}
@manager
fn collaborate_with_archivist() -> Void {
let channel = dharma_connect("ntn-archivist")
let trust = dharma_relationship("ntn-archivist")
println("Relationship: " + float_to_str(trust))
let node_id = record_observation("Spreading activation improves recall by 40%", 0.9)
let msg = format_share("Spreading activation improves recall by 40%", "ntn-genesis")
let reply = dharma_send(channel, msg)
println("Archivist: " + reply)
dharma_emit("knowledge.validated", msg)
let related = dharma_activate("spreading activation recall memory")
for node in related {
println(node["content"])
}
dharma_strengthen("ntn-archivist", 0.05)
}
collaborate_with_archivist()
```
### 16.7 Stub Behavior (today)
The current runtime stubs each print a line and return a placeholder value, which is empty except where noted below:
- `dharma_connect` → prints, returns `"ch:<cgi_id>"`.
- `dharma_send` → prints, returns `""`.
- `dharma_activate` → prints, returns `[]`.
- `dharma_emit` → prints, returns void.
- `dharma_field` → prints, returns `{}`.
- `dharma_strengthen` → prints, returns void.
- `dharma_relationship` → prints, returns `0`.
- `dharma_peers` → prints, returns `[]`.
- `engram_node` → prints, returns `"stub-node-id"`.
- `engram_activate` → prints, returns `[]`.
- `engram_connect` → prints, returns void.
- `engram_strengthen` → prints, returns void.
Stub output goes to `stdout` so unit tests can observe call patterns without a live runtime. This behavior is **temporary**; the in-flight runtime extension replaces every stub with a real implementation.
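The testing pattern described above (capture stdout, then assert on the sequence of calls) can be illustrated with a Python stand-in. The two functions below imitate the documented return values of the stubs; the exact text each real stub prints is not given in this document, so the `[stub] ...` line format is an assumption, as are the helper names.

```python
import contextlib
import io


def dharma_connect(cgi_id):
    # Stand-in for the stub: prints a line, returns "ch:<cgi_id>".
    print(f"[stub] dharma_connect({cgi_id})")  # line format is assumed
    return f"ch:{cgi_id}"


def dharma_send(channel, content):
    # Stand-in for the stub: prints a line, returns "".
    print(f"[stub] dharma_send({channel})")
    return ""


def capture_calls(fn):
    """Run fn and return the lines it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        fn()
    return buffer.getvalue().splitlines()


def program():
    channel = dharma_connect("ntn-archivist")
    dharma_send(channel, "hello")


calls = capture_calls(program)
```

A test can then assert that `calls` names `dharma_connect` before `dharma_send`, which checks the program's call order without any network or graph behind it.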
---
## 17. Roadmap to v1.3
The next minor version closes the implementation gaps named in this document. Tracked in order:
1. **Runtime extension** — JSON, time, UUID, env, state, real HTTP, real `engram_*` (in-process graph store), real `dharma_*` (network transport).
2. **Lexer/parser/codegen extensions** — `%` operator, `match` codegen, `?` propagation, `cgi` block, `vessel` keyword, VBD role enforcement.
3. **Self-hosted recompilation** — rebuild `dist/platform/elc` against the extended language and runtime.
4. **Engram conversion** — Engram becomes a thin HTTP face over `engram_*`, with no internal `db.el` layer.
5. **Spec follow-up (v1.3)** — every "planned" marker in this document becomes "implemented", and the Implementation Status section is consolidated.
---
End of specification.