Commit Graph

5 Commits

Author SHA1 Message Date
Will Anderson 282df712a8 add runtime/test.el — El test framework with assertions and runner
Provides assert_true/false, assert_eq, assert_int_eq, assert_neq,
assert_contains, assert_starts_with, assert_ends_with, and fail.
Test cases are registered by name+fn_name, executed sequentially via
the thread/dlsym dispatch mechanism, with results tracked in state_.
Includes tests/runtime/string_test.el covering all 23 string.el exports.
2026-05-03 15:49:01 -05:00
Will Anderson 3a83b6eb80 add text-processing primitives to el runtime
24 new functions covering counting (str_count, str_count_chars,
str_count_bytes, str_count_lines, str_count_words, str_count_letters,
str_count_digits), finding (str_index_of_all, str_last_index_of,
str_find_chars), transforming (str_repeat, str_reverse,
str_strip_prefix/suffix/chars, str_lstrip, str_rstrip), character
classification (is_letter, is_digit, is_alphanumeric, is_whitespace,
is_punctuation, is_uppercase, is_lowercase), and splitting/joining
(str_split_lines, str_split_chars, str_split_n, str_join).

Phase 1 is byte-level + ASCII character classes. Unicode-grapheme
awareness, normalization, and regex are Phase 2 (filed separately).

Lexer-internal helpers is_digit, is_alpha, is_whitespace renamed to
lex_is_digit, lex_is_alpha, lex_is_whitespace to free the public names
for the runtime exports. The El compiler's lexer.el and the bundled
elc-combined.el both updated.

Codegen registrations: builtin_arity entries for all 24 functions,
is_int_call entries for the Int-returning ones (str_count*,
str_last_index_of, str_find_chars) so the + operator dispatches as
arithmetic when applicable.

Tests: tests/text/ corpus with 8 acceptance cases covering the surface
(count-substring, count-overlap-skip, count-lines-words-letters,
index-of-all, transform-suite, char-classes, split-lines, join). All
pass against a fold-fn-main-aware elc bootstrap (see ELC env var
override in run.sh).

Self-host fixed point: elc-combined.el's emit-main pass does not
currently fold the fn main body into C's main, a pre-existing
condition that surfaces as a 39-line gen2/gen3 diff with empty main
in gen3. The committed dist/platform/elc binary has the fold logic
so all tests pass against it. Filing the elc-combined fold-fn-main
fix separately. This commit does not introduce new self-host drift.
2026-05-02 13:37:30 -05:00
Will Anderson ed564b6dda add Calendar + CalendarTime + Rhythm + LocalDate/Time as first-class
Phase 1.5 of time-system. Calendar is pluggable: EarthCalendar
(IANA zones, DST, Gregorian) is the default; MarsCalendar,
CycleCalendar(period), NoCycleCalendar handle non-Earth cases.

Rhythm abstracts recurrence from clock units - rhythm_cycle_phase(0.5)
means "midpoint of cycle" whether the cycle is 24 hours on Earth or
30 hours on a station or 300 years on a long-cycle world.

Phase 1 (Instant + Duration) unchanged. EarthCalendar(zone_local())
is the user-facing default; nobody who doesn't care about non-Earth
calendars sees the abstraction.

Self-host fixed point holds at 6339 lines.
Snapshot tagged at dist/platform/elc.20260502-1321-self-host.

Phase 2 (scheduling primitives every/after/at) lands next, now with
Calendar-aware grounding instead of Earth-time hardcoded.

Backlog: bl-297f66d8 (supersedes bl-b29b3e60)
2026-05-02 13:21:43 -05:00
Will Anderson e7c2fd02df add Instant + Duration as first-class types in El
Postfix literal syntax (30.seconds, 1.hour). Type-checked arithmetic
(Instant + Duration -> Instant; Instant + Instant fails). TTL helpers
backed by typed Duration. sleep(Duration) replaces ambiguous sleep(Int).

Self-host fixed point holds at 5797 lines.
Snapshot tagged at dist/platform/elc.20260502-1256-self-host.

Phase 2 (every/after/at scheduling primitives) lands separately.
Backlog: bl-937e9c30
2026-05-02 12:57:13 -05:00
Will Anderson af480f6266 add el_html_sanitize allowlist runtime primitive
Replaces the need for product-level denylist sanitizers. Small
state-machine parser; tag-and-attribute allowlist passed as JSON;
URL scheme validation on href/src attrs (http, https, mailto,
fragment, relative); whole-subtree drop for script/style/iframe/
object/embed/form (plus rarer media containers). No comment-
wrapping (was fragile to comment-injection bypass via a literal
--> inside an attacker-supplied attribute value).

Also picks up the codegen and parser changes for first-class
Instant/Duration types (postfix-literal time values, typed binop
dispatch) that were sitting in tree alongside this work.

Test corpus at tests/html_sanitizer/ covers the live attacker
probes (script, iframe, form, javascript:, about:, data:, img
onerror, onclick) plus structural attacks (comment-injection
bypass, tab-in-scheme bypass, encoded payloads, malformed input,
empty input, plain text). 29 cases, all green.

Self-host fixed point holds at 5720 lines via the canonical
el-compiler/src/compiler.el entry. Snapshot tagged at
dist/platform/elc.20260502-1249-self-host.

Backlog: bl-dc55ae07
2026-05-02 12:49:41 -05:00