298 Commits

Author SHA1 Message Date
will.anderson cd1c6737e8 Replace k3s with direct soul-demo watchdog in Cloud Run container
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 2m11s
Cloud Run gen2 doesn't provide eth0 with a unicast IP, causing k3s flannel
to crash on every container start. k3s was also wrong architecture for
Cloud Run (HPA inside a container, k3s overhead for one process).

Changes:
- entrypoint.sh: replace k3s server with a bash watchdog loop that starts
  soul-demo directly and restarts it on crash (3s backoff)
- Dockerfile.stage: remove k3s binary, soul-demo-image.tar, k3s manifests
  and their associated dirs/envvars; keep soul-demo binary only
- stage.yaml: remove 'Download k3s binary' step; rename and simplify
  soul-demo build step to compile binary only (no OCI image/tar)
- dev.yaml: update soul-demo placeholder step (binary not tar)
- manifest.el: document HAVE_CURL requirement since manifest.el has no
  c_flags/link_flags directive support
2026-05-10 19:46:35 -05:00
will.anderson f27fc2622c Merge pull request 'Fix envelope truncation in http_response when called after fs_read' (#55) from fix/checkout-auth-reveal into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 2m46s
2026-05-11 00:23:28 +00:00
will.anderson 0433fe8c0f Fix http_response() truncating envelope via stale _tl_fs_read_len
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 2m22s
http_response() builds a JSON envelope wrapping the body. If the caller
previously called fs_read() (which sets _tl_fs_read_len = file_size),
http_worker used that stale value as the response copy length — truncating
the larger envelope to the original file size before it reached
http_send_response. The truncated envelope had the body field cut mid-string;
jp_parse_string_raw failed, env_body = "", and http_send_all sent file_size
bytes of garbage past the empty string.

Fix: reset _tl_fs_read_len = 0 at the start of http_response(). The hint
was set for the raw file bytes; the envelope is a new string and must use
strlen() for its length.
2026-05-10 19:23:10 -05:00
will.anderson 9da4d50883 Merge pull request 'Fix JS files served as JSON envelope (checkout/Stripe/auth all broken)' (#53) from fix/checkout-auth-reveal into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 2m4s
Fix JS files served as JSON envelope (checkout/Stripe/auth all broken)
2026-05-10 22:34:32 +00:00
will.anderson c99ca82302 Fix JS files served as raw JSON envelope instead of JavaScript
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 1m36s
http_parse_envelope() called json_parse() on the entire response envelope
(~47KB when body is obfuscated JS). The parser failed on large/complex content,
so is_envelope=0 and the raw JSON was sent — browsers got {"el_http_response":1,...}
instead of executable JavaScript, silently breaking all client-side code.

Fix: replace json_parse-of-full-envelope with a direct field scanner:
- "status" extracted via strtol
- "headers" object extracted via brace-depth scan, then json_parse only that
  small substring (always safe — headers are simple k/v string pairs < 1KB)
- "body" string extracted via jp_parse_string_raw — no intermediate allocation

Also: /js/* route now returns http_response(200, js_headers_json(), content)
with explicit Content-Type: application/javascript so the browser doesn't
apply the json-heuristic (obfuscated JS starting with '[' was detected as JSON,
which with X-Content-Type-Options: nosniff blocks script execution).
2026-05-10 17:32:45 -05:00
will.anderson e292453905 Merge pull request 'Fix checkout auth: free-success panel + Stripe auto-init for paid plans' (#51) from fix/checkout-auth-reveal into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 2m12s
Fix checkout auth: free-success panel + Stripe auto-init for paid plans
2026-05-10 22:00:55 +00:00
will.anderson 0263e51407 Fix checkout: show free-success when logged in; init Stripe without auth on paid plans
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 1m47s
- revealPaymentForm: for free plan, show #free-success panel (was doing nothing,
  leaving page blank when user already had a Supabase session)
- checkExistingSession: for paid plans with no session, call initStripe immediately —
  auth is optional, the payment form shouldn't wait indefinitely
- Guard _formRevealed: prevent double-call from handleAuthRedirect + checkExistingSession
2026-05-10 16:59:51 -05:00
will.anderson b4935ed880 Merge pull request 'Fix http handler not found: pre-register via constructor' (#49) from fix/entrypoint-k3s-nonblocking into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 2m29s
Merge PR #49: Fix http handler not found
2026-05-10 18:36:47 +00:00
will.anderson 9a6f0defd1 Fix http handler not found: pre-register via el_runtime_register_handler
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 1m54s
elb links without -rdynamic so dlsym(RTLD_DEFAULT, "handle_request")
returns NULL at runtime. http_set_handler stores the name as active but
never finds a function pointer, causing every request to return
"el-runtime: no http handler registered" even after http_serve is called.

Fix: add a __attribute__((constructor)) in web_stubs.c that calls
el_runtime_register_handler("handle_request", handle_request) directly,
bypassing dlsym entirely. The handler is in the registry before main()
runs, so http_lookup_active() finds it on the first request.
2026-05-10 13:36:05 -05:00
will.anderson ee0147869e Merge pull request 'Fix GLIBC_2.38 mismatch: switch base image to ubuntu:24.04' (#47) from fix/entrypoint-k3s-nonblocking into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 2m23s
Fix GLIBC_2.38 mismatch: switch base image to ubuntu:24.04
2026-05-10 18:01:57 +00:00
will.anderson 740382fca1 Fix GLIBC_2.38 mismatch: switch base image to ubuntu:24.04
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 2m13s
CI runner (Ubuntu 24.04, glibc 2.39) produces binaries that require
GLIBC_2.38+. debian:bookworm-slim ships glibc 2.36 which doesn't have
the GLIBC_2.38 versioned symbols — container crashes immediately with
"version GLIBC_2.38 not found". Switch to ubuntu:24.04 (glibc 2.39)
to match the build environment. Also updates libcurl4/libssl3 package
names to their Ubuntu 24.04 canonical t64 forms.
2026-05-10 13:01:38 -05:00
will.anderson 25f6631049 Merge pull request 'Non-blocking entrypoint: start neuron-web before k3s is ready' (#45) from fix/entrypoint-k3s-nonblocking into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 2m43s
Non-blocking entrypoint: start neuron-web before k3s is ready
2026-05-10 17:54:54 +00:00
will.anderson 180acc92a0 Non-blocking entrypoint: start neuron-web before k3s is ready
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 2m11s
k3s fails to start in Cloud Run gen2 with "unable to select an IP from
default routes" because Cloud Run's network sandbox doesn't expose a
standard default route for k3s to detect. The blocking wait on k3s
prevented neuron-web from ever binding port 8080, causing Cloud Run's
startup probe to time out and terminate the container.

Two changes:
1. Add --flannel-iface=eth0 so k3s pins to Cloud Run's eth0 rather than
   walking the routing table to detect a default-route interface.
2. Start neuron-web immediately after launching k3s in background.
   soul-demo becomes available asynchronously; neuron-web handles it
   being temporarily unavailable gracefully.
2026-05-10 12:54:26 -05:00
will.anderson 689062fc87 Single-stage Dockerfile.stage: pre-download k3s on host runner
Dev — Build & local smoke test / build-smoke (push) Failing after 1m20s
2026-05-10 16:26:46 +00:00
will.anderson e6fd110073 Single-stage Dockerfile.stage: pre-download k3s on host runner
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 1m37s
The multi-stage Docker builder (which installed build-essential, compiled
soul-demo, and downloaded k3s inside Docker) was causing RWLayer nil
corruption on the runner's overlay2 driver. Every affected run failed at
apt-get install in the runtime stage after the builder stage completed.

Fix: move k3s download to the CI host runner (same pattern as soul-demo
compilation, which now passes reliably). Dockerfile.stage becomes single-
stage: no apt-get in a builder stage, no network downloads, just COPY of
pre-built binaries. Also adds --no-cache to the main docker build for
consistency with the soul-demo step fix.
2026-05-10 11:26:23 -05:00
will.anderson 5e1344af42 Merge pull request 'Fix soul-demo Docker build: --no-cache to avoid corrupted overlay2 layers' (#41) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 4m42s
Fix soul-demo Docker build: --no-cache to avoid corrupted overlay2 layers
2026-05-10 15:57:13 +00:00
will.anderson d8acb126f5 Fix soul-demo Docker build: --no-cache to avoid corrupted overlay2 layers
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 27s
2026-05-10 10:56:44 -05:00
will.anderson 87ac67a70e Merge pull request 'Selective Docker prune (preserve build cache) + k3s retry' (#39) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 4m10s
2026-05-10 02:22:08 +00:00
will.anderson f838e0c8a7 Selective Docker prune to preserve build cache; retry k3s download
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 4m2s
2026-05-09 21:21:52 -05:00
will.anderson e520ba98ca Merge pull request 'Make docker prune non-fatal (concurrent prune race)' (#38) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 14m45s
2026-05-10 01:57:30 +00:00
will.anderson 21ecbca2e6 Make docker prune non-fatal to handle concurrent prune from parallel CI jobs
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 15m15s
2026-05-09 20:57:14 -05:00
will.anderson 38c92e5fc7 Merge pull request 'Fix CI disk exhaustion: docker system prune at job start' (#37) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 24s
2026-05-10 01:55:41 +00:00
will.anderson cee0328db5 Add docker system prune at job start to prevent disk exhaustion
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 17m6s
2026-05-09 20:55:24 -05:00
will.anderson bbfc7cebf7 Merge pull request 'Move soul-demo build after JS compile in stage pipeline' (#36) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 3m13s
2026-05-10 01:50:17 +00:00
will.anderson 4a710ff294 Move soul-demo build after JS compile to prevent Docker memory pressure on elc
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 3m7s
2026-05-09 20:50:01 -05:00
will.anderson f1b5e1bac8 Merge pull request 'Add diagnostics to stage JS compile step' (#34) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 4m48s
2026-05-10 01:27:20 +00:00
will.anderson b4438fec43 Add diagnostics to stage JS compile step to expose silent failure
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 2m21s
2026-05-09 20:27:05 -05:00
will.anderson aa040d1412 Merge pull request 'Fix soul-demo compile: add -I runtime/ include path' (#32) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 4m5s
2026-05-10 01:02:36 +00:00
will.anderson d5820c43b0 Fix soul-demo compile: add -I runtime/ for el_runtime.h include path
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 1m42s
2026-05-09 20:02:22 -05:00
will.anderson a1144605f3 Merge pull request 'Build soul-demo image tar before Docker build in stage' (#30) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 4m28s
2026-05-10 00:53:48 +00:00
will.anderson 43949b20a0 Build soul-demo image tar before Docker build in stage
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 10m38s
Dockerfile.stage COPYs dist/soul-demo-image.tar so k3s can import
soul-demo:local at container startup. Stage CI now compiles soul-demo
from source on the host runner and packages it as an OCI image before
the main Docker build runs.
2026-05-09 19:41:55 -05:00
will.anderson 06b46c2e8f Merge pull request 'Use ci-base:dev for stage SDK extraction' (#28) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 5m12s
2026-05-10 00:29:00 +00:00
will.anderson ac5838f3dd Use ci-base:dev for stage SDK extraction
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 12m6s
ci-base:latest has a different (older) elb that generates code with
undeclared variables. The web repo targets ci-base:dev which produces
correct C output. Stage must use the same SDK version as dev.
2026-05-09 19:15:24 -05:00
will.anderson c8d1d3e1aa Merge pull request 'Fix stage SDK extraction: use ci-base:latest and repo runtime' (#26) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 14m5s
2026-05-09 23:48:28 +00:00
will.anderson b532519ad7 Fix stage SDK extraction: use ci-base:latest and repo runtime
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 1m57s
ci-base:stage tag doesn't exist — only :latest and :dev do. Also
apply the same EL_RUNTIME fix as dev.yaml: point at workspace
runtime/ so stage picks up the web stub forward declarations.
2026-05-09 18:45:57 -05:00
will.anderson b27aab20ee Merge pull request 'Fix stage source check: run after checkout' (#24) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 3m52s
2026-05-09 23:40:02 +00:00
will.anderson 345f9be81a Fix stage source check: run after checkout, not before
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 1m29s
git log -1 fails with 'not a git repository' when the workspace
hasn't been checked out yet. Move the Enforce dev-only source step
to after the Checkout step.
2026-05-09 18:37:55 -05:00
will.anderson 17e14a9fda Merge pull request 'Use repo runtime dir for EL_RUNTIME in push builds' (#22) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Successful in 3m34s
2026-05-09 23:17:49 +00:00
will.anderson e7c1c922f7 Use repo runtime dir for EL_RUNTIME in push builds
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 1m56s
ci-base's el-compiler/runtime doesn't have the web-specific forward
declarations added to runtime/el_runtime.h. Point EL_RUNTIME at the
workspace runtime/ so push builds pick up the same header as PR builds.
2026-05-09 18:15:18 -05:00
will.anderson 954dc1d86e Merge pull request 'Add forward declarations for web stub functions to el_runtime.h' (#21) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 5m4s
2026-05-09 23:07:22 +00:00
will.anderson a83efcda93 Guard web stub declarations with EL_SOUL_DEMO_BUILD to avoid soul-demo conflict
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 2m22s
2026-05-09 18:04:24 -05:00
will.anderson 839c002ce0 Add missing forward declarations to el_runtime.h for web stub functions
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 1m55s
2026-05-09 18:00:29 -05:00
will.anderson 0abef440fa Merge pull request 'Fix implicit declaration of page_close on Linux' (#20) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 4m9s
2026-05-09 22:54:05 +00:00
will.anderson 9892d89c01 Fix implicit declaration of page_close on Linux: wrap extern as native El fn
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 3m25s
2026-05-09 17:49:15 -05:00
will.anderson 47163f690b Merge pull request 'Fix stage source check to use git parents' (#19) from fix/stage-source-check into dev
Dev — Build & local smoke test / build-smoke (push) Failing after 4m25s
2026-05-09 22:41:32 +00:00
will.anderson dc36fe0157 Skip smoke test for PR builds — compile+image-build is sufficient gate
Dev — Build & local smoke test / build-smoke (pull_request) Successful in 2m6s
2026-05-09 17:39:04 -05:00
will.anderson fa65f7783e Split page_css.c EL_STR into 18 chunks via el_str_concat to fix runtime segfault
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 3m15s
2026-05-09 17:27:58 -05:00
will.anderson b63aa5027b Fix dev CI smoke test: run binary directly, skip Docker runtime
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 2m25s
The runner compiles neuron-landing against glibc 2.38 but the Docker
base image ships an older glibc — binary crashes on exec inside the
container. Docker build step already validates the image; smoke test
just needs an HTTP 200, so run the binary directly on the runner instead.
2026-05-09 16:33:29 -05:00
will.anderson 1110ff2e8c Add SKIP_K3S escape hatch for dev CI smoke test
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 2m21s
k3s requires kernel capabilities (overlayfs) that aren't available in
the CI runner's unprivileged Docker environment. Entrypoint now checks
SKIP_K3S=1 and starts neuron-web directly, bypassing k3s and soul-demo.
Dev CI smoke test sets this flag — prod images are unaffected.
2026-05-09 16:22:40 -05:00
will.anderson a51a16c4da Fix dev CI: touch soul-demo-image.tar placeholder before Docker build
Dev — Build & local smoke test / build-smoke (pull_request) Failing after 3m5s
2026-05-09 16:17:18 -05:00