Cloud Run gen2 doesn't provide eth0 with a unicast IP, causing k3s flannel
to crash on every container start. k3s was also the wrong architecture for
Cloud Run (HPA inside a container, k3s overhead for one process).

Changes:
- entrypoint.sh: replace k3s server with a bash watchdog loop that starts
soul-demo directly and restarts it on crash (3s backoff)
- Dockerfile.stage: remove k3s binary, soul-demo-image.tar, k3s manifests
and their associated dirs/envvars; keep soul-demo binary only
- stage.yaml: remove 'Download k3s binary' step; rename and simplify
soul-demo build step to compile binary only (no OCI image/tar)
- dev.yaml: update soul-demo placeholder step (binary not tar)
- manifest.el: document HAVE_CURL requirement since manifest.el has no
c_flags/link_flags directive support
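
The watchdog described in the entrypoint.sh change could look roughly like the sketch below. This is an illustration under assumptions, not the actual entrypoint.sh: the function name watchdog and the WATCHDOG_BACKOFF / WATCHDOG_MAX_RUNS variables are hypothetical, and the run cap exists only so the loop can be exercised in a test.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog: run a command and restart it after a fixed backoff
# whenever it exits. Names here are illustrative, not from entrypoint.sh.
watchdog() {
  local backoff="${WATCHDOG_BACKOFF:-3}"     # seconds between restarts (3s per the commit)
  local max_runs="${WATCHDOG_MAX_RUNS:-0}"   # 0 means restart forever
  local runs=0 status=0
  while :; do
    # Capture the exit status without tripping `set -e` in the caller.
    "$@" && status=0 || status=$?
    runs=$((runs + 1))
    echo "[watchdog] $1 exited with status ${status} (run ${runs})" >&2
    if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
      return "$status"
    fi
    sleep "$backoff"
  done
}
```

In an entrypoint this would be called as something like `watchdog soul-demo &` before the main server starts, assuming soul-demo is on PATH.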

k3s fails to start in Cloud Run gen2 with "unable to select an IP from
default routes" because Cloud Run's network sandbox doesn't expose a
standard default route for k3s to detect. The blocking wait on k3s
prevented neuron-web from ever binding port 8080, causing Cloud Run's
startup probe to time out and terminate the container.

Two changes:
1. Add --flannel-iface=eth0 so k3s pins to Cloud Run's eth0 rather than
walking the routing table to detect a default-route interface.
2. Start neuron-web immediately after launching k3s in the background.
soul-demo becomes available asynchronously; neuron-web gracefully
handles soul-demo being temporarily unavailable.

k3s requires kernel features (such as overlayfs) that aren't available in
the CI runner's unprivileged Docker environment. The entrypoint now checks
for SKIP_K3S=1 and, when it is set, starts neuron-web directly, bypassing
k3s and soul-demo. The dev CI smoke test sets this flag; prod images are
unaffected.
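
A minimal sketch of the SKIP_K3S check, assuming the flag is read from the environment and compared against the literal value 1. The helper names k3s_enabled and startup_mode are hypothetical and only serve to make the branch inspectable.

```shell
#!/usr/bin/env bash
# Sketch of the SKIP_K3S branch. Helper names are illustrative and are not
# copied from entrypoint.sh.

# Succeeds (exit 0) unless SKIP_K3S is exactly "1".
k3s_enabled() {
  [ "${SKIP_K3S:-0}" != "1" ]
}

# Echo which startup path would be taken, so the decision can be checked.
startup_mode() {
  if k3s_enabled; then
    echo "k3s+neuron-web"
  else
    echo "neuron-web-only"
  fi
}
```

An entrypoint built on this would launch `k3s server --flannel-iface=eth0 &` only when `k3s_enabled` succeeds, then start neuron-web in either case.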

soul-demo now runs as a k3s Deployment with HPA (1–8 replicas, 60% CPU
target) instead of a bare background process. k3s starts first in
entrypoint.sh, imports the soul-demo:local OCI tar from
/var/lib/rancher/k3s/agent/images, and auto-applies the Deployment,
NodePort Service, and HPA from the server/manifests dir. neuron-web
starts only after the soul-demo pod is Running. The Cloud Run gen2
execution environment is required for k3s (it provides /dev/kmsg and the
needed Linux capabilities).

The wrapper now logs the response and returns a structured ok/error
shape. Four call sites converge on a single send_email helper.
Resend deliveries verified end to end against
will.anderson@neurontechnologies.ai (delivery IDs 492fa066, 74258223,
69a3d9ab, f6d1c889).

Root cause: http_post_auth in dist/web_stubs.c only set the
Authorization: Bearer header. Resend rejects requests without
Content-Type: application/json with HTTP 422 missing_required_field
because it parses the body as form-urlencoded. The 422 response was
being captured by the El handler but not parsed, so callers logged
the error body and still returned 200 OK to the client. Two endpoints
also built malformed JSON by interpolating the raw request body unquoted
into the text field.

Fix:
- Added http_post_auth_json (Bearer + Content-Type: application/json)
alongside http_post_auth in dist/web_stubs.c. Stripe form-POST
callers stay on http_post_auth; JSON callers (Resend now, others
later) move to the json variant.
- New send_email(from_addr, to, subject, html, text) wrapper in
src/main.el. It JSON-escapes all user-provided fields, parses the
Resend response into a structured ok/error envelope, and prints the
outcome with println ([email] sent id=<id>) for Cloud Run log surfaces.
- Refactored the call sites onto the wrapper: the four endpoints
/api/enterprise-inquiry, /api/developer-interest, /api/waitlist, and
/api/attest, plus the family invite branch in /api/family/invite and
both DocuSeal completion branches in /api/docuseal/webhook/<token>.
- Previously untracked dist/ source files (web_stubs.c, vessel_stubs.c,
soul-demo.c, entrypoint.sh, engram-snapshot.json) are now committed;
generated artifacts (main.c, binaries) stay ignored. Without this,
the next CI rebuild would regress the fix.
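
To illustrate the escaping half of the fix, here is a small shell sketch of building the email payload with every field JSON-escaped instead of interpolated raw. It is not the El implementation of send_email: json_escape and email_payload are hypothetical helpers, and the escaping covers only backslash, double quote, newline, carriage return, and tab, not the remaining control characters.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; the real wrapper lives in src/main.el.

# Escape a string for use inside a JSON string literal.
# Handles backslash, double quote, newline, carriage return, and tab.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\r'/\\r}
  s=${s//$'\t'/\\t}
  printf '%s' "$s"
}

# Build a JSON body from (from, to, subject, html, text), escaping each field.
email_payload() {
  printf '{"from":"%s","to":"%s","subject":"%s","html":"%s","text":"%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")" \
    "$(json_escape "$4")" "$(json_escape "$5")"
}
```

The resulting body would then be POSTed with both an Authorization: Bearer header and Content-Type: application/json, which is what http_post_auth_json adds over http_post_auth.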