feat: embed k3s to run soul-demo as self-healing k8s pods #9

Closed
will.anderson wants to merge 1 commit from feat/k3s-embedded-soul into stage
Owner

Summary

  • Embeds k3s (v1.32.4) in the neuron-web Docker image so soul-demo runs as a managed Kubernetes Deployment instead of a bare background process
  • k3s starts first in entrypoint.sh, imports the pre-bundled soul-demo:local OCI tar (no registry needed), and auto-applies the Deployment + NodePort Service + HPA from the server/manifests dir
  • neuron-web only starts after the soul-demo pod reports Running — clean startup sequencing
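The startup ordering above could be sketched roughly as follows. This is an illustrative outline, not the PR's actual entrypoint.sh: the `app=soul-demo` label, the 120-attempt budget, the log path, and the `/app/neuron-web` binary path are assumptions made for the example.

```shell
#!/bin/sh
# Sketch of the startup sequencing; names marked "assumed" are not from the PR.

# Retry a command up to $1 times, sleeping $2 seconds after each failed attempt.
retry() {
    attempts=$1; delay=$2; shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then return 0; fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Succeeds once at least one soul-demo pod reports phase Running.
# The label selector app=soul-demo is assumed.
soul_demo_running() {
    phases=$(k3s kubectl get pods -l app=soul-demo \
        -o jsonpath='{.items[*].status.phase}' 2>/dev/null) || return 1
    case " $phases " in
        *" Running "*) return 0 ;;
        *) return 1 ;;
    esac
}

start_stack() {
    # On startup k3s imports image tars from /var/lib/rancher/k3s/agent/images
    # and applies manifests from /var/lib/rancher/k3s/server/manifests.
    k3s server >/var/log/k3s.log 2>&1 &
    retry 120 1 soul_demo_running || {
        echo "soul-demo never became Running" >&2
        return 1
    }
    # Assumed binary path; replaces the shell so neuron-web receives signals directly.
    exec /app/neuron-web
}
```

Calling `start_stack` as the last line of the entrypoint would give the ordering described in the summary: k3s first, then a bounded wait for the pod, then neuron-web.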

Architecture changes

  • soul-demo: now a k3s Deployment (1–8 replicas, HPA at 60% CPU) that restarts automatically on crash, with liveness/readiness probes on /healthz at port 7772
  • neuron-web: unchanged, still calls localhost:7772 via the k3s NodePort service
  • Build pipeline: build-stage.sh gains a post-build step that extracts the soul-demo binary from the just-built image, builds soul-demo:local via dist/Dockerfile.soul-demo, and saves it as dist/soul-demo-image.tar, which is then COPY'd into the final image
  • Cloud Run: all deploys (stage + prod) now use --execution-environment gen2, which k3s requires because it needs /dev/kmsg and Linux capabilities that are not available on gen1 (gVisor)
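The build-pipeline and Cloud Run bullets might look roughly like this in shell. It is a sketch under assumptions, not the contents of build-stage.sh: the in-image binary path `/app/soul-demo`, the `dist` build context, and the service, image, and region arguments are all hypothetical. Setting `DOCKER=echo` or `GCLOUD=echo` prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the post-build step and the gen2 deploy flag.
# Overridable so the functions can be dry-run with DOCKER=echo / GCLOUD=echo.
DOCKER=${DOCKER:-docker}
GCLOUD=${GCLOUD:-gcloud}

build_soul_demo_image() {
    built_image=$1
    # Create a stopped container so the binary can be copied out of the image.
    cid=$($DOCKER create "$built_image") || return 1
    # /app/soul-demo is an assumed path inside the just-built image.
    $DOCKER cp "$cid:/app/soul-demo" dist/soul-demo || return 1
    $DOCKER rm "$cid" >/dev/null || return 1
    # Build the minimal image, then save it as a tar that k3s can import offline.
    $DOCKER build -f dist/Dockerfile.soul-demo -t soul-demo:local dist || return 1
    $DOCKER save -o dist/soul-demo-image.tar soul-demo:local
}

deploy_cloud_run() {
    service=$1; image=$2; region=$3
    # gen2 is the execution environment the PR says k3s requires.
    $GCLOUD run deploy "$service" --image "$image" --region "$region" \
        --execution-environment gen2
}
```

For example, `DOCKER=echo build_soul_demo_image neuron-web:dev` lists the docker commands that would run, which is a cheap way to review the step before touching a real image.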

New files

  • dist/Dockerfile.soul-demo — minimal image for soul-demo (debian:bookworm-slim + binary + snapshot)
  • dist/k3s-soul-demo.yaml — Deployment, NodePort Service (nodePort 7772), and HPA manifests
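As a rough illustration of what the three manifests might contain, the function below prints a minimal Deployment, NodePort Service, and HPA. Only the values stated in this PR (image soul-demo:local, port 7772, /healthz probes, 1–8 replicas, 60% CPU) come from the source; the `app: soul-demo` label, the 250m CPU request, and the function name are assumptions. Kubernetes' default NodePort range is 30000-32767, so a nodePort of 7772 presumably relies on the k3s server being started with a widened `--service-node-port-range`, which the PR does not show.

```shell
#!/bin/sh
# Prints an illustrative manifest; not the PR's dist/k3s-soul-demo.yaml.
print_soul_demo_manifest() {
    cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: soul-demo
spec:
  selector:
    matchLabels:
      app: soul-demo          # assumed label
  template:
    metadata:
      labels:
        app: soul-demo
    spec:
      containers:
        - name: soul-demo
          image: soul-demo:local
          imagePullPolicy: Never   # image comes from the bundled tar, not a registry
          ports:
            - containerPort: 7772
          resources:
            requests:
              cpu: 250m            # assumed; utilization-based HPA needs a CPU request
          livenessProbe:
            httpGet:
              path: /healthz
              port: 7772
          readinessProbe:
            httpGet:
              path: /healthz
              port: 7772
---
apiVersion: v1
kind: Service
metadata:
  name: soul-demo
spec:
  type: NodePort
  selector:
    app: soul-demo
  ports:
    - port: 7772
      targetPort: 7772
      nodePort: 7772
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: soul-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: soul-demo
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
EOF
}
```

The Deployment omits `replicas` on purpose in this sketch, so that re-applying the manifest does not reset a replica count chosen by the HPA.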

Test plan

  • Run ./build-stage.sh dev locally — verify dist/soul-demo-image.tar is produced and sized reasonably
  • Smoke test stage deployment: confirm GET /healthz on the Cloud Run URL returns 200 after k3s + soul-demo start
  • Verify k3s kubectl get pods shows soul-demo Running inside the container (docker exec)
  • Kill the soul-demo process inside the container and confirm k3s restarts it within 15s
  • Confirm neuron-web chat still reaches soul-demo at localhost:7772
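Two of the checks above (the /healthz smoke test and the restart-within-15s check) could be scripted along these lines. This is a sketch, assuming the pod carries an `app=soul-demo` label and that a restart is observable as an increase in the container's restartCount; neither detail is confirmed by the PR.

```shell
#!/bin/sh
# Helper sketches for the test plan; the label selector is an assumption.

# Succeeds when a GET on the given URL returns HTTP 200.
healthz_ok() {
    code=$(curl -s -o /dev/null -w '%{http_code}' "$1") || return 1
    [ "$code" = "200" ]
}

# Prints the restart count of the first soul-demo pod's first container.
restart_count() {
    k3s kubectl get pods -l app=soul-demo \
        -o jsonpath='{.items[0].status.containerStatuses[0].restartCount}'
}

# Polls once per second for up to $2 seconds until the restart count exceeds $1.
restarted_within() {
    before=$1; budget=$2
    elapsed=0
    while [ "$elapsed" -le "$budget" ]; do
        now=$(restart_count) || now=$before
        if [ "${now:-0}" -gt "$before" ]; then return 0; fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

Inside the container, one might record `before=$(restart_count)`, kill the soul-demo process, and then run `restarted_within "$before" 15`; against the stage deployment, `healthz_ok` would be pointed at the Cloud Run URL followed by `/healthz`.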
will.anderson added 1 commit 2026-05-07 05:57:24 +00:00
soul-demo now runs as a k3s Deployment with HPA (1–8 replicas, 60% CPU
target) instead of a bare background process. k3s starts first in
entrypoint.sh, imports the soul-demo:local OCI tar from
/var/lib/rancher/k3s/agent/images, and auto-applies the Deployment,
NodePort Service, and HPA from the server/manifests dir. neuron-web
starts only after the soul-demo pod is Running. Cloud Run gen2 execution
environment required for k3s (provides /dev/kmsg and Linux capabilities).
will.anderson closed this pull request 2026-05-07 06:10:07 +00:00

Reference: neuron-technologies/neuron-web#9