Rename inference endpoint to neuron.neurontechnologies.ai, add wildcard DNS for customer orgs

Will Anderson
2026-04-28 11:15:11 -05:00
parent 9e2646701c
commit 1a8e039ecf
15 changed files with 166 additions and 294 deletions
+118 -249
@@ -6,220 +6,172 @@ Quick reference for AI agents working in Will's infrastructure. Read this before
## Machines
| Machine | Role | IP | SSH |
|---------|------|----|-----|
| Local Mac | Dev workstation | 192.168.68.55 | — |
| Legion | Ubuntu 24.04, k3s server | 192.168.68.77 | `ssh legion` |
| Machine | Role |
|---------|------|
| Local Mac | Dev workstation (primary) |
Legion runs all production services: k3s, Postgres, Redis, Ollama (GPU), Docker Registry, GitHub Actions runner, Vault, Gitea, AdGuard, Neuron.
Legion is gone. All production services now run on GCP.
---
## Secrets — Always Use Vault
## Production Environment — GCP
**Vault is the source of truth for all credentials.** Never hardcode secrets. Never ask the user for a secret value.
All production services run on Google Cloud Platform, project `neuron-785695`.
### CLI access (on local Mac)
| Service | URL | Platform |
|---------|-----|----------|
| Marketing site | https://neurontechnologies.ai | Cloud Run (3 regions) |
| Accounts | https://accounts.neurontechnologies.ai | Cloud Run (3 regions) |
| REST API | https://api.neurontechnologies.ai | Cloud Run (3 regions) |
| Soma (AI inference gateway) | https://ai.neurontechnologies.ai | Cloud Run (us-central1) |
### Artifact Registry
Images are pushed to `us-central1-docker.pkg.dev/neuron-785695/`:
| Service | Registry path |
|---------|---------------|
| Marketing | `neuron-marketing/marketing` |
| Accounts | `neuron-accounts/accounts` |
| API | `neuron-api/api` |
| Soma | `neuron-soma/soma` |
### GCP CLI
```bash
# Load all secrets into env
set -a; source ~/Secrets/credentials/infrastructure.env; set +a
# Or with direnv (auto-loads in legion/ directory):
cd ~/Development/infrastructure/servers/legion/
# direnv auto-sources infrastructure.env
# Friendly alias (after sourcing infrastructure.env):
secret civitai # → civitai API key
secret slack-bot-token # → Slack bot token
secret cf-key # → Cloudflare API key
# Direct vault lookup:
vault kv get -field=api_key secret/ai # CivitAI
vault kv get -field=bot_token secret/slack # Slack
vault kv get -field=api_key secret/cloudflare # Cloudflare
gcloud auth login
gcloud config set project neuron-785695
gcloud auth configure-docker us-central1-docker.pkg.dev --quiet
```
Full path+field table is in `~/Secrets/credentials/infrastructure.env` — look at the `_v()` calls.
---
### Vault connection
## Secrets — GCP Secret Manager
**GCP Secret Manager is the source of truth for all production secrets.**
### Access via gcloud CLI
```bash
VAULT_ADDR=https://vault.neuralplatform.ai
VAULT_TOKEN=$(cat ~/Secrets/tokens/vault-root-token)
gcloud secrets versions access latest --secret=<secret-name>
```
If Vault is unreachable (CF tunnel down), port-forward:
```bash
ssh legion -L 8200:localhost:8200 # in background
export VAULT_ADDR=http://localhost:8200
```
### Secrets in use
### How secrets flow to pods
| Secret name | Purpose |
|-------------|---------|
| `jwt-secret` | JWT signing key (accounts, api) |
| `license-admin-token` | License admin token (api) |
| `accounts-database-url` | Cloud SQL connection string (accounts) |
| `soma-hf-token` | HuggingFace API token (soma) |
| `soma-operator-key` | Soma operator/service key (soma) |
| `stripe-secret-key` | Stripe API key |
| `stripe-webhook-secret` | Stripe webhook signing secret |
| `stripe-price-professional` | Stripe Professional plan price ID |
| `stripe-price-founding` | Stripe Founding Member price ID |
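
As a minimal sketch, one of the secrets listed above could be read into a shell session without printing it. The helper name `load_secret` and the variable name in the usage line are hypothetical; only the `gcloud secrets versions access` call comes from this document.

```bash
# Hypothetical helper: export a Secret Manager value as an environment variable.
# Usage: load_secret <ENV_VAR_NAME> <secret-name>
load_secret() {
  local var_name="$1" secret_name="$2" value
  # Fetch the latest version; return early without exporting if gcloud fails.
  value="$(gcloud secrets versions access latest --secret="$secret_name")" || return 1
  export "$var_name=$value"
}
```

For example, `load_secret JWT_SECRET jwt-secret` would make the signing key available to child processes of the current shell only.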
### How secrets flow to Cloud Run
Secrets are injected as environment variables through Secret Manager references in the Terraform definition of each Cloud Run service. ESO and Vault are no longer involved.
```
Vault (source of truth) → ExternalSecret (in git) → ESO operator → k8s Secret → pod env
GCP Secret Manager → Terraform secret_key_ref → Cloud Run env var → container env
```
All k8s Secrets are managed by External Secrets Operator (ESO) pulling from Vault.
No secrets are stored in Terraform. Do not add `kubernetes_secret` resources to Terraform.
---
## Infrastructure Management
### The split: Terraform vs Argo CD
### The split: Terraform vs Cloud Run
| Layer | Tool | What it owns |
|-------|------|-------------|
| Infrastructure | **Terraform** | Namespaces, Cloudflare DNS/tunnels, R2 storage, Vault bootstrap |
| Configuration | **Argo CD** | All k8s resources: Deployments, Services, ConfigMaps, Ingresses, PVCs, ExternalSecrets, Helm releases |
| Secrets | **ESO → Vault** | All k8s Secrets — ExternalSecret manifests in git, values pulled from Vault at runtime |
| Infrastructure | **Terraform** | Cloud Run services, Artifact Registry, Secret Manager secrets, Load balancer, SSL certs, Cloudflare DNS, Cloud SQL, IAM |
| Secrets | **Secret Manager** | All production secrets — referenced in Terraform, never hardcoded |
**Never modify k8s resources directly with kubectl.** Always go through Terraform or Argo CD (Git push).
**Never deploy Cloud Run changes with `gcloud run deploy` directly.** All changes go through Terraform.
**Secrets flow:**
```
Vault (source of truth) → ExternalSecret (in git) → ESO operator → k8s Secret → pod env
```
Adding a new secret:
1. `vault kv put secret/<path> field=value`
2. Add ExternalSecret manifest to `k8s/<service>/` referencing that path
3. Push → Argo CD applies ExternalSecret → ESO syncs Secret within ~60s
### Running Terraform
### Running Terraform (GCP)
```bash
cd ~/Development/infrastructure/servers/legion/
terraform plan # uses direnv to load credentials
cd ~/Development/infrastructure/servers/gcp/
# direnv auto-loads credentials if configured, or:
export GOOGLE_APPLICATION_CREDENTIALS=~/.config/gcloud/application_default_credentials.json
terraform plan
terraform apply
```
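
One way to make sure the applied changes are exactly the ones that were reviewed is to save the plan and apply that file. The sketch below assumes this workflow is wanted; the function name `tf_plan_then_apply` and the plan file name `tfplan` are hypothetical, while `terraform plan -out` and `terraform apply <planfile>` are standard Terraform CLI usage.

```bash
# Hypothetical wrapper: write a plan file, ask for confirmation, apply that exact plan.
# The directory defaults to the GCP Terraform path used in this document.
tf_plan_then_apply() {
  local dir="${1:-$HOME/Development/infrastructure/servers/gcp}"
  (
    # Subshell so the caller's working directory is left unchanged.
    cd "$dir" || exit 1
    terraform plan -out=tfplan || exit 1
    printf 'Apply this plan? [y/N] '
    read -r answer
    # Any answer other than "y" skips the apply and returns non-zero.
    [ "$answer" = "y" ] && terraform apply tfplan
  )
}
```

The saved `tfplan` file can contain sensitive values, so it should be treated like state and kept out of Git.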
State is stored in Cloudflare R2 bucket `legion-terraform-state`.
### Argo CD
Root app `legion-apps` watches `servers/legion/apps/` in the `will/infrastructure` Gitea repo.
Push to main → Argo CD syncs within ~30 seconds.
- App manifests: `~/Development/infrastructure/servers/legion/apps/*.yaml`
- k8s config: `~/Development/infrastructure/servers/legion/k8s/<service>/*.yaml`
- Argo CD UI: https://argocd.neuralplatform.ai
- Gitea CLI: `tea pr ls`, `tea issue ls` (configured, default login: neuralplatform)
State is stored in Cloudflare R2 bucket `legion-terraform-state` (key: `gcp/terraform.tfstate`).
### Repo layout
```
~/Development/
infrastructure/ ← this repo (local only, no GitHub remote)
infrastructure/ ← this repo
AGENTS.md ← you are here
servers/ ← nested repo (will/infrastructure on Gitea)
legion/
*.tf ← Terraform (infra layer: namespaces, Cloudflare, R2)
apps/*.yaml ← Argo CD Application manifests (app definitions)
k8s/<service>/*.yaml ← k8s config synced by Argo CD (PVCs, ExternalSecrets, etc.)
neural-platform/
neuron/ ← github: harmonic-framework/neuron, gitea: neural-platform/neuron
harmonic-framework/
harmonic-framework.com/ ← github: harmonic-framework/harmonic-framework.com
projects/
personal/
prism/ ← github: harmonic-framework/prism, gitea: neural-platform/prism
clients/
ilih.life/ ← github: harmonic-framework/ilih.life, gitea: will/ilih.life
servers/
gcp/
*.tf ← Terraform (all GCP infra)
neuron-technologies/
soma/ ← Soma AI inference gateway (Rust)
neuron-rest/ ← REST API (Go)
accounts/ ← Accounts service (Go)
marketing/ ← Marketing site (Next.js)
```
---
## CI/CD
### Gitea Actions (neuralplatform.ai was on Legion — now gone)
Gitea is gone with Legion. Use GitHub Actions for CI/CD.
- Workflows: `.github/workflows/`
- Push images to GCP Artifact Registry from CI using the `neuron-ci-pusher` service account
- CI SA key stored as `GCP_SA_KEY` repo secret (JSON key from `neuron-ci-pusher@neuron-785695.iam.gserviceaccount.com`)
### Docker push pattern
```bash
gcloud auth configure-docker us-central1-docker.pkg.dev --quiet
docker buildx build --platform linux/amd64 \
-t us-central1-docker.pkg.dev/neuron-785695/<repo>/<service>:latest \
--push .
```
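
The `<repo>/<service>` placeholders in the push pattern can be filled in from the Artifact Registry table in this document. The following is a small sketch of that mapping; the helper name `image_ref` is hypothetical, and the registry host, project, and repo paths are the ones listed in that table.

```bash
# Registry host and project as given in this document.
REGISTRY="us-central1-docker.pkg.dev/neuron-785695"

# Hypothetical helper: print the full image reference for a service.
# Usage: image_ref <service> [tag]   (tag defaults to "latest")
image_ref() {
  local service="$1" tag="${2:-latest}" path
  case "$service" in
    marketing) path="neuron-marketing/marketing" ;;
    accounts)  path="neuron-accounts/accounts" ;;
    api)       path="neuron-api/api" ;;
    soma)      path="neuron-soma/soma" ;;
    *) echo "unknown service: $service" >&2; return 1 ;;
  esac
  printf '%s/%s:%s\n' "$REGISTRY" "$path" "$tag"
}
```

It could then be used as `docker buildx build --platform linux/amd64 -t "$(image_ref soma)" --push .`, which avoids retyping the registry path per service.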
---
## Services & Domains
### Family (nook.family — via Cloudflare tunnel)
| Domain | Backend | Notes |
|--------|---------|-------|
| neurontechnologies.ai | GCP LB → Cloud Run | Marketing site, 3 regions |
| www.neurontechnologies.ai | GCP LB → Cloud Run | Marketing site alias |
| api.neurontechnologies.ai | GCP LB → Cloud Run | REST API, 3 regions |
| accounts.neurontechnologies.ai | GCP LB → Cloud Run | Accounts service, 3 regions |
| ai.neurontechnologies.ai | GCP LB → Cloud Run | Soma inference gateway, us-central1 |
| Service | URL | Notes |
|---------|-----|-------|
| AdGuard | https://dns.nook.family | DNS + ad blocking (DoH: https://dns.nook.family/dns-query) |
### Platform (neuralplatform.ai — via Cloudflare tunnel)
| Service | URL | Notes |
|---------|-----|-------|
| Argo CD | https://argocd.neuralplatform.ai | GitOps UI |
| Vault | https://vault.neuralplatform.ai | Secrets |
| Gitea | https://git.neuralplatform.ai | Git server |
| Ollama | https://ollama.neuralplatform.ai | LLM API |
| Neuron MCP | https://neuron.neuralplatform.ai | MCP server (port 8001) |
| Axon webhooks | https://axon.neuralplatform.ai | Webhook hub (port 3847) |
| npm registry | https://npm.neuralplatform.ai | Verdaccio |
| PyPI registry | https://pypi.neuralplatform.ai | devpi |
| Docker Registry | https://registry.neuralplatform.ai | Push images here |
| Registry UI | https://docker.neuralplatform.ai | Docker registry browser |
| Headscale VPN | https://vpn.neuralplatform.ai | Tailscale control plane (direct TLS, not CF-proxied) |
| Grafana | https://grafana.neuralplatform.ai | Metrics + logs dashboards |
| Prometheus | — | Metrics (kube-prometheus-stack, internal only) |
| Alertmanager | — | Alert routing → Slack (internal only) |
| Alloy | https://alloy.neuralplatform.ai | OTLP ingest for Loki/Tempo |
### VPN (Headscale / Tailscale)
Headscale runs at `vpn.neuralplatform.ai` (DNS-only, no CF proxy — required for Tailscale TS2021 WebSocket upgrades). Magic DNS base domain: `ts.neuralplatform.ai`. DNS resolvers: `192.168.68.77` (AdGuard) + `1.1.1.1`.
### NodePort services (direct to Legion IP)
| Service | Port | Notes |
|---------|------|-------|
| Gitea SSH | 30022 | `git@gitea:org/repo.git` (via `Host gitea` SSH config) |
| Ollama API | 31434 | `http://192.168.68.77:31434` |
### Gitea SSH config (Mac `~/.ssh/config`)
```
Host gitea
HostName 192.168.68.77
Port 30022
User git
IdentityFile ~/.ssh/id_ed25519
```
Use for git remotes: `git@gitea:will/infrastructure.git`. Works on LAN and over Headscale VPN.
All domains share one global anycast IP (`marketing-ip-prod`). DNS A records in Cloudflare point to that IP.
---
## Access Patterns
## Load Balancer
### SSH
- Global external HTTPS load balancer (`marketing-urlmap-prod`)
- Single IP (`google_compute_global_address.prod`)
- Host-based routing via URL map
- SSL certs: Google-managed, auto-renewed
- Cloud Armor: SQLi/XSS protection + rate limiting (1000 req/min per IP)
- CDN enabled for marketing site only
```bash
ssh legion # Ubuntu 24.04, user: will
ssh legion "kubectl get pods -A" # run kubectl remotely
```
---
### kubectl (local)
## Terraform Cloudflare DNS
```bash
export KUBECONFIG=~/.kube/legion-config
kubectl get pods -A
kubectl logs -n neuron deployment/neuron
```
Cloudflare DNS A records for `neurontechnologies.ai` subdomains are managed in the GCP Terraform (`servers/gcp/`). The Cloudflare provider is configured via environment variable `CLOUDFLARE_API_TOKEN`.
### Gitea CLI (tea)
Use `tea` (installed on both Mac and Legion, default login: `neuralplatform`):
```bash
tea repo ls # list repos
tea pr ls --repo will/infrastructure
tea issue ls --repo neural-platform/neuron
```
### Gitea API (direct)
CF Access blocks direct calls from Mac. Use `tea` or SSH to Legion:
```bash
TOKEN=$(vault kv get -field=api_token secret/gitea)
ssh legion "curl -s -H 'Authorization: token $TOKEN' http://10.43.1.53:3000/api/v1/repos/search?limit=50"
```
Zone ID for `neurontechnologies.ai`: `e844374f203dca4944d77d40ca0710ae`
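
For a read-only check of what Terraform has created in this zone, the Cloudflare API can be queried directly. This is a sketch only, assuming `CLOUDFLARE_API_TOKEN` has read access to DNS for the zone; the function names are hypothetical, and changes should still go through Terraform.

```bash
# Zone ID for neurontechnologies.ai, as stated in this document.
ZONE_ID="e844374f203dca4944d77d40ca0710ae"

# Hypothetical helper: build the Cloudflare API URL that lists records of one type.
dns_records_url() {
  local type="${1:-A}"
  printf 'https://api.cloudflare.com/client/v4/zones/%s/dns_records?type=%s\n' "$ZONE_ID" "$type"
}

# Hypothetical helper: fetch the records as JSON using the same token Terraform reads.
list_dns_records() {
  if [ -z "${CLOUDFLARE_API_TOKEN:-}" ]; then
    echo "CLOUDFLARE_API_TOKEN is not set" >&2
    return 1
  fi
  curl -sS -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" "$(dns_records_url "${1:-A}")"
}
```

Running `list_dns_records A` prints the JSON response for the zone's A records, which can be compared against `terraform plan` output.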
---
@@ -228,99 +180,16 @@ ssh legion "curl -s -H 'Authorization: token $TOKEN' http://10.43.1.53:3000/api/
| What | Where |
|------|-------|
| This repo | `~/Development/infrastructure/` |
| Infrastructure (Terraform + Argo CD) | `~/Development/infrastructure/servers/` |
| Neural Platform projects | `~/Development/neural-platform/` |
| Harmonic Framework projects | `~/Development/harmonic-framework/` |
| Personal projects | `~/Development/projects/personal/` |
| Client projects | `~/Development/projects/clients/` |
| Knowledge base | `~/Knowledge/` |
| Secrets & tokens | `~/Secrets/` |
| Infrastructure env | `~/Secrets/credentials/infrastructure.env` |
| kubeconfig | `~/.kube/legion-config` |
| Agent directives | `~/.claude/projects/-Users-will/memory/directives.md` |
| Agent memory index | `~/.claude/projects/-Users-will/memory/MEMORY.md` |
---
## GitHub / Gitea
**GitHub** (`github.com/harmonic-framework`) — open-source, CI runners, public-facing
**Gitea** (`git.neuralplatform.ai`) — private/infra, four orgs:
- `will` — personal infra (`will/infrastructure`)
- `neural-platform` — platform projects (`neural-platform/neuron`, `neural-platform/prism`)
- `harmonic-framework` — org site/assets (`harmonic-framework/harmonic-framework.com`)
- `contexthub` — archived reference repos (ContextHub, shut down 2021)
Runner labels: `self-hosted,linux,x64,legion`
---
## Namespaces (k8s)
| Namespace | What |
|-----------|------|
| `dns` | AdGuard (DNS + ad-blocking, port 53) |
| `git` | Gitea |
| `neuron` | Neuron + cloudflared |
| `ollama` | Ollama (GPU inference) |
| `ci` | GitHub Actions runner |
| `packages` | Verdaccio (npm) + devpi (PyPI) |
| `registry` | Docker registry + UI |
| `platform` | Postgres, Redis |
| `monitoring` | kube-prometheus-stack (Prometheus, Grafana, Alertmanager) + Loki + Tempo + Alloy |
| `headscale` | Headscale VPN control plane |
| `vault` | HashiCorp Vault |
| `argocd` | Argo CD |
| `cert-manager` | cert-manager (Let's Encrypt via HTTP-01) |
| `external-secrets` | External Secrets Operator — syncs Vault secrets → k8s Secrets |
---
## Common Operations
### Deploy a config change to an app
1. Edit `servers/legion/apps/<app>.yaml` or `servers/legion/k8s/<service>/<file>.yaml`
2. `git commit && git push` (from `servers/`)
3. Argo CD syncs in ~30s
### Add a new secret to a pod
1. Store the secret: `vault kv put secret/path field=value`
2. Add/update an ExternalSecret in `servers/legion/k8s/<service>/external-secrets.yaml`
3. Reference `secretKeyRef: name: <secret-name>` in the app manifest
4. Push → ESO syncs the k8s Secret within ~60s
### Add a new Helm release
Add an Argo CD Application manifest to `servers/legion/apps/<name>.yaml` with `spec.source.chart` and `spec.source.helm.values`. Push to deploy.
### Rotate a secret
1. `vault kv patch secret/path field=newvalue`
2. Force ESO refresh: `kubectl annotate externalsecret -n <ns> <name> force-sync=$(date +%s) --overwrite`
3. Restart pod if needed: `kubectl rollout restart deployment/<name> -n <ns>`
---
## Network
- Router: TP-Link Deco BE5000 mesh, router mode
- Subnet: `192.168.68.x`
- DHCP DNS: `192.168.68.77` (AdGuard) + `1.1.1.1` fallback
- AdGuard provides `*.nook.family` resolution → `192.168.68.77`
- Cloudflare tunnel ID: `54bc9b05-3953-47a2-9c3e-adecdcc53d51`
- systemd-resolved is **disabled** on Legion (conflicts with AdGuard port 53)
| GCP Terraform | `~/Development/infrastructure/servers/gcp/` |
| Soma (inference gateway) | `~/Development/neuron-technologies/soma/` |
| GCP credentials | `~/.config/gcloud/` |
| Secrets/tokens | `~/Secrets/` |
---
## What NOT To Do
- `kubectl apply` directly — push to Git and let Argo CD sync
- `kubectl edit` — edit the file in `servers/legion/`, push to Git
- Hardcode any secret value — store in Vault, reference via ExternalSecret
- Add `kubernetes_secret` to Terraform — secrets belong in ESO/Vault, not Terraform
- `gcloud run deploy` directly — use Terraform
- Hardcode any secret value — store in GCP Secret Manager
- Commit `terraform.tfstate` — state is in R2, never local
- Call Gitea API from Mac directly — CF Access blocks it; use `tea` CLI or SSH to Legion
- Use `var.X` pattern for secrets in Terraform — Terraform no longer owns k8s Secrets
- Add `kubernetes_*` resources to Terraform — there is no k8s cluster anymore
+6 -2
@@ -313,10 +313,14 @@ resource "google_compute_backend_service" "soma" {
# ── SSL Certificate ───────────────────────────────────────────────────────────
resource "google_compute_managed_ssl_certificate" "soma" {
name = "soma-cert-prod"
name = "soma-cert-prod-v2"
project = var.project_id
managed {
domains = ["ai.neurontechnologies.ai"]
domains = ["neuron.neurontechnologies.ai"]
}
lifecycle {
create_before_destroy = true
}
}
+16 -3
@@ -5,11 +5,24 @@
# Cloudflare provider reads CLOUDFLARE_API_TOKEN from env.
# Zone ID for neurontechnologies.ai is set in terraform.tfvars.
# ── ai.neurontechnologies.ai → Soma inference gateway ────────────────────────
# ── neuron.neurontechnologies.ai → Soma inference gateway ────────────────────
resource "cloudflare_record" "soma_ai" {
resource "cloudflare_record" "soma_neuron" {
zone_id = var.cloudflare_zone_id_neurontechnologies
name = "ai"
name = "neuron"
type = "A"
content = google_compute_global_address.prod.address
proxied = true
ttl = 1
}
# ── *.neurontechnologies.ai → Customer org subdomains ────────────────────────
# Enterprise customers get {org}.neurontechnologies.ai routed to soma.
# Soma reads the Host header to identify the tenant.
resource "cloudflare_record" "soma_wildcard" {
zone_id = var.cloudflare_zone_id_neurontechnologies
name = "*"
type = "A"
content = google_compute_global_address.prod.address
proxied = true
+1 -1
@@ -206,7 +206,7 @@ resource "google_compute_url_map" "prod" {
}
host_rule {
hosts = ["ai.neurontechnologies.ai"]
hosts = ["neuron.neurontechnologies.ai", "*.neurontechnologies.ai"]
path_matcher = "soma"
}
+2
@@ -1 +1,3 @@
cloudflare_zone_id_neurontechnologies = "e844374f203dca4944d77d40ca0710ae"
cloudflare_api_key = "007bbefe03eac0e502c339423c50dd911776a"
cloudflare_email = "andersonwilliam85@gmail.com"
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -16,3 +16,7 @@ spec:
remoteRef:
key: secret/data/neuron-technologies/dev
property: gitea_webhook_secret
- secretKey: MARKETPLACE_DB_PASSWORD
remoteRef:
key: secret/data/legion-db
property: postgres_password
@@ -16,3 +16,7 @@ spec:
remoteRef:
key: secret/data/neuron-technologies/stage
property: gitea_webhook_secret
- secretKey: MARKETPLACE_DB_PASSWORD
remoteRef:
key: secret/data/legion-db
property: postgres_password
@@ -51,6 +51,10 @@ spec:
remoteRef:
key: secret/data/r2
property: neuron_bucket
- secretKey: MARKETPLACE_DB_PASSWORD
remoteRef:
key: secret/data/legion-db
property: postgres_password
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
@@ -102,6 +106,10 @@ spec:
remoteRef:
key: secret/data/r2
property: neuron_bucket
- secretKey: MARKETPLACE_DB_PASSWORD
remoteRef:
key: secret/data/legion-db
property: postgres_password
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
@@ -153,3 +161,7 @@ spec:
remoteRef:
key: secret/data/r2
property: neuron_bucket
- secretKey: MARKETPLACE_DB_PASSWORD
remoteRef:
key: secret/data/legion-db
property: postgres_password
@@ -26,19 +26,6 @@ spec:
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
initContainers:
- name: wait-for-sqlite-clone
image: busybox:1.36
command: ["sh", "-c", "until [ -f /data/neuron.db ]; do echo 'waiting for SQLite clone...'; sleep 2; done; echo 'SQLite ready'"]
volumeMounts:
- name: data
mountPath: /data
securityContext:
runAsNonRoot: true
runAsUser: 1001
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
containers:
- name: neuron-mcp
# Image is updated by swarm CI on each loop iteration
@@ -51,6 +38,7 @@ spec:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop: ["ALL"]
env:
@@ -26,19 +26,6 @@ spec:
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
initContainers:
- name: wait-for-sqlite-clone
image: busybox:1.36
command: ["sh", "-c", "until [ -f /data/neuron.db ]; do echo 'waiting for SQLite clone...'; sleep 2; done; echo 'SQLite ready'"]
volumeMounts:
- name: data
mountPath: /data
securityContext:
runAsNonRoot: true
runAsUser: 1001
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
containers:
- name: neuron-mcp
image: registry.neuralplatform.ai/neuron-technologies/neuron-mcp:v0.18.14
@@ -50,6 +37,7 @@ spec:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop: ["ALL"]
env:
@@ -26,19 +26,6 @@ spec:
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
initContainers:
- name: wait-for-sqlite-clone
image: busybox:1.36
command: ["sh", "-c", "until [ -f /data/neuron.db ]; do echo 'waiting for SQLite clone...'; sleep 2; done; echo 'SQLite ready'"]
volumeMounts:
- name: data
mountPath: /data
securityContext:
runAsNonRoot: true
runAsUser: 1001
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
containers:
- name: neuron-mcp
image: registry.neuralplatform.ai/neuron-technologies/neuron-mcp:v0.18.14
@@ -50,6 +37,7 @@ spec:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop: ["ALL"]
env: