feat(deploy): docker + helm chart + CI/CD for staging by themightychris · Pull Request #35 · CodeForPhilly/codeforphilly-ng

themightychris · 2026-05-16T23:39:01Z

Summary

Implements the deploy plan (plans/deploy.md) so the team can stand up a staging environment and follow the same template into production.

- Multi-stage Dockerfile at the repo root plus an entrypoint script that clones or refreshes CFP_DATA_REMOTE and then exec's node. The final image is non-root alpine with git, ca-certificates, tini, and openssh-client.
- Helm chart at deploy/charts/codeforphilly/ with values.yaml / values.staging.yaml / values.production.yaml. One replica, Recreate strategy, a PVC for the data working tree, an optional PVC for the filesystem private-store (staging), and a readiness probe at /api/health/ready.
- GitHub Actions deploy-staging.yml (push to main → build + helm upgrade) and deploy-production.yml (tag push → build + helm upgrade), both gated by per-environment GitHub Environments and KUBECONFIG_* secrets.
- API surfaces for production: new apps/api/src/plugins/static-web.ts mounts the built SPA at CFP_WEB_DIST_PATH with SPA fallback and a JSON 404 envelope for /api/*; new GET /api/health/ready returns 503 until stores have loaded.
- Operational docs under docs/operations/: deploy.md (image anatomy, boot sequence, bucket provisioning), secrets.md (every runtime secret with generation and rotation), runbook.md ("API won't boot" playbook).

Test plan

Verification before push

- npm run type-check: clean across api / web / shared
- npm run lint: clean
- npm test: clean across api / web / shared
- npm run build: clean (apps/web/dist + apps/api/dist)
- helm lint deploy/charts/codeforphilly: clean for default, staging, and production values
- helm template ...: renders valid YAML for both environments

Installed via: npm install --workspace=apps/api @fastify/static

Used by the production runtime to serve the built apps/web/dist as a fallthrough for non-/api/* routes: one image, one process, per architecture.md's "single Docker image bundles API + static web" claim.

Generated by: asdf set helm 4.1.0

Helm is used by the deploy-staging.yml / deploy-production.yml workflows and for local `helm lint` / `helm template` validation against the chart under deploy/charts/codeforphilly.

Adds the boot-path surfaces the deploy plan needs in production:

- New plugin apps/api/src/plugins/static-web.ts mounts the built SPA at CFP_WEB_DIST_PATH and installs a notFoundHandler that returns the JSON envelope for unknown /api/* paths and serves index.html with no-cache for everything else (SPA fallback for React Router v7 routes). When CFP_WEB_DIST_PATH is unset (dev / tests) the plugin still installs the JSON-envelope 404 handler so the API contract is consistent.
- New env var CFP_WEB_DIST_PATH (optional): set in the production image to /app/apps/web/dist; unset in dev, where Vite owns 5173.
- New route GET /api/health/ready: readiness probe for k8s. Returns 200 only after the store + FTS decorators are present (which happens during plugin registration, before fastify.listen()). Returns 503 otherwise, so ingress never routes to a pod whose in-memory state hasn't loaded.
- Tests in apps/api/tests/deploy.test.ts cover the readiness payload, the SPA fallback / no-cache header, the /api/* JSON-404 envelope with and without the SPA bundled, and the boot-time failure when CFP_WEB_DIST_PATH points at a missing directory.

Per specs/architecture.md's "single Docker image bundles API + static web" claim and the deploy plan's readiness-probe + SPA-fallthrough requirements.
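The 404 and readiness rules described above can be modeled as two small pure functions. This is a minimal sketch, not the code in static-web.ts: the names resolveNotFound, readinessStatus, NotFoundAction, and BootState are hypothetical, the envelope shape { error: "not_found" } is an assumption, and so is the choice to answer every unmatched path with the JSON envelope when CFP_WEB_DIST_PATH is unset.

```typescript
// Hypothetical model of the decisions described above; none of these
// names or shapes are taken from the repository.
type NotFoundAction =
  | { kind: "json-404"; status: 404; body: { error: string } }
  | { kind: "spa-index"; status: 200; file: "index.html"; cacheControl: "no-cache" };

// Decide what an unmatched request should receive.
// webDistPath stands in for the value of CFP_WEB_DIST_PATH.
function resolveNotFound(path: string, webDistPath: string | undefined): NotFoundAction {
  const isApiPath = path === "/api" || path.startsWith("/api/");
  if (isApiPath || webDistPath === undefined) {
    // Assumed envelope shape; the real payload may differ.
    return { kind: "json-404", status: 404, body: { error: "not_found" } };
  }
  // SPA fallback: the client-side router resolves the path after index.html loads.
  return { kind: "spa-index", status: 200, file: "index.html", cacheControl: "no-cache" };
}

// Readiness is 200 only when both in-memory structures are present.
interface BootState {
  store?: unknown;
  fts?: unknown;
}

function readinessStatus(state: BootState): 200 | 503 {
  return state.store !== undefined && state.fts !== undefined ? 200 : 503;
}
```

Under these assumptions, resolveNotFound("/projects/foo", "/app/apps/web/dist") selects the index.html fallback, while any /api/* miss gets the JSON envelope regardless of whether the SPA is bundled.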

- Dockerfile: three stages (deps / build / runtime) on node:22.22-alpine. Final image is non-root (uid 1000), bundles git + ca-certificates + tini + openssh-client, and ships apps/api/dist plus apps/web/dist for the single-image SPA-co-served deploy.
- .dockerignore keeps secrets (.env, private-storage/, codeforphilly-data/) and dev-only artifacts (node_modules, dist, tests, plans, specs) out of the build context.
- deploy/docker/entrypoint.sh handles the working-tree-on-startup pattern from specs/architecture.md: clone CFP_DATA_REMOTE on first boot, fetch + reset --hard on subsequent boots, then exec node. Uses GIT_SSH_COMMAND rendered by Helm when a deploy key Secret is mounted.

Build:

    docker build -t cfp:dev .

Smoke test:

    docker run --rm -p 3001:3001 \
      -e CFP_DATA_REMOTE=https://github.com/CodeForPhilly/codeforphilly-data-snapshot.git \
      -e STORAGE_BACKEND=filesystem \
      -e CFP_PRIVATE_STORAGE_PATH=/app/private-storage \
      -e CFP_JWT_SIGNING_KEY=$(openssl rand -base64 48) cfp:dev
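The entrypoint's clone-or-refresh behavior can be illustrated as a function that plans the commands for a given boot. The real entrypoint is a shell script; this TypeScript model is only a sketch. The names planBootCommands and BootInputs, the branch default "main", and the node entry file apps/api/dist/index.js are assumptions not stated in the PR.

```typescript
// Hypothetical planner for the boot sequence; it only builds argument
// lists and never runs git.
type Argv = string[];

interface BootInputs {
  dataRemote: string;      // value of CFP_DATA_REMOTE
  repoPath: string;        // value of CFP_DATA_REPO_PATH
  hasWorkingTree: boolean; // true when repoPath already holds a clone
  branch?: string;         // assumption: the PR does not name the branch
}

function planBootCommands(inputs: BootInputs): Argv[] {
  const branch = inputs.branch ?? "main";
  const sync: Argv[] = inputs.hasWorkingTree
    ? [
        // Subsequent boots: refresh, then discard any local drift.
        ["git", "-C", inputs.repoPath, "fetch", "origin", branch],
        ["git", "-C", inputs.repoPath, "reset", "--hard", `origin/${branch}`],
      ]
    : [
        // First boot: the PVC is empty, so clone from the remote.
        ["git", "clone", "--branch", branch, inputs.dataRemote, inputs.repoPath],
      ];
  // Last step hands the process over to node (entry path is a placeholder).
  return [...sync, ["exec", "node", "apps/api/dist/index.js"]];
}
```

Because the refresh path ends in reset --hard, whatever sits on the PVC is overwritten by the remote's state on each boot, which matches the description of the PVC as an optimization rather than the source of truth.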

Minimal chart at deploy/charts/codeforphilly/ following the layout from the plan: Deployment / Service / Ingress / PVC (data) / PVC (private, staging only) / ConfigMap / ServiceAccount.

Architectural constraints baked in:

- replicas: 1 + strategy.type: Recreate, both hard requirements per specs/architecture.md (the in-process write mutex serializes mutations, and concurrent old/new pods would corrupt the gitsheets working tree).
- Liveness probe hits /api/health every 10s; readiness probe hits /api/health/ready every 5s, so ingress doesn't route traffic until both stores have loaded.
- Data-repo PVC mounted at CFP_DATA_REPO_PATH so the working tree survives pod restarts; the entrypoint refreshes from CFP_DATA_REMOTE on each boot anyway (the PVC is an optimization, not the source of truth).
- Secrets are never templated by the chart: values reference a caller-provided Secret (default name codeforphilly-secrets) via envFrom, and a separate Secret for the SSH deploy key.

values.staging.yaml: filesystem private store + PVC, points at the public scrubbed-snapshot data remote so staging never serves real PII until the cutover-prep plan wires it up.

values.production.yaml: S3 private store, real data remote with SSH deploy-key auth, larger resource budget, NODE_OPTIONS heap tuning.

`helm lint` clean against all three values files.
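To illustrate why the chart pins a single replica, here is a minimal sketch of an in-process write mutex built on a promise chain. It is not the repository's implementation; the class name WriteMutex and the method runExclusive are hypothetical. The point is that the lock lives in one Node process's memory, so a second pod would hold its own independent lock and could write to the same working tree concurrently.

```typescript
// Hypothetical promise-chain mutex: each task starts only after the
// previous one has settled. The lock is visible to one process only.
class WriteMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Swallow the outcome on the chain so one failed task
    // does not block the tasks queued behind it.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}

// Usage: two overlapping writes are serialized in call order.
async function demoSerializedWrites(): Promise<string[]> {
  const mutex = new WriteMutex();
  const order: string[] = [];
  const first = mutex.runExclusive(async () => {
    order.push("a:start");
    await Promise.resolve(); // yield, giving "b" a chance to interleave
    await Promise.resolve();
    order.push("a:end");
  });
  const second = mutex.runExclusive(async () => {
    order.push("b:start");
    order.push("b:end");
  });
  await Promise.all([first, second]);
  return order;
}
```

Within one process the second write waits for the first. Two pods running this code side by side would each serialize only their own writes, which is the failure mode that replicas: 1 with the Recreate strategy rules out.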

- deploy-staging.yml: on push to main, builds the image (tagged sha-<short> + staging-latest), pushes to GHCR, and runs helm upgrade --install against namespace codeforphilly-staging. Gated by GitHub Environment "staging": the first run requires manual approval, and secrets (KUBECONFIG_STAGING) are scoped per-environment.
- deploy-production.yml: on push of tags matching v*.*.*, same build + helm upgrade against namespace codeforphilly. Gated by Environment "production". Also exposes workflow_dispatch with a tag input for promoting an already-built image.

Both jobs use --atomic --wait --timeout 5m so a failed rollout auto-reverts. A post-deploy smoke check hits /api/health on the public ingress to catch ingress / cert misconfiguration before declaring the deploy successful.

Action versions checked against upstream READMEs:

- actions/checkout@v6 (matches existing ci.yml)
- docker/setup-buildx-action@v3
- docker/login-action@v3
- docker/build-push-action@v6
- azure/setup-kubectl@v4
- azure/setup-helm@v4
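The tagging and trigger rules above can be sketched as two small helpers. The workflows themselves are YAML; this TypeScript is only an illustration. The function names are hypothetical, the 7-character short SHA is an assumption (the PR writes only sha-<short>), and the regular expression is an approximation of how GitHub's v*.*.* tag filter behaves, where * matches any run of characters other than a slash.

```typescript
// Hypothetical helper: image tags pushed for a staging build.
// shortLength = 7 is an assumption; the PR does not state the length.
function stagingImageTags(commitSha: string, shortLength = 7): string[] {
  return [`sha-${commitSha.slice(0, shortLength)}`, "staging-latest"];
}

// Hypothetical helper: approximates the v*.*.* tag filter with a regex.
// Like the glob, it is loose: it requires a leading "v" and two dots,
// but does not require the segments to be numeric.
function isReleaseTag(tag: string): boolean {
  return /^v[^/]*\.[^/]*\.[^/]*$/.test(tag);
}

// Example with a placeholder SHA (not a real commit):
const tags = stagingImageTags("0123456789abcdef0123456789abcdef01234567");
// tags is ["sha-0123456", "staging-latest"]
```

Since the filter is this permissive, a tag such as v1.2.3-rc1 would also trigger the production workflow; the Environment approval gate is what stands between a stray tag and a deploy.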

Three new docs under docs/operations/, satisfying the deploy plan's "Operational docs" validation criterion:

- deploy.md: implementation companion to specs/architecture.md's Deploy section. Image anatomy, boot sequence, Helm install/upgrade commands, bucket-provisioning checklist (R2 / B2 / S3 / MinIO options, with versioning + lifecycle rules + IAM scoping), environment-variable reference table.
- secrets.md: inventory of every runtime secret with generation + rotation procedure: CFP_JWT_SIGNING_KEY, GITHUB_OAUTH_CLIENT_SECRET, S3_* keys, SAML key+cert, the data-repo SSH deploy key. Includes the bootstrap-a-new-environment recipe using sealed-secrets.
- runbook.md: "API won't boot" playbook with a log-grep table mapping common log lines to causes and fixes, plus a rollback procedure.

themightychris added 9 commits May 16, 2026 19:01

chore(plans): mark deploy in-progress

7681a88

chore(asdf): pin helm 4.1.0

676caf0

Generated by: asdf set helm 4.1.0

Helm is used by the deploy-staging.yml / deploy-production.yml workflows and for local `helm lint` / `helm template` validation against the chart under deploy/charts/codeforphilly.

docs(env): document CFP_WEB_DIST_PATH for production SPA serving

4d0ef28

themightychris mentioned this pull request May 16, 2026

deploy: stand up staging cluster + bucket and verify end-to-end #36

Open

7 tasks

chore(plans): mark deploy done (PR #35)

f0e8030

themightychris mentioned this pull request May 16, 2026

Wire startPushDaemon() in API boot so commits actually propagate to the data remote #37

Open

themightychris merged commit 387d06d into main May 16, 2026
1 check passed

themightychris deleted the feat/deploy branch May 16, 2026 23:46

themightychris merged 10 commits into main from feat/deploy
