Context
ADR-0006 established safe-settings on Cloud Run as the source of truth for GitHub configuration. The initial deployment was manual: pull the image from GHCR, tag it for Artifact Registry, push it, and run gcloud run services update from a terminal.
This manual process had several problems:
- Not reproducible: the service state depended on who ran which command and when. There was no declarative state.
- No version tracking: the image on Cloud Run could be any version. There was no source of truth for which version was running.
- No update pipeline: updating the image required ~5 manual commands (pull, tag, push, update service, verify). A different operator could forget a step.
- Manual secrets: the secrets (APP_ID, PRIVATE_KEY, WEBHOOK_SECRET) were managed ad-hoc in GCP Secret Manager. No Terraform declared them.
Decision
We manage the full safe-settings deployment on Cloud Run with Terraform, in the iac-platform repo, using an automated pattern that mirrors the image from GHCR to Artifact Registry.
Architecture
iac-platform/safe-settings/
├── main.tf ← Cloud Run service, AR repo, secrets, IAM
├── mirror.tf ← Image mirroring workflow (GHCR → AR)
├── IMAGE_TAG ← Source of truth for the version (e.g., "2.1.20-rc.3")
├── variables.tf
├── outputs.tf
├── providers.tf
├── backend.tf
├── versions.tf
└── tfvars/
    └── prd.tfvars ← deploy_service=true, project=eigenoid-prd
2-phase pattern
Phase 1 — Infrastructure (always active):
- Artifact Registry (Docker repo)
- Secret Manager (3 secrets: app-id, private-key, webhook-secret)
- Service Account for Cloud Run
- IAM bindings
Phase 2 — Service (activated by deploy_service = true):
- Cloud Run service with the AR image
- Environment variables (GH_ORG, ADMIN_REPO, LOG_LEVEL, CRON)
- Secret references (APP_ID, PRIVATE_KEY, WEBHOOK_SECRET via Secret Manager)
The phase separation allows creating the infrastructure first (Phase 1), populating secrets manually, and then activating the service (Phase 2) without a chicken-and-egg problem.
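The gating described above can be illustrated with a small Python model. This is a sketch, not the repository's Terraform: the resource labels and the function name planned_resources are hypothetical, chosen only to mirror the two lists above. In Terraform such a switch is commonly written as a conditional count on the Phase 2 resources, but the source does not show how main.tf implements it.

```python
# Toy model of the two-phase pattern.
# Labels are illustrative only; they are not real Terraform resource addresses.
PHASE_1 = [
    "artifact_registry_repo",
    "secret:app-id",
    "secret:private-key",
    "secret:webhook-secret",
    "service_account",
    "iam_bindings",
]
PHASE_2 = ["cloud_run_service"]


def planned_resources(deploy_service: bool) -> list[str]:
    """Return the resources that would exist for a given deploy_service value."""
    # Phase 1 is always present; Phase 2 is added only when the flag is true.
    return PHASE_1 + (PHASE_2 if deploy_service else [])


if __name__ == "__main__":
    print(planned_resources(False))  # infrastructure only, secrets can be populated
    print(planned_resources(True))   # infrastructure plus the Cloud Run service
```

With deploy_service set to false, the secrets exist and can be populated before anything tries to read them, which is what removes the chicken-and-egg problem.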
IMAGE_TAG as source of truth
The IMAGE_TAG file contains the exact safe-settings version:
2.1.20-rc.3
To update the version:
- Edit IMAGE_TAG with the new version.
- Open a PR → Terraform plan shows the image change.
- Merge → mirroring workflow copies the image from GHCR to AR.
- Terraform apply updates the Cloud Run revision.
Image mirroring
Cloud Run cannot pull images directly from GHCR, so the image is mirrored to Artifact Registry first. The mirroring workflow:
- Reads IMAGE_TAG from the repo.
- Pulls the image from ghcr.io/github/safe-settings:{tag} (using digest to force amd64).
- Tags and pushes to europe-west1-docker.pkg.dev/eigenoid-prd/safe-settings-docker/safe-settings:{tag}.
- Terraform references the AR image.
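The mirroring steps above can be sketched as a dry-run command plan in Python. This is an illustration under stated assumptions, not the contents of mirror.tf: the function names read_image_tag and mirror_commands are hypothetical, and the sketch selects amd64 with docker pull --platform rather than resolving a digest as the real workflow does. Authentication to both registries is assumed to be configured already.

```python
import re
from pathlib import Path

GHCR_REPO = "ghcr.io/github/safe-settings"
AR_REPO = "europe-west1-docker.pkg.dev/eigenoid-prd/safe-settings-docker/safe-settings"

# Docker tag grammar: 1-128 chars of [A-Za-z0-9_.-], not starting with '.' or '-'.
TAG_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def read_image_tag(path: Path) -> str:
    """Read and validate the version stored in an IMAGE_TAG file."""
    tag = path.read_text(encoding="utf-8").strip()
    if not TAG_RE.fullmatch(tag):
        raise ValueError(f"invalid image tag: {tag!r}")
    return tag


def mirror_commands(tag: str) -> list[list[str]]:
    """Build the docker commands that copy one tag from GHCR to Artifact Registry."""
    src = f"{GHCR_REPO}:{tag}"
    dst = f"{AR_REPO}:{tag}"
    return [
        # Assumption: --platform stands in for the digest pinning used by the real workflow.
        ["docker", "pull", "--platform", "linux/amd64", src],
        ["docker", "tag", src, dst],
        ["docker", "push", dst],
    ]


if __name__ == "__main__":
    # Dry run: print the plan for the tag quoted in this ADR without executing it.
    for cmd in mirror_commands("2.1.20-rc.3"):
        print(" ".join(cmd))
```

To execute the plan, each command list could be passed to subprocess.run(cmd, check=True) so that a failed pull or push stops the run before Terraform references a missing image.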
Current service data
| Field | Value |
|---|---|
| GCP Project | eigenoid-prd |
| Region | europe-west1 |
| Service name | eigenoid-safe-settings |
| Port | 3000 |
| Min instances | 1 (GitHub webhook timeout is 10s) |
| Max instances | 3 |
| IMAGE_TAG | 2.1.20-rc.3 |
| CRON | 0 0 */6 * * * (sync every 6h) |
Consequences
- Fully declarative state: all safe-settings infrastructure is Terraform. terraform plan shows exactly what would change.
- Trivial rollback: revert IMAGE_TAG to a previous version, merge, and Terraform deploys the old version. Cloud Run keeps the latest revisions accessible.
- Integrated secret management: secrets are declared in Terraform (structure), populated once manually (value), and referenced by the service automatically. Rotation is: new version in Secret Manager → Terraform apply → new revision.
- Audit trail: every version or config change is a PR with a visible plan. No ad-hoc gcloud run services update.
- Mirroring complexity: the image must be copied from GHCR to AR. An extra step compared to direct deployment. Mitigated by: automated workflow.
- Production only: safe-settings has no staging environments. It runs only in eigenoid-prd with deploy_service = true. This is intentional: safe-settings manages the GitHub org, which is a single entity.
- invoker_iam_disabled: the GCP org policy blocks allUsers IAM binding on Cloud Run. safe-settings validates webhooks via HMAC (WEBHOOK_SECRET), so IAM invocation is not required.
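The HMAC check that makes IAM invocation unnecessary follows the scheme GitHub documents for webhook deliveries: the X-Hub-Signature-256 header carries "sha256=" followed by the hex HMAC-SHA256 of the raw request body, keyed with the webhook secret. The Python sketch below illustrates that scheme only. It is not the code safe-settings runs, and the function name signature_is_valid is hypothetical.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub X-Hub-Signature-256 header against the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = f"sha256={digest}"
    # compare_digest runs in constant time, so the comparison does not leak
    # how many leading characters of the signature matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.encode("utf-8")
    )


if __name__ == "__main__":
    # Example values are made up; a real secret comes from Secret Manager.
    secret = "example-webhook-secret"
    body = b'{"action":"edited"}'
    header = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    print(signature_is_valid(secret, body, header))         # matching body
    print(signature_is_valid(secret, body + b" ", header))  # tampered body
```

Because a request without a valid signature is rejected by the application itself, the endpoint can accept unauthenticated HTTP traffic without exposing the sync logic to arbitrary callers.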
Alternatives considered
- Manual deploy (status quo): works for a single service but does not scale. No declarative state, no audit trail, no trivial rollback.
- Cloud Run source deploy: Google Cloud Buildpacks. Requires source code, not a pre-built image. safe-settings is distributed as an image, not as code.
- GKE / Cloud Run Jobs: overkill for a stateless service that receives webhooks. Cloud Run with min-instances=1 is sufficient and costs ~$5/month.
- Self-hosted runner with Docker Compose: possible but adds a server to maintain. Cloud Run is serverless.
References
- ADR-0006 — safe-settings for GitHub governance — decision to adopt safe-settings
- ADR-0008 — Shared IaC model with Terraform — producer/consumer model under which iac-platform operates
- Runbook: safe-settings operations — operational procedures
- Settings Bot — GitHub App technical sheet
- GitHub governance — Operational guide — full operational guide
- eigenoid/iac-platform — Terraform repo
- github/safe-settings — upstream project