Context

ADR-0008 established Terraform as the tool for all shared infrastructure, including GitHub configuration (org settings, branch protection, repo defaults). However, as implementation progressed, GitHub-specific friction points emerged:

  • No real-time drift detection: Terraform only detects drift when terraform plan runs, typically in CI or manually. A manual change to a repo setting or ruleset goes unnoticed until the next plan. For GitHub, where any admin can change a setting with a single click, this leaves a significant drift window.
  • Slow feedback loop: changing a label or a topic requires editing HCL, opening a PR, waiting for CI, merging, and waiting for the apply. For GitHub configuration that is frequently iterative, this is unnecessarily heavyweight.
  • Org-level rulesets: the Terraform GitHub provider has limited support for organization rulesets, whereas safe-settings supports them natively in YAML.
  • Natural hierarchy: safe-settings has an inheritance model (org defaults → suborgs → repos) that maps directly to how we think about configuration. Replicating that hierarchy in Terraform requires complex modules and variables.
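
To make the inheritance model concrete, here is a minimal Python sketch of how three layers (org defaults → suborg → repo) could resolve into one effective configuration. It is an illustration only: the function names, the setting keys, and the deep-merge rule (later layers override earlier ones, nested mappings merged key by key) are assumptions, and the actual merge semantics of safe-settings may differ.

```python
from copy import deepcopy


def _merge_into(target, source):
    """Recursively copy `source` into `target`; values in `source` win."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge_into(target[key], value)
        else:
            # deepcopy so the input layers are never mutated through the result
            target[key] = deepcopy(value)


def merge_layers(*layers):
    """Merge config layers left to right; later layers override earlier ones."""
    result = {}
    for layer in layers:
        _merge_into(result, layer)
    return result


# Hypothetical layers, named after the hierarchy in the ADR.
org_defaults = {
    "repository": {"private": True, "has_wiki": False, "allow_squash_merge": True}
}
suborg = {"repository": {"has_wiki": True}}
repo = {"repository": {"allow_squash_merge": False}}

effective = merge_layers(org_defaults, suborg, repo)
print(effective)
# {'repository': {'private': True, 'has_wiki': True, 'allow_squash_merge': False}}
```

Each layer only states what differs from its parent, which is the property that would otherwise have to be rebuilt in Terraform with modules and variables.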

The GCP side of ADR-0008 does not have these problems: GCP resources do not change from an accidental click, the feedback loop is acceptable, and Terraform has full support for GCP.

Decision

We adopt github/safe-settings deployed on Cloud Run as the exclusive source of truth for all configuration of the eigenoid GitHub org. The scope of ADR-0008 is refined to cover only GCP infrastructure.

Implementation details:

  • Admin repo: eigenoid/platform-settings — contains all YAML config
  • Runtime: Cloud Run in europe-west1 (project eigenoid-prd), min-instances=1
  • GitHub App: "Eigenoid (Settings Bot)" (slug: eigenoid-settings-bot, App ID: 3424955)
  • Drift prevention: webhooks for rulesets and branch protection (real-time) + CRON sync every 6 hours (for everything else)
  • Image: ghcr.io/github/safe-settings mirrored to Artifact Registry (Cloud Run cannot pull from GHCR)
  • Secrets: GCP Secret Manager (APP_ID, PRIVATE_KEY, WEBHOOK_SECRET)
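
The two-tier drift prevention above (webhooks for some settings, a 6-hour CRON sync for the rest) implies a different worst-case drift window per kind of setting. The Python sketch below works through that arithmetic. The 6-hour interval comes from this ADR; the 10-second webhook latency, the setting labels, and the fixed schedule anchor are assumptions made for illustration.

```python
from datetime import datetime, timedelta

CRON_INTERVAL = timedelta(hours=6)        # from the ADR
WEBHOOK_LATENCY = timedelta(seconds=10)   # assumption: "within seconds"
# Hypothetical labels for the settings that are corrected via webhooks.
REALTIME_KINDS = frozenset({"ruleset", "branch_protection"})


def max_drift_window(kind: str) -> timedelta:
    """Worst-case time a manual change of this kind can stay in place."""
    return WEBHOOK_LATENCY if kind in REALTIME_KINDS else CRON_INTERVAL


def next_cron_run(changed_at: datetime, first_run: datetime) -> datetime:
    """First scheduled sync at or after `changed_at`."""
    elapsed = changed_at - first_run
    intervals = -(-elapsed // CRON_INTERVAL)  # ceiling division on timedeltas
    return first_run + intervals * CRON_INTERVAL


first_run = datetime(2025, 1, 1, 0, 0)
label_edited_at = datetime(2025, 1, 1, 1, 30)

corrected_at = next_cron_run(label_edited_at, first_run)
drift = corrected_at - label_edited_at

print(max_drift_window("ruleset"))  # 0:00:10
print(max_drift_window("label"))    # 6:00:00
print(corrected_at, drift)          # 2025-01-01 06:00:00 4:30:00
```

In this example a label edited by hand at 01:30 stays drifted for 4.5 hours, while a ruleset change is bounded by webhook latency alone.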

What safe-settings covers

  • Repo settings (private, merge strategy, wiki, projects, vulnerability alerts)
  • Labels (org-wide + per repo)
  • Teams and team → repo permissions
  • Org-level rulesets (modern branch protection)
  • Suborgs (shared config per group of repos)

What safe-settings does NOT cover (remains in Terraform / manual)

  • GCP resources (IAM, WIF, Artifact Registry, Secret Manager) → ADR-0008
  • GitHub Actions workflows → live in each repo
  • GitHub App creation and configuration → manual (one-shot)
  • Repo secrets → live in each repo or in Terraform
  • Org billing and plan → manual

Consequences

  • Real drift prevention for GitHub: manual changes to rulesets and branch protection are reverted within seconds. Other settings are corrected on the next CRON run, at most 6 hours later.
  • Fast feedback loop: changing a label or a setting means editing YAML and pushing. safe-settings applies in ~10 seconds.
  • Natural hierarchical config: org defaults → suborgs → repos, without HCL boilerplate.
  • Dry-run on PRs: safe-settings runs a dry-run check when a PR is opened against the admin repo, showing what would change before merging.
  • Low operational cost: ~$5-10/month on Cloud Run.
  • New dependency: safe-settings container + GitHub App to maintain. Image mirroring is manual (for now).
  • Partial drift coverage: not all settings trigger real-time webhooks. The 6h CRON is the safety net.
  • Admin repo excluded from rulesets: platform-settings needs direct pushes to update config, so it is excluded from org-level rulesets to prevent lockout.
  • ADR-0008 scope refined: Terraform no longer manages GitHub configuration. The boundary is clear: GCP → Terraform, GitHub → safe-settings.
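
The dry-run consequence listed above can be pictured as a diff between the live settings and the configuration a PR proposes. The Python sketch below shows the idea only; it is not how safe-settings implements its check. The function name, the setting keys, and the rule that keys absent from the desired config are treated as unmanaged are all assumptions.

```python
def plan_changes(current, desired, prefix=""):
    """List (path, old, new) for every managed setting that would change.

    Only keys present in `desired` are compared; anything else is treated
    as unmanaged and left alone (an assumption of this sketch).
    """
    changes = []
    for key in sorted(desired):
        path = f"{prefix}{key}"
        old, new = current.get(key), desired[key]
        if isinstance(old, dict) and isinstance(new, dict):
            changes.extend(plan_changes(old, new, prefix=f"{path}."))
        elif old != new:
            changes.append((path, old, new))
    return changes


# Hypothetical live state and proposed config for one repo.
current = {
    "repository": {
        "has_wiki": True,
        "allow_squash_merge": True,
        "description": "demo",
    }
}
desired = {"repository": {"has_wiki": False, "allow_squash_merge": True}}

for path, old, new in plan_changes(current, desired):
    print(f"{path}: {old} -> {new}")
# repository.has_wiki: True -> False
```

An empty result would mean the PR changes nothing for that repo, which is the signal a reviewer looks for before merging.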

Alternatives considered

  • Terraform GitHub provider (ADR-0008 status quo): full support for repos and teams, but limited support for org-level rulesets. No real-time drift detection. Slow feedback loop for frequent changes. Terraform remains a good fit for GCP, but is not ideal for GitHub.
  • probot/settings: predecessor of safe-settings, without rulesets, drift prevention, or suborgs. Previously installed, uninstalled in favor of safe-settings.
  • Manual click-ops: simple, but does not scale and is neither auditable nor reproducible. Unacceptable for an org with multiple repos and contributors.
  • GitHub Actions workflow with gh api: imperative, stateless, no drift detection. Fine for one-shot scripts, poor for declarative configuration.

References