percona-cd-platform

Percona's CI/CD platform: a GitOps-managed EKS cluster in us-east-1 hosting the Jenkins masters and the platform services around them (LGTM observability, Authentik SSO, ingress with TLS, autoscaling). Everything is defined as code and reconciled from this repo. There are no manual cluster changes.

Key facts

  • OpenTofu owns AWS up to "ArgoCD healthy": VPC, EKS, node groups, Pod Identity, the EC2 Jenkins masters, ARM spot fleets, S3, cleanup reapers. TF outputs reach ArgoCD as cluster-Secret annotations.
  • From there ArgoCD owns everything in-cluster: a root App-of-Apps fans out ApplicationSets that reconcile one Application per resources/addons/* dir and one per resources/jenkins/master/instances/* dir. No manual kubectl.
  • Jenkins masters serve on *.cd.percona.com in two modes: EKS-fronted EC2 (ALB, in-cluster NGINX, cross-region VPC peering, an EndpointSlice reconciler) or in-cluster StatefulSet. Hostnames resolve to the ALB (HTTPS only). A shell goes through SSM (runbook).
  • Repo CI is lint + validate only. ci-gate is the single required check, and the just ci recipe mirrors it locally.
  • The repo is public: no account IDs, ARNs, or secrets in committed files.
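The per-directory fan-out described above can be sketched in a few lines. This is an illustration only, not the repo's tooling: it assumes a directory-style generator that creates one Application per subdirectory and ignores plain files. The function name `expected_applications` and the generator labels `addons` and `jenkins-masters` are hypothetical; only the two glob paths come from the text.

```python
from pathlib import Path

# Glob patterns taken from the bullets above. The keys are hypothetical
# labels for the two ApplicationSets, not names used by the repo.
PATTERNS = {
    "addons": "resources/addons/*",
    "jenkins-masters": "resources/jenkins/master/instances/*",
}


def expected_applications(repo_root: Path) -> dict[str, list[str]]:
    """List the directories each generator would turn into an Application.

    Only directories count; files that sit next to them (a README, for
    example) are skipped, which is the assumed generator behaviour.
    """
    result: dict[str, list[str]] = {}
    for label, pattern in PATTERNS.items():
        result[label] = sorted(
            path.name for path in repo_root.glob(pattern) if path.is_dir()
        )
    return result
```

Calling `expected_applications(Path("."))` from a checkout would list the directory names that should each appear as one Application, which can be handy when comparing against what ArgoCD actually shows.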

Layout

| Path | What lives there |
| --- | --- |
| terraform/ | AWS substrate. Conventions in terraform/CLAUDE.md (file-naming grammar, per-team # Owner: banners, tags). Reusable modules carry their own READMEs (jenkins-arm-fleet, jenkins-arm-standalone, scheduled-lambda). Pins in versions.tf |
| argocd-bootstrap/ | Root Application, ApplicationSets, AppProject |
| resources/addons/ | One dir = one ArgoCD Application (observability, ingress, SSO, ...) |
| resources/jenkins/ | In-cluster master chart, per-instance values, clouds catalog (rendered by scripts/render-clouds.py, drift-gated in CI) |
| images/ | Container images (controller bundle and friends), built by GitHub Actions |
| scripts/ | Verification and render tooling. Catalog in scripts/README.md |
| docs/ | Architecture, ADRs, runbooks. Everything is indexed in docs/README.md |
| justfile | The single entrypoint for CI and every tofu operation |
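A drift gate like the one mentioned for the clouds catalog usually boils down to "render again, compare with what is committed, fail on any difference". The sketch below shows that idea only. It does not reproduce scripts/render-clouds.py or the repo's CI step; `drift` and `gate` are hypothetical helpers, and the rendered and committed content are passed in as plain strings.

```python
import difflib
import sys


def drift(rendered: str, committed: str, name: str = "clouds") -> list[str]:
    """Return unified-diff lines between committed and freshly rendered text.

    An empty list means the committed file is up to date.
    """
    return list(
        difflib.unified_diff(
            committed.splitlines(keepends=True),
            rendered.splitlines(keepends=True),
            fromfile=f"{name} (committed)",
            tofile=f"{name} (rendered)",
        )
    )


def gate(rendered: str, committed: str) -> int:
    """Return a process exit code: 0 when in sync, 1 when drift is found."""
    diff = drift(rendered, committed)
    if diff:
        # Print the diff so the failing CI log shows what to re-render.
        sys.stderr.writelines(diff)
        return 1
    return 0
```

Returning an exit code instead of raising keeps the helper easy to wire into a recipe or CI step, where a non-zero status is what fails the job.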

Quickstart

just ci                # local lint + validate (mirrors the PR gate)
just tf-plan           # TF plan (writes tfplan)
just tf-apply          # apply the saved tfplan, never auto-approve
just ssh               # list the running Jenkins masters (just ssh <inst> opens a shell)

AWS_PROFILE must be exported in your shell. AWS-touching recipes fail loudly without it. Back up state before risky applies (just tf-state-backup). State bucket bootstrap: runbook.
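The "fail loudly" behaviour can be pictured as a small guard that runs before any AWS call. This is a sketch of the pattern, not the justfile's actual check; `require_aws_profile` and `MissingProfileError` are hypothetical names.

```python
import os
from collections.abc import Mapping


class MissingProfileError(RuntimeError):
    """Raised when AWS_PROFILE is absent or empty."""


def require_aws_profile(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured profile name, or raise with a clear message.

    Only exported variables are visible through os.environ, which matches
    the requirement that AWS_PROFILE be exported rather than merely set.
    """
    profile = env.get("AWS_PROFILE", "").strip()
    if not profile:
        raise MissingProfileError(
            "AWS_PROFILE is not set; export it before running AWS-touching recipes"
        )
    return profile
```

Failing before the first AWS call avoids running against whatever default credentials happen to be present on the machine.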

Tool requirements

| Tool | Used for |
| --- | --- |
| just | The single entrypoint. Every workflow below is a recipe |
| OpenTofu (tofu) | All terraform operations (version pin at the top of the justfile) |
| AWS CLI v2 | Every AWS-touching recipe. SSO login via aws sso login |
| session-manager-plugin | Interactive just ssh <inst> sessions (one-shot just ssm-run works without it) |
| kubectl | Cluster access (just kubeconfig), ps3 shell |
| uv | Python script gates and lambda tests inside just ci |
| Docker (buildx) | just build-image only |
| trivy, yamllint, actionlint, zizmor, kubeconform | The just ci lint set. Version pins sit at the top of the justfile (helm is fetched and sha-verified automatically) |
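Checksum verification of a fetched binary, as noted for helm, generally means hashing the download and comparing it with a pinned digest before use. The sketch below assumes SHA-256 and a digest pinned as a hex string; the justfile's actual mechanism is not shown in this README, and `sha256_of` and `verify_download` are hypothetical names.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> None:
    """Raise ValueError unless the file matches the pinned digest."""
    actual = sha256_of(path)
    # Normalise case and whitespace in the pin before comparing.
    if not hmac.compare_digest(actual, expected_hex.strip().lower()):
        raise ValueError(
            f"checksum mismatch for {path.name}: expected {expected_hex}, got {actual}"
        )
```

Pinning the digest next to the version pin means a version bump and its checksum change in the same diff, which keeps review straightforward.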

Where the details are

| Topic | Doc |
| --- | --- |
| System architecture and components | docs/architecture.md |
| Compute tiers, MNG vs Karpenter reasoning | ADR 0017 |
| Observability push pipeline | docs/observability.md |
| EC2 master connectivity and resilience | docs/connectivity.md, docs/ec2-master-resilience.md |
| Shell access to the masters (just ssh, SSM) | docs/runbooks/master-shell-access.md |
| Account cleanup reapers | docs/runbooks/cleanup-reapers.md |
| Bootstrap, recovery, upgrades | docs/runbooks/ |
| Every past design decision | docs/adr/ |

Contributing

License

GNU Affero General Public License v3.0, see LICENSE.
