This guide covers everything you need to know to contribute to and develop the Buildkite Agent Stack for Kubernetes controller codebase.
Note that our development approach emphasizes testability, reliability, and maintainability. Contributors should focus on writing clean, well-tested code that follows established Go patterns and practices.
When contributing:
- Create feature branches from main.
- Add appropriate tests for new functionality.
- Ensure all tests pass locally before submitting pull requests.
- Follow Go code style conventions.
- Include detailed commit messages explaining changes.
The integration test suite is crucial to our development workflow - it verifies that changes maintain compatibility with both Buildkite and Kubernetes APIs.
To start developing for the Buildkite Agent Stack for Kubernetes controller, you'll need to install dependencies with Homebrew via:
brew bundle
Run tasks via just:
just --list
The Buildkite Agent Stack for Kubernetes controller integration tests depend on a running Buildkite instance. By default, they use the production version of Buildkite.
flowchart LR
c((Controller)) -->|create jobs| K
Buildkite <-->|Pull jobs| c
subgraph K8s cluster
K(Kube API)
end
During a test run, each integration test generally performs these steps:
- Creates ephemeral pipelines and queues for a given Buildkite Agent Cluster.
- Runs the controller, which will monitor jobs from the (just-created) queue in the Buildkite Cluster and start new Jobs in the Kubernetes cluster.
- Starts a build of the pipeline on Buildkite, which causes Buildkite jobs to become available.
- Polls Buildkite while waiting for the expected outcome, which may include build success, build failure, and the presence or absence of certain log messages.
- Cleans up those ephemeral objects (pipelines and queues).
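The polling step above can be pictured as a bounded retry loop. The following is only a minimal shell sketch of that shape, not the test suite's actual implementation (the integration tests themselves are written in Go); the function name poll_until and its arguments are hypothetical.

```shell
# Hypothetical helper: run a check command repeatedly until it succeeds
# or the attempt budget is exhausted.
# Usage: poll_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
poll_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Stop as soon as the check reports the expected outcome.
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # The expected outcome never appeared within the budget.
  return 1
}
```

For example, poll_until 30 10 some-check would run a (hypothetical) some-check command up to 30 times, pausing 10 seconds after each failed attempt.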
Any Buildkite user who has access to a Kubernetes cluster should be able to run our integration tests.
To get the integration tests running locally, you will need:
- A valid Buildkite API token with GraphQL enabled. (This is only used by the integration tests.)
- A valid Buildkite Agent Token in your target Buildkite Cluster.
- Depending on test cases, you may also need SSH keys - see below.
- Your shell environment will need CLI write access to a Kubernetes cluster such as the one provided by https://orbstack.dev/.
It's generally convenient to supply the API token as an environment variable. This can be done using an .envrc file loaded by direnv.
export BUILDKITE_TOKEN="bkua_**************"
Then check your k8s permissions by running:
just check-k8s-api-access
Lastly, provide the agent token. The Buildkite Agent token is used by the controller and by the Kubernetes jobs:
kubectl create secret generic buildkite-agent-token --from-literal=BUILDKITE_AGENT_TOKEN=$YOUR_CLUSTER_AGENT_TOKEN
To run integration tests locally, we recommend running individual tests via -run. For example:
just test -v -run TestWalkingSkeleton
The -v flag ensures that logs are visible.
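Before running a test, it can help to confirm that the setup described above is in place. The following is a minimal sketch, assuming the API token is exported as BUILDKITE_TOKEN as shown earlier; the preflight function is a hypothetical helper and is not one of the repository's just tasks.

```shell
# Hypothetical preflight check: verifies that BUILDKITE_TOKEN is set and that
# each tool named as an argument is available on PATH.
# Prints what is missing to stderr and returns non-zero if anything is absent.
preflight() {
  missing=0
  if [ -z "${BUILDKITE_TOKEN:-}" ]; then
    echo "BUILDKITE_TOKEN is not set" >&2
    missing=1
  fi
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool not found on PATH" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling preflight kubectl just before just test reports anything that is missing.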
To run all integration tests, with the overrides from your environment, you can use the following command:
just test -v ./internal/integration/... -args --buildkite-token $BUILDKITE_TOKEN
NOTE: Various integration tests have special requirements, such as extra secrets (for example, an SSH key). To avoid unnecessary complexity, we recommend running individual tests locally on demand.
To run the controller locally, first follow the local setup guide in the integration testing section above. Then run:
just run
Or, if you want the local controller to poll jobs from a particular queue:
just run --tags 'queue=some-queue'
Running all the unit tests locally is done as follows:
go test -v -cover `go list ./... | grep -v internal/integration`
Required Buildkite API token scopes:
- read_clusters
- read_artifacts
- read_builds
- read_build_logs
- write_pipelines
- write_clusters
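If a test fails with a permissions error, comparing the scopes granted to your token against the list above is a quick first check. The sketch below assumes you already have the granted scopes as a space-separated string (for example, copied from wherever your token's scopes are listed); missing_scopes is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical helper: print each required scope that is absent from the
# granted set, one per line. Prints nothing if all required scopes are granted.
# Usage: missing_scopes "GRANTED SCOPES (space-separated)" REQUIRED...
missing_scopes() {
  # Pad with spaces so each scope can be matched as a whole word.
  granted=" $1 "
  shift
  for scope in "$@"; do
    case "$granted" in
      *" $scope "*) ;;                 # scope is granted
      *) printf '%s\n' "$scope" ;;     # scope is missing
    esac
  done
}
```

For example, passing the six scopes listed above as the required arguments prints whichever of them your token lacks.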
You'll need to create an SSH secret in your cluster to run this test pipeline. This SSH key needs to be associated with your GitHub account to be able to clone this public repo, and must be in a form acceptable to OpenSSH (aka BEGIN OPENSSH PRIVATE KEY, not BEGIN PRIVATE KEY).
kubectl create secret generic integration-test-ssh-key --from-file=SSH_PRIVATE_RSA_KEY=$HOME/.ssh/id_github
The integration tests on the kubernetes-agent-stack pipeline will create additional pipelines in the buildkite-kubernetes-stack organization.
If the Buildkite agent token is allowing jobs to be picked up, and each job repeatedly fails with an HTTP 422 error, the most likely cause is that the stored agent token is invalid. To confirm this, check that the token value is stored as expected:
kubectl get secret buildkite-agent-token -o jsonpath='{.data.BUILDKITE_AGENT_TOKEN}' \
| base64 -d \
| xxd
Shells behave differently, so if a newline is being added to the value before it is encoded, the following may help:
echo -n ${BUILDKITE_AGENT_TOKEN} | base64
kubectl edit secret buildkite-agent-token
The edit secret command opens $EDITOR with the spec of the secret. The output from the previous command can be copied into the spec as the new value for the secret.
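A stray trailing newline is easy to miss in xxd output. As a rough sketch, the check can be scripted: has_trailing_newline is a hypothetical helper that takes the base64-encoded value (as returned by the jsonpath query above) and succeeds only if the decoded value ends in a newline byte (0x0a). It assumes a base64 that accepts -d, as the commands above do.

```shell
# Hypothetical helper: succeeds (exit 0) if the base64-encoded argument
# decodes to data whose final byte is a newline (0x0a).
has_trailing_newline() {
  # Decode, keep only the last byte, and render it as two hex digits.
  last_byte=$(printf '%s' "$1" | base64 -d | tail -c 1 | od -An -tx1 | tr -d ' \n')
  [ "$last_byte" = "0a" ]
}
```

For example, has_trailing_newline "$(kubectl get secret buildkite-agent-token -o jsonpath='{.data.BUILDKITE_AGENT_TOKEN}')" && echo "token ends with a newline" flags a stored token that needs to be re-encoded.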
In general, for successful tests, pipelines will be deleted automatically. However, for unsuccessful tests, they will remain after the end of the test job to allow you to debug them.
To clean them up, run:
just cleanup-orphans
For this to work, you will need a Buildkite API token with GraphQL enabled and the following REST API scopes also enabled:
- read_artifacts
- write_pipelines
This is usually enough, but there is another situation where the cluster could be clogged with Kubernetes jobs. To clean these out, you should run the following in a Kubernetes context in the namespace containing the controller used to run the CI pipeline.
kubectl get -o jsonpath='{.items[*].metadata.name}' jobs | xargs -L1 kubectl delete job
At the time of writing, the CI pipeline runs in an EKS cluster, agent-stack-k8s-ci, in the buildkite-dist AWS account.
CI deploys the controller into the buildkite namespace in that cluster.
just deploy will build the container image using ko and
deploy it with Helm.
You'll need to have set KO_DOCKER_REPO to a repository you have push access
to. For development, something like the kind local
registry or the minikube
registry can be used. More
information is available at ko's
website.
You'll also need to provide the required configuration values to Helm, which can be done by passing extra args to just:
just deploy --values config.yaml
With config.yaml being a file containing required Helm values, such as:
agentToken: "abcdef"
graphqlToken: "12345"
The config key contains configuration passed directly to the binary, and so supports all the keys documented in the example.
- Make sure you're on the main branch!
- Create a tag:
  git tag -sm v0.x.x v0.x.x
- Push your tag:
  git push --tags
- A build will start at https://buildkite.com/buildkite-kubernetes-stack/kubernetes-agent-stack/builds?branch=v0.x.x. It will create a draft release with a changelog. Edit the changelog to group the PRs into sections like:
  # Added
  # Fixed
  # Changed
  # Security
  # Internal
- Publish the release 🎉
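Since pushing the tag is what starts the release build, a quick format check before tagging can catch typos. This is a sketch only: is_release_tag is a hypothetical helper, and the vMAJOR.MINOR.PATCH pattern is an assumption inferred from the v0.x.x placeholder above, so adjust it if the project uses other tag forms.

```shell
# Hypothetical helper: succeeds only if the argument looks like vMAJOR.MINOR.PATCH
# with purely numeric components (an assumed convention, not a project rule).
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}
```

It could be used as a guard, for example is_release_tag "$TAG" && git tag -sm "$TAG" "$TAG", where TAG is a variable you set yourself.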