Cleanup documentation
Signed-off-by: Rick Lane <[email protected]>
rick-a-lane-ii committed Jan 31, 2024
1 parent bcd3035 commit 9ad7dac
Showing 8 changed files with 88 additions and 184 deletions.
17 changes: 10 additions & 7 deletions README.md
@@ -1,9 +1,12 @@
# Infra-actions documentation
# Infra-actions

## GitHub
- [Using the GitHub actions matrix strategy](./docs/GITHUB_ACTIONS.md)
- [Self-hosted GitHub action runners](./github-runner-provisioner/README.md)
## Clusters
- [Cluster provisioning with custom manifests](./setup-cluster/README.md)
## Dev loop
- [DEVELOPING.md](docs/DEVELOPING.md)

- [GitHub Actions for Test Matrices](docs/GITHUB_ACTIONS.md)
- [Custom GitHub action runners](docs/ACTION_RUNNERS.md)
- [Self-hosted GitHub action runners](github-runner-provisioner/README.md)

## Development

- [Working with GitHub workflows and actions](docs/DEVELOPING.md)
- [Provision Cluster GitHub Action](provision-cluster/README.md)
35 changes: 14 additions & 21 deletions docs/ACTION_RUNNERS.md
@@ -1,21 +1,20 @@
# Custom GitHub action runners

There are self-hosted Mac M1 and Ubuntu ARM64 runners available for GitHub actions. These runners are EC2 instances
hosted in AWS.
There are self-hosted Mac M1 and Ubuntu ARM64 runners available for GitHub actions. These runners are EC2 instances hosted in AWS.

In the future, we may make additional runners available depending on the needs of the different teams.

## Repository configuration

Before a job can use a self-hosted runner, the following settings need to be configured in the GitHub repository:

1. Add the `d6e-automaton` account as a repo administrator (`Repo ⇾ Settings ⇾ Collaborators and teams`)
2. Add a webhook (`Repo ⇾ Settings ⇾ Webhooks`) with the following settings:
1. Payload URL: `https://sw.bakerstreet.io/github-runner-provisioner/`
2. Content type: `application/x-www-form-urlencoded`
3. Secret: Enter the value found in `/keybase/team/datawireio/secrets/github-actions/github-infra-actions`
4. SSL verification: `Enable`
5. Which events trigger the webhook? `Let me select individual events` `Workflow jobs`
5. Which events trigger the webhook? `Let me select individual events`, then `Workflow jobs`

Once the webhook is configured, you can use the runners as described below.
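
When a webhook secret is configured, GitHub signs each delivery and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex HMAC-SHA256 of the raw request body>`. The following is a minimal Go sketch of how a receiver could check that header. The function names `validSignature` and `sign` are hypothetical, and this is not taken from the provisioner's source; it only illustrates the check that the secret makes possible.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// validSignature reports whether header (the X-Hub-Signature-256 value)
// equals the HMAC-SHA256 of the raw request body keyed with secret.
// Hypothetical helper, for illustration only.
func validSignature(secret, body []byte, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	// hmac.Equal compares in constant time.
	return hmac.Equal(got, mac.Sum(nil))
}

// sign builds the header value a sender would attach; used here only to
// produce demo input.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

func main() {
	secret := []byte("FAKE_TOKEN")
	// With the form-urlencoded content type, the body is a form whose
	// "payload" field carries the JSON event.
	body := []byte("payload=%7B%22action%22%3A%22queued%22%7D")
	header := sign(secret, body)

	fmt.Println(validSignature(secret, body, header))
	fmt.Println(validSignature([]byte("wrong-secret"), body, header))
}
```

The signature is computed over the raw body, so a receiver would need to verify it before decoding the form.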

@@ -24,39 +23,33 @@ Once the webhook is configured, you can use the runners as described below.
There are self-hosted Mac M1 (ARM64) runners that can be used in a workflow by using `runs-on: macOS-arm64`.

```yaml
...
jobs:
my_job:
runs-on: macOS-arm64
steps:
# The provision-cluster action will automatically register a cleanup hook to remove the
# cluster it provisions when the job is done.
- uses: actions/checkout@v3
...
my_job:
runs-on: macOS-arm64
steps:
# The provision-cluster action will automatically register a cleanup hook to remove the
# cluster it provisions when the job is done.
- uses: actions/checkout@v3
```
The following limitations apply to Mac M1 runners:
- It can take between 30 minutes and 3 hours for a runner to become available after a job requests it.
- There is a limit of 10 active Mac M1 runners. Any build that requests a Mac M1 runner while all 10 are in use will
stay in a queued state until a runner is available. If a job is queued for more than 24 hours, it will be marked as failed.
- Once a Mac M1 runner is created, it will continue to run for up to 24 hours, picking up one or more jobs. This means
that jobs are responsible for ensuring that runners are in a clean state before they are used.
- There is a limit of 10 active Mac M1 runners. Any build that requests a Mac M1 runner while all 10 are in use will stay in a queued state until a runner is available. If a job is queued for more than 24 hours, it will be marked as failed.
- Once a Mac M1 runner is created, it will continue to run for up to 24 hours, picking up one or more jobs. This means that jobs are responsible for ensuring that runners are in a clean state before they are used.
## Ubuntu ARM64 runners
These self-hosted runners are created on-demand. It takes about a minute for the runner to be available, and once the
job finishes, they are destroyed.
These self-hosted runners are created on-demand. It takes about a minute for the runner to be available, and once the job finishes, they are destroyed.
To request one, use label `ubuntu-arm64`:

```yaml
...
jobs:
  my_job:
    runs-on: ubuntu-arm64
    steps:
      # The provision-cluster action will automatically register a cleanup hook to remove the
      # cluster it provisions when the job is done.
      - uses: actions/checkout@v3
...
```
7 changes: 3 additions & 4 deletions docs/DEVELOPING.md
@@ -4,13 +4,12 @@ GitHub workflows and any actions used by them can be tested locally using [act](

Once `act` is installed, it can be invoked from the repository root like this:

```
```shell
act pull_request
```

`act` can pass secrets with the command line option `-s`. For example, to pass a secret called `KUBECEPTION_TOKEN` run it
like this:
`act` can pass secrets with the command line option `-s`. For example, to pass a secret called `KUBECEPTION_TOKEN` run it like this:

```
```shell
act pull_request -s KUBECEPTION_TOKEN=MY_TOKEN
```
49 changes: 18 additions & 31 deletions docs/GITHUB_ACTIONS.md
@@ -1,44 +1,36 @@
# GitHub Actions for Test Matrices

This repository hosts github actions that can be used to provision and configure kubernetes
clusters. These are intended to facilitate building out a comprehensive [test
matrix](../.github/workflows/matrix.yaml) suitable for use in real-world large scale integration and
compatibility testing for both telepresence and edge-stack.
This repository hosts GitHub actions that can be used to provision and configure Kubernetes clusters. These are intended to facilitate building out a comprehensive [test matrix](../.github/workflows/matrix.yaml) suitable for use in real-world, large-scale integration and compatibility testing for both Telepresence and Edge Stack.

The [matrix workflow](../.github/workflows/matrix.yaml) illustrates an exemplary usage of these
actions.
The [matrix workflow](../.github/workflows/matrix.yaml) illustrates usage of these actions.

## Cluster Provisioning

The [provision-cluster](../provision-cluster/README.md) action can be used to provision different
varieties of clusters:
The [provision-cluster](../provision-cluster/README.md) action can be used to provision different varieties of clusters:

- Kubeception (k3s based)
- GKE
- EKS (unimplemented)
- AKS (unimplemented)

By including this github action in your workflow you can easily run the same test suite against any
supported set of clusters:
By including this GitHub action in your workflow, you can easily run the same test suite against any supported set of clusters:

```yaml
...
jobs:
...
my_matrix_job:
strategy:
matrix:
clusters:
- distribution: GKE
version: "1.23"
useAuthProvider: "false"
- distribution: GKE
version: "1.23"
useAuthProvider: "true"
- distribution: AKS
version: "1.22"
- distribution: Kubeception
version: "1.23"
- distribution: GKE
version: "1.23"
useAuthProvider: "false"
- distribution: GKE
version: "1.23"
useAuthProvider: "true"
- distribution: AKS
version: "1.22"
- distribution: Kubeception
version: "1.23"
steps:
# The provision-cluster action will automatically register a cleanup hook to remove the
# cluster it provisions when the job is done.
@@ -54,19 +46,14 @@ jobs:

useAuthProvider: ${{ matrix.clusters.useAuthProvider }}
- run: make tests
...
```
The following inputs apply only to GKE clusters:
`useAuthProvider`: If set to "true", authentication is performed using an authentication provider, such as the
[gke-gcloud-auth-plugin](https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke).

- `useAuthProvider`: If set to "true", authentication is performed using an authentication provider, such as the [gke-gcloud-auth-plugin](https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke).

The action returns the following outputs:

`clusterName`: Name of the cluster.

`projectId`: For GKE, the project ID. Undefined for other cluster providers.

`location`: For GKE, the cluster location (region or zone). Undefined for other cluster providers.
- `clusterName`: Name of the cluster.
- `projectId`: For GKE, the project ID. Undefined for other cluster providers.
- `location`: For GKE, the cluster location (region or zone). Undefined for other cluster providers.
75 changes: 30 additions & 45 deletions github-runner-provisioner/README.md
@@ -1,71 +1,56 @@
# Runner Service
# Self-hosted GitHub action runners

This service is based on the [echo
template](https://github.com/datawire/infrastructure/tree/master/echo). Please view the
[README](https://github.com/datawire/infrastructure/tree/master/echo) for details about the dev loop
and how it works.
This service is based on the [echo template](https://github.com/datawire/infrastructure/tree/master/echo). Please view the [README](https://github.com/datawire/infrastructure/tree/master/echo) for details about the dev loop and how it works.

# Architecture
## Architecture

We use the GitHub-Runner-Provisioner to serve a webhook to GitHub Actions. GitHub will send any
Actions events to the GRP running in Skunkworks, which will parse those events looking for
workflows that request special labels in their `runs-on` property.
We use the GitHub-Runner-Provisioner to serve a webhook to GitHub Actions. GitHub will send any Actions events to the GRP running in Skunkworks, which will parse those events looking for workflows that request special labels in their `runs-on` property.
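
The label matching described above can be sketched in Go as follows. GitHub's `workflow_job` event carries an `action` field and a `workflow_job.labels` array holding the job's `runs-on` labels. The type `workflowJobEvent`, the function `requestedRunner`, and the `supportedLabels` map are hypothetical names chosen for this illustration; the real configuration lives in `runner.go`, which is not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// workflowJobEvent models only the fields of GitHub's workflow_job event
// that this sketch needs.
type workflowJobEvent struct {
	Action      string `json:"action"`
	WorkflowJob struct {
		Labels []string `json:"labels"`
	} `json:"workflow_job"`
}

// supportedLabels is an assumed stand-in for the runner configuration;
// the two labels are the ones documented for this repository.
var supportedLabels = map[string]bool{
	"macOS-arm64":  true,
	"ubuntu-arm64": true,
}

// requestedRunner returns the first supported label requested by a newly
// queued job. ok is false when the event needs no custom runner.
func requestedRunner(payload []byte) (label string, ok bool, err error) {
	var ev workflowJobEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return "", false, err
	}
	// Only jobs waiting for a runner are of interest.
	if ev.Action != "queued" {
		return "", false, nil
	}
	for _, l := range ev.WorkflowJob.Labels {
		if supportedLabels[l] {
			return l, true, nil
		}
	}
	return "", false, nil
}

func main() {
	payload := []byte(`{"action":"queued","workflow_job":{"labels":["ubuntu-arm64"]}}`)
	label, ok, err := requestedRunner(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(label, ok)
}
```

With the `application/x-www-form-urlencoded` content type, the JSON document arrives in the `payload` form field, so a handler would extract that field before calling a function like this one.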

Using the GitHub Self-Hosted Runner binaries we then spin up the custom runners in one of our
supported runner providers - currently AWS and CodeMagic. Supported runners are configured in
[runner.go](runner.go).
Using the GitHub Self-Hosted Runner binaries we then spin up the custom runners in one of our supported runner providers - currently AWS and CodeMagic. Supported runners are configured in [runner.go](runner.go).

## AWS
### AWS

AWS runners are created in EC2 using the AWS SDK. See the [aws_runners](internal/aws/runners)
package for details on the implementation.
AWS runners are created in EC2 using the AWS SDK. See the [aws_runners](internal/aws/runners) package for details on the implementation.

## CodeMagic
### CodeMagic

CodeMagic runners are actually CodeMagic Builds (CI jobs in their service) that then pull the
GitHub Self-Hosted binaries and register themselves as ephemeral (single-use) runners - picking
up a single job from the calling repo and then terminating.
CodeMagic runners are actually CodeMagic Builds (CI jobs in their service) that then pull the GitHub Self-Hosted binaries and register themselves as ephemeral (single-use) runners - picking up a single job from the calling repo and then terminating.

# Testing the application
## Testing the application

## Integration Tests
### Integration Tests

**Note**: Before running tests, make sure you run the application with environment variable `WEBHOOK_TOKEN=FAKE_TOKEN`.
You will also need to set `GITHUB_TOKEN` to a PAT for the D6E Automaton. These values can all be found in the
[github-runner-provisioner-secrets.yaml](/keybase/team/datawireio/skunkworks/github-runner-provisioner-secrets.yaml)
file in Keybase - you will need to base64 decode them before use. If you are only running dry runs, only AWS and GitHub
authentication is required.

To test the application we use targets in the Makefile. The `make go-unit-tests` target will run the unit tests,
and `make test-runners` will run the integration tests against the dry-run endpoints. Note that to test the
AWS `macOS-arm64` runner you will need to set the `USE_CODEMAGIC` environment variable to `true` in the GRP.
You will also need to set `GITHUB_TOKEN` to a PAT for the D6E Automaton. These values can all be found in the [github-runner-provisioner-secrets.yaml](/keybase/team/datawireio/skunkworks/github-runner-provisioner-secrets.yaml) file in Keybase - you will need to base64 decode them before use. If you are only running dry runs, only AWS and GitHub authentication is required.

To test the application we use targets in the Makefile. The `make go-unit-tests` target will run the unit tests, and `make test-runners` will run the integration tests against the dry-run endpoints. Note that to test the AWS `macOS-arm64` runner you will need to set the `USE_CODEMAGIC` environment variable to `true` in the GRP.

Testing CodeMagic M1 & AWS ubuntu-arm64:

Testing CodeMagic M1 & AWS ubuntu-arm64:
```bash
USE_CODEMAGIC=true GITHUB_TOKEN=<pat> go run main.go --dry-run
make test-runners
USE_CODEMAGIC=true GITHUB_TOKEN=<pat> go run main.go --dry-run
make test-runners
```

**Note**: You can send requests to the production client using `make run-<runner tag>`. Be careful when sending
requests to production using an HTTP client, since the `dry-run`
request parameter defaults to true. This is necessary because we have no way to set GitHub to send this
parameter.
**Note**: You can send requests to the production client using `make run-<runner tag>`. Be careful when sending requests to production using an HTTP client, since the `dry-run` request parameter defaults to true. This is necessary because we have no way to set GitHub to send this parameter.

## Unit tests
### Unit tests

Some unit tests use mocks generated by gomock. If the interface being mocked is updated, you may have to re-generate the
mocks by running:
Some unit tests use mocks generated by gomock. If the interface being mocked is updated, you may have to re-generate the mocks by running:

```shell
make update-go-mocks
make update-go-mocks
```

# Env Vars
## Env Vars

The runner provisioner requires the following variables to be configured:
- `GITHUB_TOKEN` - a personal access token with admin access to the repo configuring the runners.
We use the `D6E-Automaton`'s token in production.
- `WEBHOOK_TOKEN` - the secret used to configure the webhook in GitHub. We use the token stored at
`/Keybase/team/datawireio/infra/github-runner-provisioner-secrets`

- `GITHUB_TOKEN` - a personal access token with admin access to the repo configuring the runners.
We use the `D6E-Automaton`'s token in production.
- `WEBHOOK_TOKEN` - the secret used to configure the webhook in GitHub. We use the token stored at
`/Keybase/team/datawireio/infra/github-runner-provisioner-secrets`
- `CODEMAGIC_TOKEN` - the secret used to authenticate to the CodeMagic build API to trigger M1 runners
- `USE_CODEMAGIC` - a boolean flag to indicate whether to use CodeMagic or AWS to provision M1 runners
- AWS auth can be configured with `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` or by using the aws cli
- AWS auth can be configured with `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` or by using the aws cli
29 changes: 13 additions & 16 deletions provision-cluster/README.md
@@ -1,30 +1,27 @@
# Documentation to enable developing and releasing the items in this repository.
# Provision Cluster GitHub Action

## Releasing the provision-cluster GitHub Action:
## Releasing the provision-cluster GitHub Action

GitHub Actions are released by creating a semver tag and pushing it to GitHub. No additional steps
are needed.
GitHub Actions are released by creating a semver tag and pushing it to GitHub. No additional steps are needed.

### Step 1: Query existing tags

Use `git pull` to make sure you have all tags locally and then use `git tag -l` to find existing tag
names. Release tags are of the form `vX.Y.Z` and release versions should follow semver.
Use `git pull` to make sure you have all tags locally and then use `git tag -l` to find existing tag names. Release tags are of the form `vX.Y.Z` and release versions should follow semver.
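
Sorting tags alphabetically can mislead (for example, `v0.10.1` sorts before `v0.9.9`), so the newest release is best found by comparing the numeric parts. The Go sketch below assumes the tag list has already been collected, for instance from `git tag -l`; the names `parseTag` and `latestTag` are hypothetical, and tags that do not match `vX.Y.Z` are ignored.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// tagPattern matches release tags of the form vX.Y.Z.
var tagPattern = regexp.MustCompile(`^v(\d+)\.(\d+)\.(\d+)$`)

type version struct{ major, minor, patch int }

// parseTag converts a vX.Y.Z tag into its numeric parts.
func parseTag(tag string) (version, bool) {
	m := tagPattern.FindStringSubmatch(tag)
	if m == nil {
		return version{}, false
	}
	var parts [3]int
	for i := range parts {
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return version{}, false
		}
		parts[i] = n
	}
	return version{parts[0], parts[1], parts[2]}, true
}

// less reports whether a precedes b in semver order.
func (a version) less(b version) bool {
	if a.major != b.major {
		return a.major < b.major
	}
	if a.minor != b.minor {
		return a.minor < b.minor
	}
	return a.patch < b.patch
}

func (v version) String() string {
	return fmt.Sprintf("v%d.%d.%d", v.major, v.minor, v.patch)
}

// latestTag returns the highest release tag, skipping non-release tags.
func latestTag(tags []string) (string, bool) {
	var best version
	found := false
	for _, t := range tags {
		v, ok := parseTag(t)
		if !ok {
			continue
		}
		if !found || best.less(v) {
			best, found = v, true
		}
	}
	if !found {
		return "", false
	}
	return best.String(), true
}

func main() {
	tags := []string{"v0.2.0", "v0.10.1", "not-a-release", "v0.9.9"}
	if tag, ok := latestTag(tags); ok {
		fmt.Println("latest release tag:", tag)
	}
}
```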

### Step 2: Tag with your new version number

Use `git tag vX.Y.Z` to tag with your new version number, and then run `git push --tags` to push the
new tag up to GitHub.
Use `git tag vX.Y.Z` to tag with your new version number, and then run `git push --tags` to push the new tag up to GitHub.

### Step 3: Verify the release works by updating the smoke test workflow.
### Step 3: Verify the release works by updating the smoke test workflow

Once the tag is pushed, then verify the release by using it in the smoke test workflow. Do this by
editing `.github/workflows/smoke.yaml`, search for the uses line and update the version to the newly
released tag, e.g.:
Once the tag is pushed, verify the release by using it in the smoke test workflow: edit `.github/workflows/smoke.yaml`, find the `uses` line, and update the version to the newly released tag.

```
...
- uses: datawire/infra-actions/[email protected]
...
```yaml
jobs:
  release_smoke:
    steps:
      - id: provision
        uses: datawire/infra-actions/[email protected]
```
Pushing the tag should trigger the release smoke test workflow. Verify that this has in fact passed.
8 changes: 0 additions & 8 deletions scripts/README.md

This file was deleted.
