- GitHub account with Actions enabled
- Ability to create repository secrets
- Ability to create repository environments
- A public repository fork or clone of this project

Estimated time:
- Setup: about 30 minutes
- Workflow runs: about 45 minutes
- Total: about 75 minutes
Create a public repository from this project and push the contents.
Go to Settings -> Actions -> General.
Set Workflow permissions to Read and write permissions.
This matters because Tier 1 demonstrates the insecure baseline, which includes a write-capable `GITHUB_TOKEN`.
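The same setting can also be applied from a terminal. The following is a minimal sketch, assuming the GitHub CLI (`gh`) is installed, authenticated with admin rights on the repository, and run from inside the clone. The function name `allow_write_token` and the `GH_CMD` dry-run override are illustrative and not part of this project.

```shell
# Sketch: set the repository's default workflow token permissions to read/write.
# Assumption: `gh` is installed and authenticated; `gh api` fills in
# {owner} and {repo} from the current repository.
# GH_CMD is a hypothetical override: GH_CMD=echo prints the arguments
# instead of calling GitHub.
allow_write_token() {
  "${GH_CMD:-gh}" api --method PUT \
    "repos/{owner}/{repo}/actions/permissions/workflow" \
    -f default_workflow_permissions=write
}
```

Calling `GH_CMD=echo allow_write_token` shows the request without sending it. Calling `allow_write_token` on its own applies the setting.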
Go to Settings -> Secrets and variables -> Actions.
Add these four repository secrets:
| Name | Value |
|---|---|
| `AWS_ACCESS_KEY_ID` | `AKIAIOSFODNN7EXAMPLE` |
| `AWS_SECRET_ACCESS_KEY` | `wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY` |
| `DEPLOY_TOKEN` | `ghp_fake0deploy0token0for0benchmark0only00` |
| `DATABASE_URL` | `postgres://user:pass@db.example.com:5432/prod` |
These are intentionally fake. Do not use real credentials.
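If you prefer the command line to the Settings page, the four dummy secrets can be registered with the GitHub CLI. This is a sketch under the assumption that `gh` is installed, authenticated, and run from inside the repository clone. The function name `set_dummy_secrets` and the `GH_CMD` dry-run override are hypothetical helpers, not part of this project.

```shell
# Sketch: register the four fake benchmark secrets with `gh secret set`.
# The values are the dummy ones from the table above. Never put real
# credentials here.
# GH_CMD is a hypothetical override: GH_CMD=echo prints each command's
# arguments instead of contacting GitHub.
set_dummy_secrets() {
  gh_cmd="${GH_CMD:-gh}"
  "$gh_cmd" secret set AWS_ACCESS_KEY_ID --body 'AKIAIOSFODNN7EXAMPLE' &&
  "$gh_cmd" secret set AWS_SECRET_ACCESS_KEY --body 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' &&
  "$gh_cmd" secret set DEPLOY_TOKEN --body 'ghp_fake0deploy0token0for0benchmark0only00' &&
  "$gh_cmd" secret set DATABASE_URL --body 'postgres://user:pass@db.example.com:5432/prod'
}
```

Run `GH_CMD=echo set_dummy_secrets` first to review the four commands, then `set_dummy_secrets` to apply them.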
Go to Settings -> Environments -> New environment.
Create:
- Name: `production`
- Required reviewer: yourself
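The environment can also be created through the GitHub REST API. The sketch below assumes `gh` is installed and authenticated as the account that should act as reviewer, and that it is run from inside the repository clone. The function names `environment_payload` and `create_production_environment` are illustrative only.

```shell
# Build the JSON request body that names one user as required reviewer.
# $1 is the numeric GitHub user id.
environment_payload() {
  printf '{"reviewers":[{"type":"User","id":%s}]}\n' "$1"
}

# Sketch: create (or update) the `production` environment with the
# currently authenticated user as its required reviewer.
# Assumption: `gh api` fills in {owner} and {repo} from the current repository.
create_production_environment() {
  uid="$(gh api user --jq '.id')" || return 1
  environment_payload "$uid" |
    gh api --method PUT "repos/{owner}/{repo}/environments/production" --input -
}
```

`environment_payload 42` prints the request body for user id 42, which is a quick way to check the JSON before calling `create_production_environment`.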
The clean Linux-normalized source hash used for artifact comparison is:

    c4657bc50ab6be26c54354f5304097ead527c46dbf2d72e0efbc35b1727b5988  src/app.js

If you want to recompute it locally on Linux:

    sha256sum src/app.js

Go to the Actions tab and trigger each workflow with `workflow_dispatch`.
Run them in this order:
1. Runner Baseline
2. Tier 1 - No Security
3. Tier 2 - SHA Pinned
4. Tier 3 - Trusted Release Boundary
5. Tier 4 - Enterprise
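The workflows can also be dispatched from a terminal instead of the Actions tab. This is a minimal sketch, assuming `gh` is installed and authenticated, that it is run from inside the repository clone, and that each workflow's `name:` matches the names listed above. The helpers `workflow_order` and `dispatch_step`, and the `GH_CMD` dry-run override, are hypothetical.

```shell
# Ordered workflow names, copied from the run order above.
workflow_order() {
  cat <<'EOF'
Runner Baseline
Tier 1 - No Security
Tier 2 - SHA Pinned
Tier 3 - Trusted Release Boundary
Tier 4 - Enterprise
EOF
}

# Dispatch one workflow by its position (1-5) in the order above.
# GH_CMD is a hypothetical override: GH_CMD=echo prints the arguments
# instead of triggering a run.
dispatch_step() {
  step="$1"
  name="$(workflow_order | sed -n "${step}p")"
  if [ -z "$name" ]; then
    echo "no workflow at step $step" >&2
    return 1
  fi
  "${GH_CMD:-gh}" workflow run "$name"
}
```

Dispatching one step at a time (`dispatch_step 1`, then `dispatch_step 2` once the previous run has finished) keeps the runs in the intended order. The sketch deliberately does not wait for completion or approve environments.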
For Tier 3 and Tier 4:
- wait for the untrusted lane to finish
- review its logs
- approve the `production` environment when prompted
- let the trusted release lane finish
For each workflow run:
- Download the logs zip
- Download the artifact zip if one was produced
- Save them under the matching `evidence/tier-N/` folder locally
- Record run URL, timestamps, and duration in that tier's `scores.md`
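The collection steps above can be partly scripted with the GitHub CLI. The sketch below assumes `gh` is installed and authenticated and that you already know the run id. The file names `run.log`, `run-meta.json`, and the `artifacts/` subfolder are assumptions of this example. The layout expected by the project's own scripts may differ, so adjust the paths to match it. Note that `gh run view --log` produces a plain-text log rather than the logs zip offered in the web UI.

```shell
# Sketch: gather logs, artifacts, and run metadata for one tier.
# Usage: collect_evidence <tier-number> <run-id>
# GH_CMD is a hypothetical override: GH_CMD=echo records the commands
# instead of contacting GitHub.
collect_evidence() {
  tier="$1"
  run_id="$2"
  gh_cmd="${GH_CMD:-gh}"
  dest="evidence/tier-${tier}"
  mkdir -p "$dest" || return 1

  # Plain-text log of the whole run.
  "$gh_cmd" run view "$run_id" --log > "$dest/run.log" || return 1

  # Artifacts are optional: not every run produces one.
  "$gh_cmd" run download "$run_id" --dir "$dest/artifacts" ||
    echo "no artifact downloaded for run $run_id" >&2

  # Run URL and timestamps, for copying into that tier's scores.md.
  "$gh_cmd" run view "$run_id" --json url,createdAt,updatedAt > "$dest/run-meta.json"
}
```

For example, `collect_evidence 1 <run-id>` fills `evidence/tier-1/`. Recording the values in `scores.md` remains a manual step.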
After downloading artifacts and logs into the `evidence/` layout, run:

    bash scripts/verify-benchmark.sh
    bash scripts/analyze-logs.sh

If you are on Windows and Git Bash is not available, use PowerShell to inspect the extracted files directly.
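Independently of the repository scripts, a single file can be checked against the clean source hash given earlier. This is a sketch, not the project's verification logic: it assumes that "Linux-normalized" means LF line endings, so it strips carriage returns before hashing, and it assumes `sha256sum` is available. The names `CLEAN_APP_HASH` and `verify_clean_hash` are illustrative.

```shell
# Clean hash of src/app.js as stated in this document.
CLEAN_APP_HASH='c4657bc50ab6be26c54354f5304097ead527c46dbf2d72e0efbc35b1727b5988'

# Usage: verify_clean_hash <file> [expected-sha256]
# Returns 0 when the LF-normalized file matches the expected hash.
verify_clean_hash() {
  file="$1"
  expected="${2:-$CLEAN_APP_HASH}"
  # Assumption: normalizing means removing carriage returns, so a CRLF
  # checkout hashes the same as the LF original.
  actual="$(tr -d '\r' < "$file" | sha256sum | cut -d ' ' -f 1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file matches"
  else
    echo "MISMATCH: $file hashed to $actual" >&2
    return 1
  fi
}
```

`verify_clean_hash src/app.js` checks the working copy. Pointing it at a copy of `app.js` extracted from a downloaded artifact gives a quick signal of whether that artifact was rebuilt from clean source.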
Use these reference files:
- Results summary
- Comparison table
- Tier 1 scores
- Tier 2 scores
- Tier 3 scores
- Tier 4 scores
- Anomaly review

Expected results:
- Tier 1 should show broad compromise
- Tier 2 should show improved token scope, but still ship a poisoned artifact
- Tier 3 should keep the untrusted lane secretless and ship a clean rebuilt artifact
- Tier 4 should additionally block outbound exfiltration and complete attestation successfully
Tier 1 and Tier 2 intentionally place dummy secrets at the job scope so the malicious action can access them before deployment. That is deliberate for the benchmark. Tier 3 and Tier 4 remove secrets from the untrusted lane.