For general information about our implementation of GitHub Advanced Security, see our documentation in Confluence.
The purpose of this tool is to help enable GitHub Advanced Security (GHAS) across multiple repositories in an automated way. There will be times when you need to enable Code Scanning (CodeQL), Secret Scanning, Dependabot Alerts, and/or Dependabot Security Updates across many repositories, and you don't want to click buttons manually or drop a GitHub workflow for CodeQL into every repository. Doing so by hand is slow and painstaking; this utility automates those tasks.
The primary motivator for this utility is CodeQL. It is incredibly time-consuming to enable CodeQL across multiple repositories. Additionally, no API allows write access to the `.github/workflows/` directory, so teams end up writing their own scripts, with varying results. This tool provides a tried and proven way of doing that.
Secret Scanning and Dependabot are also hard to enable if you only want them on specific repositories rather than everywhere. This tool allows you to do that easily.
This implementation makes use of CodeQL, Terraform, and PowerShell scanning actions:
- CodeQL Scan Action
  - CodeQL scanning comes via CodeQL Engine.
- Terraform Scan Action
  - Terraform scanning comes via tfsec.
  - See Azure Checks for more information.
- PowerShell Scan Action
  - PowerShell scanning comes via PSScriptAnalyzer.
  - See PowerShell Rules for more information.
There are two main actions this tool does:

Part One:

Collects the repositories that will have Code Scanning (CodeQL), Secret Scanning, Dependabot Alerts, and/or Dependabot Security Updates enabled. There are three main ways these repositories are collected:
- Collect the repositories where the primary language matches a specific value. For example, if you provide JavaScript, all repositories whose primary language is JavaScript will be collected.
- Collect the repositories to which a user has administrative access, or a GitHub App has access.
- Define the `repos.json` file manually.

If you select option 1, the script returns all repositories in the language you specify (that you have access to). If you select option 2, the script returns all repositories you are an administrator over. The repositories collected are then stored in a `repos.json` file. The third option is to define `repos.json` manually. We don't recommend this, but it's possible. If you want to go down this path, first run one of the automated options above, look at the structure of the generated file, and build your own file following that format.
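To make the language-based collection concrete, here is a minimal sketch of the filtering and grouping step. It is an illustration only, not the tool's actual code: the function name `collect_by_language` and the shape of the input records are assumptions, and the real script obtains its data from the GitHub API rather than from an in-memory list.

```python
import json


def collect_by_language(repos, language):
    """Group repos whose primary language matches `language` by organization.

    `repos` is a hypothetical list of dicts with "repo" ("org/name") and
    "primaryLanguage" keys; the output mirrors the repos.json layout.
    """
    wanted = language.strip().lower()
    grouped = {}
    for repo in repos:
        # Repositories without a detected language report None here.
        primary = (repo.get("primaryLanguage") or "").lower()
        if primary != wanted:
            continue
        org = repo["repo"].split("/", 1)[0]
        grouped.setdefault(org, []).append(
            {"primaryLanguage": primary, "repo": repo["repo"]}
        )
    return [{"login": org, "repos": items} for org, items in grouped.items()]


sample = [
    {"repo": "my-org/web-app", "primaryLanguage": "JavaScript"},
    {"repo": "my-org/api", "primaryLanguage": "Python"},
    {"repo": "my-org/no-language", "primaryLanguage": None},
]
result = collect_by_language(sample, "javascript")
print(json.dumps(result, indent=2))
```

The comparison is case-insensitive in this sketch so that `JavaScript` and `javascript` are treated as the same language.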
Part Two:
Loops over the repositories found within the `repos.json` file and enables Code Scanning (CodeQL), Secret Scanning, Dependabot Alerts, Dependabot Security Updates, and/or Secret Scanning Push Protection.
If you pick Code Scanning:
- Loops over the repositories found within the `repos.json` file. A pull request gets created on each repository with the `code-analysis.yml` found in the `bin/workflows` directory. For convenience, all pull requests made are recorded in the `prs.txt` file, so you can review them manually after the script has run.

If you pick Secret Scanning:
- Loops over the repositories found within the `repos.json` file. Secret Scanning is then enabled on these repositories.

If you pick Dependabot Alerts:
- Loops over the repositories found within the `repos.json` file. Dependabot Alerts is then enabled on these repositories.

If you pick Dependabot Security Updates:
- Loops over the repositories found within the `repos.json` file. Dependabot Security Updates is then enabled on these repositories.
- Node v18 or higher installed.
- Yarn\*
- TypeScript
- Git installed on the (user's) machine running this tool.
- Python, if using the first option in Step One.
- A Personal Access Token (PAT) that has at least admin access over the repositories you want to enable Code Scanning on.
- Some basic software development skills, e.g., being able to navigate a terminal or command prompt.

\* You can use `npm`, but for the sake of this `README.md` we are going to standardise the commands on `yarn`. These are easily replaceable with `npm` commands.
- Clone this repository onto your local machine: `git clone git@github.com:im-open/ghas-enablement.git`
- Change the directory to the repository you have just cloned: `cd ghas-enablement`
- Generate your chosen credential, either a GitHub App or a Personal Access Token (PAT). A GitHub App needs `read and write` permissions for `administration`, `Code scanning alerts`, `contents`, `issues`, `pull requests`, and `workflows`. A PAT needs the `repo`, `workflow`, and `read:org` scopes only. (If you are running `yarn run getOrgs`, you will also need the `read:enterprise` scope.)
- Copy `.env.sample` to `.env`. On a Mac, this can be done via the following terminal command: `cp .env.sample .env`
- Update the `.env` with the required values. Please pick one of the authentication methods for interacting with GitHub: either fill in `GITHUB_API_TOKEN` with a PAT that has access to the org, or fill in all the values required for a GitHub App. Note: The GitHub App option is recommended if running on thousands of repositories, as it gives you more API requests than a PAT.
- Update the `GITHUB_ORG` value found within the `.env`. Remove the `XXXX` and replace it with the name of the GitHub organization you would like to use as part of this script.
- Update the `LANGUAGE_TO_CHECK` value found within the `.env`. Remove the `XXXX` and replace it with the language you would like to use as a filter when collecting repositories. Note: Please make sure these are lowercase values, such as `javascript`, `python`, `go`, `ruby`, `hcl`, `powershell`, etc.
- Update the `ITHD_TICKET_URL` value with the URL for the ITHD ticket that teams can create when they are having trouble getting the GitHub Advanced Security workflow running successfully. The URL should point to the ticket type that is sent to the Purple Team.
- Decide what you want to enable. Update the `ENABLE_ON` value to choose what you want to enable on the repositories found within the `repos.json`. This can be one or multiple values. If you are enabling just code scanning (CodeQL), set `ENABLE_ON=codescanning`; if you are enabling everything, set `ENABLE_ON=codescanning,secretscanning,pushprotection,dependabot,dependabotupdates`. You can pick any combination of these values. The format is a comma-separated list.
- OPTIONAL: Update the `CREATE_ISSUE` value to `true` or `false` depending on whether you would like to create an issue explaining the purpose of the PR. We recommend this, as it helps explain why the PR was created and gives some context. The text used in the issue can be found and modified here: `./src/utils/text/`.
- OPTIONAL: If you would like the Pull Request for Code Scanning to be created as a draft, add `CREATE_DRAFT_PR` and set it to `true`. Otherwise, the Pull Request will be set as `Ready for review`.
- OPTIONAL: Set the title to give to the Code Scanning Pull Request. If this is empty, `Github Advanced Security - Code Scanning` will be used.
- If you are enabling Code Scanning (CodeQL), check the `code-analysis.yml` file. This is a sample file; please configure it to suit your repositories' needs.
- Run `yarn install` or `npm install`, which will install the necessary dependencies.
- Run `yarn run build` or `npm run build`, which will create the JavaScript bundle from TypeScript.
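Because `ENABLE_ON` is a free-form comma-separated list, it is easy to mistype a value. The following is a rough sketch of how such a value could be checked before running anything. The five accepted names come from the step above; the function name `parse_enable_on`, the trimming of spaces, and the error messages are assumptions for illustration, and the tool's own parsing may behave differently.

```python
# Feature names listed in the ENABLE_ON step of this README.
VALID_FEATURES = (
    "codescanning",
    "secretscanning",
    "pushprotection",
    "dependabot",
    "dependabotupdates",
)


def parse_enable_on(raw):
    """Split an ENABLE_ON string into a de-duplicated list of known features."""
    features = []
    for item in raw.split(","):
        name = item.strip().lower()
        if not name:
            continue  # tolerate stray commas such as "codescanning,"
        if name not in VALID_FEATURES:
            raise ValueError(f"Unknown ENABLE_ON value: {name!r}")
        if name not in features:
            features.append(name)
    if not features:
        raise ValueError("ENABLE_ON must list at least one feature")
    return features


print(parse_enable_on("codescanning,secretscanning"))
```

Failing fast on an unknown value is a design choice in this sketch: a typo such as `secretscaning` would otherwise silently skip a feature across every repository in the run.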
There are two simple steps to run:
The first step is collecting the repositories you would like to run this script on. You have four options:
- Option 1 is automated and finds all repositories in all organizations.
- Option 2 is automated and finds all the repositories within an organization you have admin access to.
- Option 3 is automated and finds all the repositories within an organization based on the language you specify.
- Option 4 is a manual entry of the repositories you would like to run this script on. See more information below.
Option 1 supports enabling GHAS in batches. This is handled by new code written in Python, together with a new `.env` property called `BATCH_ORGS`: a comma-separated list of GitHub organization names (with no spaces around the commas). This option asynchronously looks up orgs, their repos and languages, and whether each repo already has the code scanning workflow this tool creates.
Note: To save API calls, results are saved in the `repo-results/YYYY-MM-DD` directory.
How to run the batching functionality to create `repos.json` files:
- Set the `REPOS_PER_BATCH` value in the `.env` file.
- Run the debug configuration `Python: Main` and select option 3, `REPOS_BATCHER`, by typing `3` and pressing `ENTER`.
  - The batched `repos.json` files are saved to `repo-results/YYYY-MM-DD/batches`.
- Run `Python: Main` again, but this time select option 2, `PREPARE_BATCH`, by typing `2` and pressing `ENTER`.
  - Enter the batch number that will be run by typing the number and pressing `ENTER`.
- Run what is in Step Two.
- Repeat until all batches have been run.
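The batching idea itself is simple chunking. The sketch below shows one way `BATCH_ORGS` and `REPOS_PER_BATCH` could be interpreted; it is not the batcher's real code, and the function names `parse_batch_orgs` and `make_batches` are hypothetical.

```python
def parse_batch_orgs(raw):
    """Split a BATCH_ORGS value such as "org-one,org-two" into org names."""
    return [org for org in raw.split(",") if org]


def make_batches(repos, repos_per_batch):
    """Split a flat list of repositories into consecutive batches."""
    if repos_per_batch < 1:
        raise ValueError("REPOS_PER_BATCH must be at least 1")
    return [
        repos[start:start + repos_per_batch]
        for start in range(0, len(repos), repos_per_batch)
    ]


orgs = parse_batch_orgs("org-one,org-two")
repos = [f"org-one/repo-{n}" for n in range(1, 6)]  # five example repos
batches = make_batches(repos, repos_per_batch=2)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {batch}")
```

With five repositories and a batch size of two, the last batch is simply smaller than the others, which is why the final step is to keep running batches until none remain.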
Run `yarn run getRepos` with `LANGUAGE_TO_CHECK=` in the `.env` set to the language, e.g. `python`, `javascript`, `go`, `hcl`, `powershell`, etc.
Note: The property can also be left blank (`LANGUAGE_TO_CHECK=`), in which case it will get all languages that the repo states it has.
When using GitHub Actions, we commonly find (especially for non-build languages such as JavaScript) that the `code-analysis.yml` file is repeatable and consistent across multiple repositories of the same language. About 80% of the time, teams can reuse the same workflow file for the same language. For Java and C++, that number drops to about 60%. We recommend enabling Code Scanning in bulk by language because the `code-analysis.yml` file you propose in the pull request then has the highest chance of being accurate. Even if the file needs changing, the team reviewing the pull request would likely only need to make small changes.

We recommend you run this command first to get a list of repositories to enable Code Scanning on. After running the command, you are welcome to modify the resulting `repos.json` file. Just make sure it's still valid JSON if you do edit it.
This script only returns repositories where CodeQL results have not already been uploaded to code scanning. If any CodeQL results have been uploaded to a repository's code scanning feature, that repository will not be included in the list. The motivation is to avoid raising pull requests on repositories where CodeQL has already been enabled.
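The exclusion rule can be pictured as a simple set difference. This is only a sketch under the assumption that the names of repositories with existing CodeQL results are already known; how the tool actually determines that is not shown here, and `without_existing_codeql` is a hypothetical name.

```python
def without_existing_codeql(candidates, already_scanned):
    """Drop candidate repos that already have CodeQL results uploaded."""
    # Repository names are compared case-insensitively in this sketch.
    scanned = {name.lower() for name in already_scanned}
    return [repo for repo in candidates if repo.lower() not in scanned]


candidates = ["my-org/web-app", "my-org/api", "my-org/infra"]
already_scanned = ["My-Org/API"]
remaining = without_existing_codeql(candidates, already_scanned)
print(remaining)
```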
Run `yarn run getRepos` (or `npm run getRepos`).
Similar to the language-based option, another automated approach is to enable by user access. This approach will be a little less accurate, as the file will almost certainly need changing between, say, a Python project and a Java project (if you are enabling CodeQL). But the file you propose is going to be a good start. After running the command, you are welcome to modify the resulting `repos.json` file. Just make sure it's still valid JSON if you do edit it.
As with the language-based option, this script only returns repositories where CodeQL results have not already been uploaded to code scanning, so that pull requests are not raised on repositories where CodeQL has already been enabled.
Create a file called repos.json within the ./bin/ directory. This file needs to have an array of organization objects, each with its own array of repository objects. The structure of the objects should look like this:
```json
[
  {
    "login": "string <org>",
    "repos": [
      {
        "primaryLanguage": "csv of repo languages that are supported",
        "repo": "string <org/repo>"
      }
    ]
  }
]
```

As you can see, each repository object takes the following keys:
- `primaryLanguage`: Comma-separated list of supported Code Scan languages that the repo has:
  - javascript
  - java
  - go
  - python
  - cpp (C++)
  - csharp (C#)
  - ruby
  - hcl (Terraform)
  - powershell
- `repo`: The name of the repo in the following syntax: `org-name/repo-name`.
NOTE: The account that generated the PAT needs to have write access or higher over any repository that you include within the repos key.
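If you do write `repos.json` by hand, a quick structural check before running the tool can save a failed run. The sketch below checks only what is described above (the top-level array, the `login` and `repos` keys, the `org-name/repo-name` syntax, and the listed languages). The function name `validate_repos_json` and the wording of the messages are assumptions, and the tool may apply different or additional checks.

```python
import json

# Language identifiers listed in this README.
SUPPORTED_LANGUAGES = {
    "javascript", "java", "go", "python", "cpp",
    "csharp", "ruby", "hcl", "powershell",
}


def validate_repos_json(text):
    """Return a list of problems; an empty list means the structure looks fine."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, list):
        return ["top level must be an array of organization objects"]
    problems = []
    for org in data:
        if not isinstance(org, dict):
            problems.append("organization entry is not an object")
            continue
        if not isinstance(org.get("login"), str) or not org["login"]:
            problems.append("organization is missing a 'login' string")
        repos = org.get("repos")
        if not isinstance(repos, list):
            problems.append("organization is missing a 'repos' array")
            continue
        for entry in repos:
            if not isinstance(entry, dict):
                problems.append("repository entry is not an object")
                continue
            name = entry.get("repo")
            if not isinstance(name, str) or name.count("/") != 1:
                problems.append(f"repo {name!r} is not in org-name/repo-name form")
            languages = str(entry.get("primaryLanguage", "")).split(",")
            for language in (item.strip() for item in languages):
                if language and language not in SUPPORTED_LANGUAGES:
                    problems.append(f"unsupported language {language!r} for {name!r}")
    return problems


example = """
[
  {
    "login": "my-org",
    "repos": [
      {"primaryLanguage": "javascript,hcl", "repo": "my-org/web-app"}
    ]
  }
]
"""
print(validate_repos_json(example))
```

Note that Python's `json` module rejects trailing commas, so a stray comma after the last key of an object is reported as invalid JSON.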
Run the script which enables Code Scanning (and/or Dependabot Alerts/Dependabot Security Updates/Secret Scanning) on your repository by running:
`yarn run start` (or `npm run start`)

This will run a script, and you should see output text appearing on your screen.
After the script has run, please head to your ~/Desktop directory and delete the tempGitLocations directory that has been automatically created.
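If you run the tool often, the cleanup can be scripted. This is a small sketch, assuming the directory is named `tempGitLocations` and sits directly under `~/Desktop` as described above; the function name `remove_temp_clones` is hypothetical, and the optional `base_dir` parameter exists only so the function can be tried against another location.

```python
import shutil
from pathlib import Path


def remove_temp_clones(base_dir=None):
    """Delete the tempGitLocations directory; return True if it existed."""
    base = Path(base_dir) if base_dir is not None else Path.home() / "Desktop"
    target = base / "tempGitLocations"
    if target.is_dir():
        shutil.rmtree(target)  # removes the directory and everything inside it
        return True
    return False


# Usage (not executed here, because it deletes files):
# remove_temp_clones()
```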
The reason you need this within your .devcontainer/devcontainer.json file is the GITHUB_TOKEN tied to the Codespace will need to access other repositories within your organization which this script may interact with. You will need to create a new Codespace after you have added the above and pushed it to your repository.
Once complete, the following will happen:
- An entry will be added to `prs.txt` for each PR that is created.
- All selected options will be enabled in the repo(s).
As an example, if all language criteria were met, the following shows what the `code-analysis.yml` file may look like.
This repository was originally created as a fork of ghas-enablement. It has since been converted to a standalone repository that is disconnected from the parent, and many changes have been made to it to make it more customizable for our specific needs.
If you are using this internally at WTW, create an ITHD ticket and assign it to the Purple Team. Otherwise, you can go to Issues and create one there. Be sure to include specific information like:
- Your operating system (Windows, Linux, or Mac).
- What version of Node.js you are running.
- Any logs that appeared when you ran into the issue.