Skip to content

Add a CodeQL extractor for SAP CAP cds files #158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Nov 19, 2024
Merged

Conversation

lcartey
Copy link
Contributor

@lcartey lcartey commented Oct 16, 2024

This pull requests adds a native CodeQL extractor for the cds service description files from SAP's cap framework.

The cds extractor supports the following scripts:

  • index-files.sh - this script enables indexing a specified list of cds files. It works by:
    • Installing the @sap/cds-dk tool that provides the cds compile command. It installs the package in each directory with a package.json file which depends on the @sap/cds package, and will install a version compatible with the declared versions in the package.json file
    • Running npx cds compile -2 json --locations -o <...> ... for each provided cds file, to produce a JSON output file representing the AST for the file.
    • Running the JavaScript autobuilder, with restrictions in place to ensure we only capture the .cds.json files generated by the compiler.
  • autobuild.sh - this provides an autobuilder which detects all .cds files that don't exists outside node_modules, and calls the codeql database index-files --language cds command to index them.

In addition to the cds extractor, I've also added the following:

  • A pre-finalize.sh script that can be injected into an existing instance of the JavaScript extractor, that enables automatic extraction of CDS files. This is most useful with a custom bundle as generated by the CodeQL Development Toolkit (qlt). The script also recognises the CODEQL_EXTRACTOR_CDS_SKIP_EXTRACTION flag, which can be set to skip extraction of CDS files.
  • A cds-compile.sh script that can be used with the --command option to a codeql database create --language javascript call to add compilation of CDS files manually.

Finally, I've updated the Code Scanning workflow to use/test the new CDS extractor.

Some follow up tasks I would look at as separate PRs are:

  • Windows support - this PR only supports Linux, but Windows would be a relatively straightforward extension.
  • GitHub Actions support - for customers who don't use a custom bundle (e.g. those on open source, or using GitHub.com), we could provide a native GitHub Action to trigger the extractor as part of a regular JavaScript build.
  • Better filtering of manual calls to index-files - we don't currently validate that the list of files provided to index-files only contains .cds files, and we don't optimize how often we install the @sap/cds-dk package in a large monorepo with only a subset EDIT: this second part is now done.
  • Baseline information for CDS files - we could potentially modify the baselining for JS to take into account CDS files, but it would be substantially more invasive, requiring modifying not just adding files.

This creates a cds extractor which implements:
 * An `index-files.sh` script that compiles the list of selected
   cds files using a compatible version of the cds compiler, then
   processes the resulting jsons into a database with a JS schema.
 * An `autobuild.sh` script which calls the `index-files.sh`
   script for each <10MB cds file in repository which is not in a
   node_modules directory.
 * A dbscheme based on the JS extractor.
@lcartey lcartey requested a review from jeongsoolee09 October 16, 2024 14:10
Copy link
Contributor

@jeongsoolee09 jeongsoolee09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still reading through, but I've added a couple of quick thoughts.

Copy link
Contributor

@jeongsoolee09 jeongsoolee09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment and a question. Otherwise, LGTM. Thank you!

@jeongsoolee09
Copy link
Contributor

Another question: Does the end user lose the capability of firing the chain of database init --language through database finalize with a codeql database create --language=cds if database init and database finalize require different values for their --language flags?

I don't think this is a significant downside since the user has to run pre-finalize.sh for the options anyways.

npx can sometimes install the inappropriate version of the
cds compiler, if one is specified by a grandparent directory.
We therefore update the script to install the cds command in
each relevant directory with a package.json before using the
npx command.
 - Include .cds files
 - Exclude files in node_modules directories
Also silent the npm install command
Some test cases require it.

Also push the env vars into the most specific block, just to
ensure that the CDS extraction is not confused.
Copy link
Contributor

@jeongsoolee09 jeongsoolee09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you!

@jeongsoolee09 jeongsoolee09 merged commit 04acfd8 into main Nov 19, 2024
5 checks passed
@jeongsoolee09 jeongsoolee09 deleted the lcartey/cds-extractor branch November 19, 2024 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants