Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] The json file output by semgrep is not correctly scored #186

Closed
zer0yu opened this issue Nov 1, 2022 · 11 comments
Closed

[BUG] The json file output by semgrep is not correctly scored #186

zer0yu opened this issue Nov 1, 2022 · 11 comments

Comments

@zer0yu
Copy link

zer0yu commented Nov 1, 2022

Test ENV
semgrep 0.118 version

Use the following command to test and output the results to a json formatted file.

semgrep --config p/security-audit -q --json -o semgrep.json
@davewichers
Copy link
Contributor

Can you explain what the problem is? Are you seeing errors? Are you seeing results that are not properly incorporated into the scorecard?

@zer0yu
Copy link
Author

zer0yu commented Nov 8, 2022

Yes, not properly incorporated into the scorecard.

@darkspirit510
Copy link
Contributor

This is an issue with SemgrepReader in project BenchmarkUtils. @davewichers is it possibile to move this to the other project?

Anyhow, quick look revealed some structural changes in Semgrep's result files (e. g. CWE is now an array). Im working on it 👍

@planetlevel
Copy link

If a tool reports both the right CWE and a wrong CWE for a test case, how does that affect the scoring? Benchmark shouldn't encourage vendors to report lots of CWEs in the hopes of accidentally listing the right one. You could argue that they should get both a true positive AND a false positive - but I'm not sure how that would affect overall score. Therefore, I think the best thing to do is to simply fail the testcase in this case - the tool didn't get the correct answer.

@davewichers
Copy link
Contributor

Benchmark scoring considers only 1 CWE per test case. Any other CWEs reported are ignored as being irrelevant to what we are testing for. Benchmark has plenty of other vulnerabilities that we didn't specifically intend to include (although we have eliminated a lot of those). But where other vulnerabilities still exist, if we saw those being reported and failed that tool for that test case, lots of tools would be adversely and incorrectly affected by scoring this way.

However, when building scoring for each tool, we do manually look at the CWEs they report, and occasionally they might report a similar CWE when finding the correct vulnerability. In that case we map the CWE to give them credit for this. That similar CWE could be a child CWE, parent CWE, or some other similar type of CWE. DAST tools, for example, might report an issue as some type of injection, when its a different type, but they still did find the vulnerability and report it, so we try to give them credit for it in that case.

@planetlevel
Copy link

I think you may not have understood my comment. This isn’t about extra vulns. Here, semgrep is apparently reporting an array of CWEs for a single vulnerability. In this case there could be some right and some wrong. I’m suggesting this should fail the test case. I have no problem with the mapping stuff…. But it doesn’t help people make good choices or help tools improve if Benchmark gives credit for incorrectly reporting SQLi when the problem is CMDi.

@davewichers
Copy link
Contributor

For SAST I agree. I'll have to look at the semgrep results and work with @darkspirit510 to make sure we give credit (true positives or false positives) correctly.

@darkspirit510
Copy link
Contributor

darkspirit510 commented Nov 12, 2022

@planetlevel as @davewichers mentioned, Benchmark takes a list of findings for every file. Tbh, I did not check what happens in the generation process (it's hard to get through it, that's why I try to reduce complexity by extractions to new classes). If you think, this should be adressed, we should create another issue.

In this special case, semgrep's result file now has an array for CWE numbers and OWASP lists. When you run a recent version of semgrep against Benchmark, none of the results has more than one CWE number:

cat Benchmark_1.2-Semgrep-v0.121.2.json | jq '.results[].extra.metadata.cwe | length' | sort | uniq -c
4473 1

but some have two entries in the OWASP list (because the CWE number belongs to multiple OWASP Top 10 lists)

cat Benchmark_1.2-Semgrep-v0.121.2.json | jq '.results[].extra.metadata.owasp | length' | sort | uniq -c
1370 1
3103 2

Some files have multiple entries in the findings list, but with different paths. I have no idea, how one specific location can contain multiple CWEs (except you name a general CWE like SQL Injection and also a specific CWE like Hibernate Injection).

cat Benchmark_1.2-Semgrep-v0.121.2.json | jq '.results[].path' | sort | uniq -c
   1 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest00006.java"
   1 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest00007.java"
   1 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest00008.java"
   1 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest00009.java"
   1 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest00010.java"
[...]
   5 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest01883.java"
   5 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest01892.java"
   5 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest01893.java"
   5 "src/main/java/org/owasp/benchmark/testcode/BenchmarkTest01894.java"

@davewichers
Copy link
Contributor

@darkspirit510 - This is an old issue. Did you fix this in BenchmarkUtils so we can close this?

@darkspirit510
Copy link
Contributor

Yes, see linked PR OWASP-Benchmark/BenchmarkUtils#33 😊

@davewichers
Copy link
Contributor

@darkspirit510 fixed this per above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants