Skip to content

Java: Prune PathGraph for CsrfUnprotectedRequestType.ql #20083

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

aschackmull
Copy link
Contributor

This should be completely behaviour-preserving, but saves hundreds of MBs of output for some repos. Dca has had trouble processing the output of this query even for cases with zero alerts for several repos.

See also comments here: #19872 (comment)

@Copilot Copilot AI review requested due to automatic review settings July 17, 2025 13:11
@aschackmull aschackmull added the no-change-note-required This PR does not need a change note label Jul 17, 2025
@aschackmull aschackmull requested a review from a team as a code owner July 17, 2025 13:11
@github-actions github-actions bot added the Java label Jul 17, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes the PathGraph for the CSRF unprotected request type query by implementing a bidirectional reachability analysis to prune irrelevant edges. The optimization is intended to be behavior-preserving while significantly reducing output size (saving hundreds of MBs) and improving processing performance for the DCA tool.

Key changes:

  • Introduces bidirectional flow analysis predicates (fwdFlow and revFlow) to identify relevant edges
  • Replaces direct CallGraph::edges usage with a pruned relevantEdge predicate
  • Updates the transitive closure computation to use the optimized edge predicate

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
CsrfUnprotectedRequestType.ql Updates query to use pruned edge predicate instead of raw call graph edges
CsrfUnprotectedRequestTypeQuery.qll Implements bidirectional reachability analysis and relevantEdge predicate for path pruning

/**
* Holds if `pred` has a successor node `succ` and this edge is in an
* `unprotectedStateChange` path.
*/
Copy link
Preview

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The predicate should include documentation about its performance characteristics and the bidirectional pruning strategy, as this is a key optimization that affects query behavior.

Suggested change
*/
*/
/**
* Holds if `pred` has a successor node `succ` and this edge is in an
* `unprotectedStateChange` path.
*
* Performance characteristics:
* - This predicate relies on `revFlow`, which is a recursive flow predicate
* that combines forward and reverse graph traversal.
* - The use of `CallGraph::edges` ensures that only valid edges in the call
* graph are considered, reducing unnecessary computations.
*
* Bidirectional pruning strategy:
* - The predicate leverages `revFlow(pred)` and `revFlow(succ)` to prune
* the search space by ensuring that both the predecessor and successor
* nodes are part of valid reverse flows.
*/

Copilot uses AI. Check for mistakes.

@@ -237,12 +237,35 @@ private predicate sink(CallPathNode sinkMethodCall) {
)
}

Copy link
Preview

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation for the fwdFlow predicate. Should explain that it computes forward reachability from sources using transitive closure over call graph edges.

Suggested change
/**
* Computes forward reachability from source nodes using transitive closure
* over call graph edges. Holds if `n` is reachable from a source node.
*/

Copilot uses AI. Check for mistakes.

or
exists(CallPathNode mid | fwdFlow(mid) and CallGraph::edges(mid, n))
}

Copy link
Preview

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation for the revFlow predicate. Should explain that it computes backward reachability from sinks within the forward-reachable nodes, implementing the bidirectional pruning strategy.

Suggested change
/**
* Computes backward reachability from sinks within the forward-reachable nodes.
* This predicate implements the bidirectional pruning strategy by combining
* forward reachability (`fwdFlow`) with backward reachability from sinks.
*/

Copilot uses AI. Check for mistakes.

@aschackmull
Copy link
Contributor Author

The "Query result disk size" metric shows a massive reduction as expected.

@owen-mc
Copy link
Contributor

owen-mc commented Jul 18, 2025

I suppose you could add a change note with the "fix" category, so that if anyone notices a change they can figure out what caused it. Up to you.

@aschackmull
Copy link
Contributor Author

I suppose you could add a change note with the "fix" category, so that if anyone notices a change they can figure out what caused it. Up to you.

After result interpretation the sarif is unchanged, so it's not very detectable.

@aschackmull aschackmull merged commit f697511 into github:main Jul 18, 2025
19 checks passed
@aschackmull aschackmull deleted the java/prune-csrf-unprotected-request-type branch July 18, 2025 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants