-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C++: Speed up the cpp/unbounded-write
query for an upcoming change
#18485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot wasn't able to review any files in this pull request.
Files not reviewed (2)
- cpp/ql/lib/semmle/code/cpp/controlflow/IRGuards.qll: Language not supported
- cpp/ql/src/Security/CWE/CWE-120/UnboundedWrite.ql: Language not supported
Tip: Copilot only keeps its highest confidence comments to reduce noise and keep you focused. Learn more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
* To find the specific guard that performs the comparison | ||
* use `IRGuards.comparesLt`. | ||
*/ | ||
predicate comparesLt(Operand left, Operand right, int k, boolean isLt, AbstractValue value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want this predicate and the one below, given that they just forward arguments to compares_eq
and compares_lt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that the fanout from g
to g.comparesLt(...)
is quite large on certain repositories, so we want to create a way to compute a version of comparesLt
that can be used to prune the set of guards we want to look at early (in the interestingLessThanOrEqual
predicate below).
So the important part is that there's no GuardCondition
as a column to this predicate (as that is what's giving the large fan-out). We can include the ValueNumber
(i.e., the first column from compares_lt
) that is currently omitted from this predicate, but since it wasn't needed for the pruning in UnboundedWrite
I decided to leave it out to not expose that implementation detail.
predicate interestingLessThanOrEqual(Operand left) { | ||
exists(DataFlowImplCommon::NodeEx node | | ||
node.asNode().asOperand() = left and | ||
BarrierFlow::Stages::Stage1::sinkNode(node, _) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So my understanding/guessing is that stage one does some cheap flow starting at sources to find sink that could be interesting, and we are using that capability to prune down barriers as well.
Is this something that the data flow library could actually do itself? Prune down barriers in this way. Or could it be bad to do more generally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the lessThanOrEqual
predicate itself blows up due to the fanout from g
in g.comparesLt(left, _, _, true, branch)
(or g.comparesEq(left, _, _, true, branch)
) on some projects containing a function with many guards that guard equivalent expressions (so that there are many guards for the same left
).
As an alternative we could've inlined lessThanOrEqual
and isBarrier
, and then we'd most likely get to a point where the resulting predicate was restricted by an internal call inside the dataflow library that ensures that the node being barrier'd is part of the path from a source to a sink, but this more robust 🙂
This fixes a performance problem in #18017 on some projects compiled as C code (where we infer many new guards on #18017 which causes the barrier to explode in size).
This PR adds a very cheap pruning phase to the
cpp/unbounded-write
query. The "cheapness" comes from the fact that we only evaluate the first stage of dataflow. This provides enough pruning to make the performance impact of #18017 negligible (compare this run with this run).