A5-1-9: Improve performance, address duplication #857
The hash cons value for parameters was incorrectly calculated from parameter uses (e.g., accesses to the parameter). The correct approach is to use the variable name and type. The old definition caused performance issues, because the hash cons for a function was made up of all combinations of the accesses to its parameters. For lambdas with many parameters and many accesses, this was problematic.
Move the exclusion of non-lambda blocks to the calculation of HC_BlockStmt, to avoid generating newtype instances for non-lambda instances.
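To illustrate why hashing parameters by their uses scales so badly, here is a minimal toy model in Python. It is not the CodeQL implementation: the tuple layout and the helper names `hashes_from_uses` and `hashes_from_name_and_type` are hypothetical, chosen only to show how the number of hash-cons candidates grows under each definition.

```python
from itertools import product

# Toy model, not CodeQL: each parameter is (name, type, list of access ids).

def hashes_from_uses(params):
    """One candidate tuple per combination of accesses (the problematic shape)."""
    return set(product(*(accesses for _, _, accesses in params)))

def hashes_from_name_and_type(params):
    """Exactly one tuple, built from each parameter's name and type."""
    return {tuple((name, typ) for name, typ, _ in params)}

# A hypothetical lambda with 5 parameters, each accessed 4 times.
params = [(f"p{i}", "int", [f"p{i}_use{j}" for j in range(4)]) for i in range(5)]

print(len(hashes_from_uses(params)))           # 4**5 = 1024 candidates
print(len(hashes_from_name_and_type(params)))  # 1 candidate
```

With `k` parameters and `n` accesses each, the use-based definition yields `n**k` combinations, while the name-and-type definition always yields one.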
PR Overview
This pull request improves the performance of query A5-1-9 by addressing issues with parameter hashing, variable name resolution, and block statement processing that previously led to performance bottlenecks and false positives. Additionally, the change note documents these improvements and notes the exclusion of results from the same macro expansion.
Changes
| File | Description |
|---|---|
| `change_notes/2025-02-10-improve-perf-a5-1-9.md` | Change note detailing performance improvements for `IdenticalLambdaExpressions.ql` and adjustments to reduce false positives |
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
lgtm! thanks!
Description
Addresses three performance issues with A5-1-9:
- `HC_ParamCons` was incorrectly defined, referring to uses of the parameters when calculating the hash, instead of properties of the parameter (name and type). This causes exponential blow-up in some databases (with lambdas with a lot of parameters and/or lots of uses).
- `Variable.getName()` was being magicked with an insufficiently restrictive set of variables, which ultimately led to the calculation of `Function.getParameter(int i)` creating a cross-product on indexes.
- `HC_BlockStmt` was being calculated for all blocks, but `hashStmtCons` then filtered out any blocks not in lambdas. I've moved this restriction into `HC_BlockStmt`, which has helped enable us to run even on large databases (e.g. grpc/grpc). Ideally we would apply the lambda restriction to all elements in `LambdaEquivalence.qll`, but this seems to be sufficient for now.

Also fixes #856, by excluding results that come from the same macro expansion.
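The effect of moving the lambda restriction into `HC_BlockStmt` can be sketched with a toy Python model. This is an illustration under stated assumptions, not the query's code: the `Block` class and both helper functions are hypothetical, and the only point is that filtering before construction builds far fewer instances while keeping the same results.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    ident: int
    in_lambda: bool  # hypothetical flag: is this block inside a lambda?

def hash_all_then_filter(blocks):
    """Old shape: build an instance for every block, then discard most of them."""
    built = [("HC_BlockStmt", b.ident) for b in blocks]
    lambda_ids = {b.ident for b in blocks if b.in_lambda}
    kept = [h for h in built if h[1] in lambda_ids]
    return len(built), kept

def filter_then_hash(blocks):
    """New shape: only build instances for blocks inside lambdas."""
    built = [("HC_BlockStmt", b.ident) for b in blocks if b.in_lambda]
    return len(built), built

# A hypothetical database where 1 in 100 blocks belongs to a lambda.
blocks = [Block(i, in_lambda=(i % 100 == 0)) for i in range(10_000)]

late_count, late_kept = hash_all_then_filter(blocks)
early_count, early_kept = filter_then_hash(blocks)
print(late_count, early_count)  # instances built: 10000 versus 100
print(late_kept == early_kept)  # same surviving results
```

Both shapes produce the same surviving instances; the difference is only in how much intermediate work is done, which is what matters on large databases.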
Change request type
`.ql`, `.qll`, `.qls` or unit tests
Rules with added or modified queries
A5-1-9
Release change checklist
A change note (development_handbook.md#change-notes) is required for any pull request which modifies:
If you are only adding new rule queries, a change note is not required.
Author: Is a change note required?
🚨🚨🚨 Reviewer: Confirm that the format of shared queries (not the `.qll` file, the `.ql` file that imports it) is valid by running them within VS Code.
Reviewer: Confirm that either a change note is not required or the change note is required and has been added.
Query development review checklist
For PRs that add new queries or modify existing queries, the following checklist should be completed by both the author and reviewer:
Author
As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
Reviewer
As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.