Improve identification of reuse of reserved names #951

lcartey · 2025-08-27T10:54:40Z

Description

This pull request improves the identification of reserved names across the supported languages. I originally worked on this last year, and am putting it up now as a draft PR for visibility. The main contributions are:

- Adopting extensible predicates and data extension files to store the list of names defined by a given language standard - models-as-data is a clearer, more performant and more extensible way to describe these names.
- Creating a data extension file generator for C standard library names - this generator takes content copied from the C standard documentation and processes it to produce a data extension file with the identified APIs. As some cases are not fully scrapable from the documentation, we also augment this with a manual file containing a few additional APIs.
- Updating the existing C++ generator to produce data extension files - we had an existing generator for C++ which produced a hard-coded .qll file. This has been adapted to produce data extension files instead. It uses a "real" codebase to identify which APIs are available.
- Modifying the C++ data extension generator to filter out more internal and non-relevant APIs - many of the APIs previously produced by the generator are not considered to be reserved names by the standard - for example, internal names or names outside the std namespace. These are now filtered out.
- Improving the reporting of the "owning" header for the C++ data extension generator - internally, standard libraries often implement APIs in private headers which are then included by the header that should define them. The generator attempts to find the "best" header for any given API name in the standard library to improve reporting.
- Reimplementing reserved name detection for C - a ReservedName.qll has been created which implements detection of reserved names and their reuse for C, based on the C Standard and the MISRA-specific rules. This is adopted by MISRA Rules 21.1 and 21.2, AUTOSAR A17-0-1 and CERT C DCL37-C.
- Reimplementing reserved name detection for C++ - ReservedName.qll is extended to support C++ reserved name detection. This has not yet been adopted by the various C++ reserved name queries - I recall that it was challenging to determine how to handle many of the edge cases due to a lack of clarity in both the language standard and the various Coding Standards.
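To make the data extension output concrete, here is a minimal Python sketch of how a generator could render a list of names as a CodeQL data extension file. It is not the generator from this PR: the pack name, the extensible predicate name `libraryFunctionName` and the three-column row layout (standard, header, name) are assumptions chosen for illustration only.

```python
import json


def render_data_extension(pack, extensible, rows):
    """Render rows as a CodeQL data extension (models-as-data) YAML document.

    `pack` and `extensible` are supplied by the caller; the values used in
    the example below are hypothetical, not the ones used in this PR.
    """
    lines = [
        "extensions:",
        "  - addsTo:",
        f"      pack: {pack}",
        f"      extensible: {extensible}",
        "    data:",
    ]
    # Sort for a stable output, so regenerating the file gives small diffs.
    for row in sorted(rows):
        # A JSON array is also a valid YAML flow sequence.
        lines.append(f"      - {json.dumps(list(row))}")
    return "\n".join(lines) + "\n"


# Hypothetical rows: (standard, header, name).
rows = [
    ("C11", "string.h", "memcpy"),
    ("C11", "stdlib.h", "abort"),
]
print(render_data_extension("example/standard-library-names", "libraryFunctionName", rows))
```

Keeping the names in data rows rather than in a generated .qll file means the list can be regenerated or extended without touching query logic.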

Existing Issue references:

Change request type

Release or process automation (GitHub workflows, internal scripts)
Internal documentation
External documentation
Query files (.ql, .qll, .qls or unit tests)
External scripts (analysis report or other code shipped as part of a release)

Rules with added or modified queries

No rules added
Queries have been added for the following rules:
Queries have been modified for the following rules:
- RULE-21-1
- RULE-21-2
- A17-0-1
- DCL37-C

Release change checklist

A change note (development_handbook.md#change-notes) is required for any pull request which modifies:

The structure or layout of the release artifacts.
The evaluation performance (memory, execution time) of an existing query.
The results of an existing query in any circumstance.

If you are only adding new rule queries, a change note is not required.

Author: Is a change note required?

Yes
No

🚨🚨🚨
Reviewer: Confirm that the format of shared queries (not the .qll file, the .ql file that imports it) is valid by running them within VS Code.

Confirmed

Reviewer: Confirm that either a change note is not required or the change note is required and has been added.

Confirmed

Query development review checklist

For PRs that add new queries or modify existing queries, the following checklist should be completed by both the author and reviewer:

Author

Have all the relevant rule package description files been checked in?
Have you verified that the metadata properties of each new query are set appropriately?
Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
Are the alert messages properly formatted and consistent with the style guide?
Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
Does the query have an appropriate level of in-query comments/documentation?
Have you considered/identified possible edge cases?
Does the query not reinvent features in the standard library?
Can the query be simplified further (not golfed!)?

Reviewer

Have all the relevant rule package description files been checked in?
Have you verified that the metadata properties of each new query are set appropriately?
Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
Are the alert messages properly formatted and consistent with the style guide?
Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
Does the query have an appropriate level of in-query comments/documentation?
Have you considered/identified possible edge cases?
Does the query not reinvent features in the standard library?
Can the query be simplified further (not golfed!)?

This repurposes the existing module generator to instead generate a mad file for the C++ standard library. It makes the following changes:
* Omits names outside the `std` namespace (as they cannot be distinguished from system headers).
* Removes the macro query, and adds member variable and type models instead.
* Moves the generator to a new generator directory.
* Updates the script to generate a mad file instead of a .qll file.

lcartey added 30 commits August 27, 2025 10:43

Add module for names from the C/C++ library

be7f810

This commit adds a module for representing names from the C/C++ standard libraries. It uses models-as-data to represent the names, and provides modules for accessing names in C99, C11 and C++14.

Add mad generator for the C Standard Library

1554d78

This adds a Python script for taking Appendix B of the C Standard Library and converting it to the StandardLibraryNames models-as-data format.

Fix output locations

bf715b1

Ensure models-as-data are output to the correct location.

Update models file

af00c9f

Exclude more internal and non `std` models.

C: Support pointer return types in library generator

ea7e3b5

The library generator did not correctly parse function prototypes with pointer return types. These are now appropriately parsed.
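As an illustration of the kind of parsing involved, the following Python sketch extracts the function name and return type from a one-line prototype, including pointer return types such as `char *strcpy(...)`. It is not the code from this commit: the regular expression and the helper name `parse_prototype` are assumptions, and prototypes that return function pointers (for example `signal`) are deliberately left unhandled.

```python
import re

# Hypothetical pattern for simple one-line prototypes.
PROTOTYPE = re.compile(
    r"""^\s*
        (?P<ret>[A-Za-z_][\w\s]*?)     # return type words, e.g. unsigned long
        (?P<ptr>[\s*]+)                # whitespace and/or stars before the name
        (?P<name>[A-Za-z_]\w*)         # function name
        \s*\((?P<params>.*)\)\s*;\s*$  # parameter list and trailing semicolon
    """,
    re.VERBOSE,
)


def parse_prototype(line):
    """Return (name, return_type) for a prototype, or None if unsupported."""
    match = PROTOTYPE.match(line)
    if match is None:
        return None
    # Normalise internal whitespace in the return type.
    return_type = " ".join(match.group("ret").split())
    # Any stars between the type and the name belong to the return type.
    stars = match.group("ptr").count("*")
    if stars:
        return_type += " " + "*" * stars
    return match.group("name"), return_type


print(parse_prototype("char *strcpy(char * restrict s1, const char * restrict s2);"))
print(parse_prototype("int abs(int j);"))
```

A generator built this way would still need a manual list for the prototypes the pattern rejects, which matches the description's note that some cases are not fully scrapable.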

Add missing update

e21f6d3

Add manual member variable models for C11 and C99.

cae5660

Appendix B of the spec doesn't include the member variables, but there are only a few of them, so we specify them by hand.

Expose libraryMemberVariableModels

189d66f

Adopt StandardLibraryNames in the C DeclaredAReservedIdentifier query

fcbe4e7

Update message, add additional tests

5f738b3

Refine detection of reserved identifiers

221c0c1

* Update message
* Require appropriate include in the non-external linkage case.
* Add extra tests

Correctly reflect tag and typedef name spaces

ca51fda

Exclude identifiers generated from library macros

122a020

Library macros are not under user control

Only include macro names when the relevant header is included

172378f

Improve performance on large databases

b91d527

Remove unnecessary filter on start locations.

Migrate STR32-C to new naming library

1fd72e2

Extract ReservedNames library.

f5210ac

Exclude NDEBUG, regenerate

92cfb2f

Exclude NDEBUG, which is not actually defined by any header, and update the generated files (including the outdated C99 file).

Migrate Rule 21.2 to the ReservedNames library

6998ed0

Unshared Rule 21.2, and implement with MISRA rules

927640b

MISRA has slightly different rules to CERT, so unshare the rule.

Replace Naming with StandardLibraryNames

a821de9

Migrate A17-1-1 to use StandardLibraryNames

7c94c8c

Migrate M17-0-2 to StandardLibraryNames.

c3977f2

C++: More accurately report declaring header

b2ea30a

Determine more accurately which header a declaration belongs to:
* Identify "closest" imported header
* Use manual mapping to disambiguate
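One way to picture the "closest header" idea is as a shortest-path search over the include graph, with a manual mapping taking precedence. The Python sketch below is an illustration under assumptions, not the logic used in this commit: the include graph, the file names, the helper names and the alphabetical tie-break are all hypothetical.

```python
from collections import deque


def include_distance(includes, start, target):
    """Return the number of #include hops from start to target, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, dist = queue.popleft()
        if current == target:
            return dist
        for included in includes.get(current, ()):
            if included not in seen:
                seen.add(included)
                queue.append((included, dist + 1))
    return None


def best_header(name, decl_file, includes, public_headers, manual=None):
    """Pick the public header that reaches decl_file in the fewest hops."""
    # A manual mapping always wins, to disambiguate known awkward names.
    if manual and name in manual:
        return manual[name]
    candidates = []
    for header in public_headers:
        dist = include_distance(includes, header, decl_file)
        if dist is not None:
            candidates.append((dist, header))
    # Ties are broken alphabetically by header name (an arbitrary choice).
    return min(candidates)[1] if candidates else None


# Hypothetical include graph: public headers pulling in a private header.
includes = {
    "vector": ["__vector/impl.h"],
    "deque": ["vector"],
}
print(best_header("vector", "__vector/impl.h", includes, ["deque", "vector"]))
```

In this example both public headers reach the private header, but `vector` does so in one hop and is therefore reported as the owning header.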

C++: Remove unnamed and specialization types

d18ada0

C++: Exclude operator bool from the reserved names

a9a4613

C++: Exclude member names unless the declaring type is visible

bbd56e0

C++: Exclude name-header mappings for specializations

384b245

Add a predicate for identifying any name in a standard

6ca9153

lcartey added 4 commits August 27, 2025 10:54

Support C++ in reserved names

cb5ddf5

Migrate A17-0-1 to use ReservedName

740c76d

M17-0-3: Migrate to StandardNaming

7a1eeb2

A17-0-1: Accept output

deb849b
