Add podio:input_collections parameter with regex support to filter collections from input files #2026
Co-authored-by: wdconinc <[email protected]>
`podio:input_collections` to the JEventSourcePODIO factory in src/services/io/podio that is modeled on the behavior of `podio:output_collections` in the JEventProcessorPODIO factory, and which adds only those enabled collections to the ...
for more information, see https://pre-commit.ci
…llections from input files (fix: iwyu) (#2029) This PR applies the include-what-you-use fixes as suggested by https://github.com/eic/EICrecon/actions/runs/17135243469. Please merge this PR into the branch `copilot/fix-dfbd7245-eb25-4079-86b3-071bd236aed8` to resolve failures in PR #2026. Auto-generated by [create-pull-request][1] [1]: https://github.com/peter-evans/create-pull-request Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@copilot I merged the include-what-you-use fixes.
@copilot Please add to the eicrecon-input-collections-test check a command (e.g. |
…s-test Co-authored-by: wdconinc <[email protected]>
Added |
Run over the entire input file in eicrecon-input-collections-test.
Don't limit output collections.
@copilot Let's modify the eicrecon-input-collections-test to do the following instead (and keep it as a new job, with the same name):
…rn with RawHits filtering Co-authored-by: wdconinc <[email protected]>
Restructured the eicrecon-input-collections-test to follow the two-stage pattern as requested in commit 59dd624:
This better simulates real data processing pipelines where only detector hits are available as input, similar to the eicrecon-two-stage-running workflow.
This PR implements the `podio:input_collections` parameter for `JEventSourcePODIO`, modeled after the existing `podio:output_collections` parameter in `JEventProcessorPODIO`. This feature allows filtering which collections are loaded from PODIO input files using both exact collection names and regex patterns.

Problem
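Previously, `JEventSourcePODIO` would always load all available collections from input files. This was problematic when, for example, MC truth collections (e.g. `MCParticles`) are not available in the input, as in real data processing.

Solution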
Added a new `podio:input_collections` parameter that accepts exact collection names as well as regex patterns such as `EcalBarrel.*` to match multiple collections efficiently.

Usage Examples

Implementation Details
`ResolveInputCollections()` method that converts regex patterns to actual collection names
`std::call_once` for first-event processing

Testing
Updated the `eicrecon-input-collections-test` job in GitHub Actions to test a two-stage processing pipeline, including `podio-dump --category events` to list the final output collections.

This testing approach better simulates real data processing workflows where only detector hits are available as input, demonstrating the practical value of the input filtering feature for reconstruction-only pipelines.
This addresses the need for real data processing workflows where MC truth information is not available, while providing a user-friendly regex interface for specifying collection patterns instead of exhaustive lists.
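
To make the pattern resolution under Implementation Details concrete, here is a minimal C++ sketch of the idea. It is not the code from this PR: `ResolvePatterns` and `FilteredSource` are hypothetical names, the collection names in the usage note are illustrative, and it assumes each pattern must match a whole collection name (`std::regex_match`), which the PR may or may not do. It only illustrates expanding patterns against the names present in a file and caching the result once via `std::call_once`.

```cpp
#include <mutex>
#include <regex>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not the PR's ResolveInputCollections()): expand
// user-supplied patterns into the collection names that are available.
// Assumption: a pattern has to match the entire collection name.
std::set<std::string> ResolvePatterns(const std::vector<std::string>& patterns,
                                      const std::vector<std::string>& available) {
  std::set<std::string> resolved;
  for (const auto& pattern : patterns) {
    const std::regex re(pattern);
    for (const auto& name : available) {
      if (std::regex_match(name, re)) {
        resolved.insert(name);
      }
    }
  }
  return resolved;
}

// Hypothetical stand-in for an event source that resolves its patterns
// exactly once, the first time the available collections are known.
class FilteredSource {
public:
  explicit FilteredSource(std::vector<std::string> patterns)
      : m_patterns(std::move(patterns)) {}

  // The first call resolves the patterns; later calls return the cached set
  // and ignore their argument.
  const std::set<std::string>& EnabledCollections(
      const std::vector<std::string>& available) {
    std::call_once(m_once, [&] {
      m_enabled = ResolvePatterns(m_patterns, available);
    });
    return m_enabled;
  }

private:
  std::vector<std::string> m_patterns;
  std::set<std::string> m_enabled;
  std::once_flag m_once;
};
```

With illustrative inputs such as `MCParticles`, `EcalBarrelScFiRawHits`, `EcalBarrelImagingRawHits` and `HcalBarrelRawHits`, the pattern `EcalBarrel.*` would resolve to the two `EcalBarrel` collections only. Resolving once keeps the per-event cost at a set lookup, on the assumption that the list of collections does not change between events of the same file.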