Skip to content

Conversation

mhkim-anl
Copy link
Contributor

Briefly, what does this PR introduce?

This PR is for pulse generation from time-clustered SimCalorimeterHits. To generate a pulse from the contributions that are close in time, a timeID is assigned in the SimCalorimeterHitProcessor. Since both SimCalorimeterHit and SimTrackerHit share a common pulse generation mechanism, a shared template algorithm, PulseGeneration, has been implemented.

What kind of change does this PR introduce?

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

Does this PR change default behavior?

@mhkim-anl mhkim-anl self-assigned this Aug 7, 2025
mhkim-anl and others added 6 commits August 14, 2025 18:24
Associations are 3rd party objects that relate to two other objects that
they connect.
Relations are just fields pointing to another objects. This method works
with relations and not associations.
wdconinc and others added 2 commits August 21, 2025 00:04
…d template (fix: iwyu) (#2025)

This PR applies the include-what-you-use fixes as suggested by
https://github.com/eic/EICrecon/actions/runs/17116586226.
Please merge this PR into the branch
`2001-pulse-generation-from-time-clustered-simcalorimeterhits-using`
to resolve failures in PR #2004.

Auto-generated by [create-pull-request][1]

[1]: https://github.com/peter-evans/create-pull-request

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@veprbl veprbl force-pushed the 2001-pulse-generation-from-time-clustered-simcalorimeterhits-using branch from 4698884 to 05fafa1 Compare August 22, 2025 03:10
@veprbl
Copy link
Member

veprbl commented Aug 22, 2025

Looks like this increases number of hit contributions and hits:
image
image

…d template (fix: iwyu) (#2030)

This PR applies the include-what-you-use fixes as suggested by
https://github.com/eic/EICrecon/actions/runs/17146753827.
Please merge this PR into the branch
`2001-pulse-generation-from-time-clustered-simcalorimeterhits-using`
to resolve failures in PR #2004.

Auto-generated by [create-pull-request][1]

[1]: https://github.com/peter-evans/create-pull-request

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@mhkim-anl
Copy link
Contributor Author

@veprbl
Thank you for checking how this PR effects to the number of BIC hits and contributions, and also the output file size. I'm sorry that I checked whether the algorithms were working properly, but didn't check how much the output file size increased.

I checked in more detail referring to your comments and commits. First, the increased number of hits and contributions are due to the timeID because it defines additional hits depending on their times.

It seems that setting the ignore_thres too low increased the output file size too much. I confirmed that many pulses reached the maximum number of time bins, which is 10000. I updated the ignore_thres to be 1/10 of the average pulse height for 1 GeV electrons generated at eta = 0.

Could you check if the file size from this change is acceptable? For now, I added EcalBarrelScFiPPulses and EcalBarrelScFiNPulses again for a possible test at Github. For reference, the output file size of my small pythia sample decreased as follows.

No BIC pulses: 27M
BIC pulses with ignore_thres of 1.0e-10: 2.6G
BIC pulses with ignore_thres of 1.0e-6: 104M

@wdconinc
Copy link
Contributor

No BIC pulses: 27M\nBIC pulses with ignore_thres of 1.0e-10: 2.6G\nBIC pulses with ignore_thres of 1.0e-6: 104M

In simulation production campaigns we cannot afford a factor 4 increase in data volume for a single detector (or even in general).

I'd suggest removing this data collection from the default output, enable it in a CI test job to exercise the algorithm (ASAN, TSAN), and at some point we have a downstream algorithm that condenses the data volume again.

@veprbl
Copy link
Member

veprbl commented Aug 23, 2025

Since we now running unit tests on PulseGenerator, we can just disable EcalBarrelScFi{P,N}Pulses in default running. Like @wdconinc says, the downstream factories will be able to produce those as intermediate collections without having them saved. And you can always enable storing those pulses in your local running for private analysis needs.

@mhkim-anl
Copy link
Contributor Author

@wdconinc @veprbl
Thank you for your answers and suggestions. I understood the factor 4 increase is too much to save the EcalBarrelScFi{P, N}Pulses. I removed them from the output.

Then, regarding the next step, I would like to discuss it including the downstream algorithm at the coming simulation meeting. I would appreciate it if you could join the meeting on Aug 26.

By the way, could you let me know up to what factor of data size increase would be acceptable? I think there may still be a possible way to reduce the output size by optimizing both timestep and ignore_thres.

@wdconinc
Copy link
Contributor

By the way, could you let me know up to what factor of data size increase would be acceptable? I think there may still be a possible way to reduce the output size by optimizing both timestep and ignore_thres.

As a rule of thumb, we don't want a single detector subsystem to take more than 20% of resources (storage, computing).

veprbl
veprbl previously approved these changes Aug 25, 2025
Copy link
Member

@veprbl veprbl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in good shape now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pulse generation from time-clustered SimCalorimeterHits using a shared template
3 participants