Add SimCalorimeterPulse data type for storing simulated calorimeter pulses pre-digitization #106

Open: sly2j wants to merge 3 commits into main

Conversation

sly2j
Contributor

@sly2j sly2j commented Apr 1, 2025

Briefly, what does this PR introduce?

This PR adds a new data type, edm4eic::SimCalorimeterPulse, to represent simulated calorimeter pulses before digitization.

The new structure is modeled consistently with edm4hep::SimCalorimeterHit and aligns with existing waveform structures such as edm4hep::RawTimeSeries (generic digitization output) and edm4hep::TimeSeries (generic measured time series).

SimCalorimeterPulse defines three one-to-many relations:

  • SimCalorimeterHit: Pulses can be constructed from contributions of one or more simulated calorimeter hits.
  • SimCalorimeterPulse: Pulses can be constructed out of other pulses.
  • MCParticle: Each pulse can be attributed to a single MCParticle (primary pulse), multiple MCParticles (overlaid pulses), or zero MCParticles (noise pulse).

Storing calorimeter pulses at the simulation level ensures conceptual clarity and consistency within our data model. The simulated pulse data to be stored in this structure have no equivalent in real detector output, and are therefore not digitization output.

By using a Simulation structure, we gain direct Relation support with other simulation entities, enabling efficient backward navigation through the simulation chain. While Associations are used to link digitization and reconstruction structures to simulation, Relations are preferred for internal connections within the same domain (e.g. within Simulation or within Reconstruction). This design avoids introducing redundant Association types and keeps the data model consistent, clean and efficient.
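To illustrate the backward navigation these relations allow, here is a minimal sketch in plain Python. The dataclasses are hypothetical stand-ins, not the podio-generated classes, and the names `collect_hits`, `particles`, `pdg` and `cell_id` are assumptions made only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MCParticle:
    """Stand-in for edm4hep::MCParticle (only a PDG code, for illustration)."""
    pdg: int


@dataclass
class SimCalorimeterHit:
    """Stand-in for edm4hep::SimCalorimeterHit."""
    cell_id: int
    energy: float  # GeV


@dataclass
class SimCalorimeterPulse:
    """Stand-in holding only the three one-to-many relations of the proposal."""
    hits: List[SimCalorimeterHit] = field(default_factory=list)
    pulses: List["SimCalorimeterPulse"] = field(default_factory=list)
    particles: List[MCParticle] = field(default_factory=list)


def collect_hits(pulse: SimCalorimeterPulse) -> List[SimCalorimeterHit]:
    """Follow the pulse -> pulse relations and gather every contributing hit."""
    found = list(pulse.hits)
    for parent in pulse.pulses:
        found.extend(collect_hits(parent))
    return found


electron = MCParticle(pdg=11)
primary = SimCalorimeterPulse(
    hits=[SimCalorimeterHit(cell_id=42, energy=0.3)], particles=[electron]
)
noise = SimCalorimeterPulse()  # zero MCParticles: a noise pulse
merged = SimCalorimeterPulse(pulses=[primary, noise], particles=[electron])

print([hit.cell_id for hit in collect_hits(merged)])  # prints [42]
```

The merged pulse reaches its hit only through the pulse-to-pulse relation, with no separate Association collection involved.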

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue #__)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No

Does this PR change default behavior?

No

@sly2j sly2j requested a review from a team as a code owner April 1, 2025 18:25

@ruse-traveler ruse-traveler left a comment

Found a typo!

- float amplitude // Pulse amplitude in [GeV], sum of amplitude values equals total energy
OneToManyRelations:
- edm4hep::SimCalorimeterHit hits // SimCalorimeterHits used to create this pulse
- edm4hep::SimCalorimeterPulse pulses // SimCalorimeterPulses used to create this pulse

Suggested change
- edm4hep::SimCalorimeterPulse pulses // SimCalorimeterPulses used to create this pulse
- edm4eic::SimCalorimeterPulse pulses // SimCalorimeterPulses used to create this pulse

Members:
- uint64_t cellID // ID of the readout cell for this pulse.
- float energy // Total energy for this pulse in [GeV].
- float position // Position of the hit in world coordinates [mm].
Member

(comment from @simonge and @ruse-traveler)
Should be a Vector3f.

@@ -244,6 +244,22 @@ datatypes:
## ==========================================================================
## Calorimetry
## ==========================================================================
edm4eic::SimCalorimeterPulse:
Member

We would really want to have generic structures to be able to consistently handle basic operations (adding noise, merging pulses, etc) for different detectors, not just calorimeters.

Would it be acceptable to make a more generic-sounding name, even if we keep the edm4hep::SimCalorimeterHit relation?
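As a rough illustration of one such detector-agnostic operation, the sketch below merges two pulses that carry only the time-series members quoted in this PR (`time`, `interval`, `amplitude`). The `Pulse` class and `merge_pulses` function are hypothetical, and the sketch assumes both pulses share the same sampling interval.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pulse:
    """Hypothetical stand-in with the time-series members quoted in this PR."""
    time: float             # start time [ns]
    interval: float         # spacing between amplitude samples [ns]
    amplitude: List[float]  # one value per sample


def merge_pulses(a: Pulse, b: Pulse) -> Pulse:
    """Sum two pulses sample by sample on a shared time grid.

    Assumes both pulses use the same interval; start-time offsets are
    rounded to the nearest whole sample.
    """
    if a.interval != b.interval:
        raise ValueError("this sketch only merges pulses with equal intervals")
    start = min(a.time, b.time)
    offsets = [round((p.time - start) / a.interval) for p in (a, b)]
    length = max(off + len(p.amplitude) for off, p in zip(offsets, (a, b)))
    summed = [0.0] * length
    for off, p in zip(offsets, (a, b)):
        for i, value in enumerate(p.amplitude):
            summed[off + i] += value
    return Pulse(time=start, interval=a.interval, amplitude=summed)


first = Pulse(time=0.0, interval=2.0, amplitude=[1.0, 2.0, 1.0])
second = Pulse(time=4.0, interval=2.0, amplitude=[0.5, 0.5])
merged = merge_pulses(first, second)
print(merged.time, merged.amplitude)  # prints 0.0 [1.0, 2.0, 1.5, 0.5]
```

Nothing in the function depends on the pulse coming from a calorimeter, which is the point of the request for a more generic name.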

OneToManyRelations:
- edm4hep::SimCalorimeterHit hits // SimCalorimeterHits used to create this pulse
- edm4hep::SimCalorimeterPulse pulses // SimCalorimeterPulses used to create this pulse
- edm4hep::MCParticle MCParticle // MCParticle that caused the pulse
Member

Suggested change
- edm4hep::MCParticle MCParticle // MCParticle that caused the pulse
- edm4hep::MCParticle particle // MCParticle that caused the pulse

Contributor

See also a similar change pushed through in edm4hep for technical reasons outlined there; key4hep/EDM4hep@cb738b0#diff-41291f568217a9755476ae21a12825127ae5a00179b19e867cffd64e3ceea36f

- float time // Start time for the pulse in [ns].
- float interval // Time interval between amplitude values [ns].
VectorMembers:
- float amplitude // Pulse amplitude in [GeV], sum of amplitude values equals total energy

Comment from @veprbl during the meeting: should the units here be GeV? Or should these be in units of voltage?

If we choose to generalize the structure, then we should revisit this. Maybe it could be reformulated to be more generic; perhaps similar to the edm4hep::TimeSeries...

Contributor

But this is a truth type, not a digitized type. There is no concept of voltage until you have a digitization model. You could argue that energy is not the right quantity for a photon pulse traveling down a fiber either, but it probably should be closer to energy than to voltage.

Member

Measured energy deposition is not "truth", it's just another detail of detector technology. Also, there is not much practical use for a truth-only type. You will want to process those pulses, and the type of amplitude scale will change along the way; the units won't stay as "GeV".

Contributor

It's truth in the sense that it is exactly what Geant4 gives. It is like the energy of calo hit contributions. Yes, you can change the size of the volume those go in, but it is the exact energy contribution for the tracks in that volume, without distorting effects such as Poisson fluctuations, Gaussian measurement uncertainty, or bit-depth quantization. It is so "true" that in many cases it is not even representative, because it is just one sample out of many possibilities.

Comment on lines +252 to +253
- float energy // Total energy for this pulse in [GeV].
- float position // Position of the hit in world coordinates [mm].

Comment from @simonge during meeting: the energy and position can technically be obtained from the amplitude vector and cell ID respectively.

Member

... or from the related hit.
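A small sketch of the redundancy being pointed out, assuming (as the member comment states) that the amplitude values sum to the total energy and that sample i sits at `time + i * interval`. The function names and sample values are made up for illustration. Recovering the position from the cell ID would need the detector geometry and is not shown.

```python
from typing import List


def pulse_energy(amplitude: List[float]) -> float:
    """Total energy [GeV], assuming the amplitude values sum to it."""
    return sum(amplitude)


def sample_times(time: float, interval: float, n_samples: int) -> List[float]:
    """Time [ns] of each amplitude sample: start time plus i * interval."""
    return [time + i * interval for i in range(n_samples)]


amplitude = [0.25, 0.5, 0.25]  # GeV per sample (made-up values)
print(pulse_energy(amplitude))  # prints 1.0
print(sample_times(time=10.0, interval=2.0, n_samples=len(amplitude)))
# prints [10.0, 12.0, 14.0]
```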

Contributor

We are going to run into the same discussions as for tracker hits here:

  • someone will want both local and global positions
  • someone will want extent since multiple cell IDs are combined
  • someone will have a volume that is not aligned to any axes where extent makes any sense...

No suggestion.

6 participants