Skip to content

Conversation

d-e-s-o
Copy link
Collaborator

@d-e-s-o d-e-s-o commented Jun 12, 2025

Both BPF and vDSO symbols don't really have an easily identifiable module and so currently we don't report one. However, in some scenarios it would be helpful to have some indication of where these symbols come from.
With this change we associate the string "[bpf]" with BPF program symbols and "[vdso]" with vDSO symbols. We could alternatively introduce an enum capturing this information in a more structured way, but really it does not seem worth it.

Both BPF and vDSO symbols don't really have an easily identifiable
module and so currently we don't report one. However, in some scenarios
it would be helpful to have some indication of where these symbols come
from.
With this change we associate the string "[bpf]" with BPF program
symbols and "[vdso]" with vDSO symbols. We could alternatively introduce
an enum capturing this information in a more structured way, but really
it does not seem worth it.

Signed-off-by: Daniel Müller <[email protected]>
@d-e-s-o d-e-s-o force-pushed the topic/special-sauce-module branch from 87d9b22 to afa9934 Compare June 12, 2025 18:19
@d-e-s-o
Copy link
Collaborator Author

d-e-s-o commented Jun 12, 2025

I am still somewhat torn as to whether we should make these special modules more of a first class thing by introducing an enum ala

enum Module {
  Name(Option<Cow<OsStr>>),
  Bpf,
  Vdso,
}

instead.

@d-e-s-o d-e-s-o marked this pull request as draft June 12, 2025 18:21
@anakryiko
Copy link
Member

for kernel module, this would be yet another case for that enum Module, right? So maybe it does make sense to make this a first class citizen?

@d-e-s-o
Copy link
Collaborator Author

d-e-s-o commented Jun 12, 2025

for kernel module, this would be yet another case for that enum Module, right? So maybe it does make sense to make this a first class citizen?

I think for kernel modules we should be able to either report the path to the module on the file system (if it can be discovered) or just its name (as included in kallsyms) -- both of which could be captured by the proposed Name variant. We could theoretically have a Module(String) variant as well, yeah, but given that we separate between user space symbolization and kernel space symbolization already at the input, I am not sure whether there is much value-add to be had by having a separate variant.

@d-e-s-o
Copy link
Collaborator Author

d-e-s-o commented Jun 12, 2025

But...if we had an enum we could make it possible to more easily differentiate between paths and other "names", I suppose. In the end it may be a question of how users, in the majority of cases, would use this module attribute. If it's just to garnish the output then there is probably little value by having an enum, but if there is a strong reason to handle different variants differently, then I suppose we can go the enum route (I likely wouldn't introduce an enum for the C API, though, I think it may be too unwieldy).

@anakryiko
Copy link
Member

For modules ideally we can have module name (a short string which we can always get from kallsyms) and separately optional path to the module file itself, which we can't guarantee that we'll be able to always find. So that's why enum seems necessary, as you can have multiple fields per each class. So with one enum we are encoding three pieces: the fact it's a module, what its identifier is, and, if we can get it, what is the path to underlying ELF file.

And for BPF, it's similar. We need a fact that it's a BPF program (that's enum variant itself), but also we can provide containing BPF program information (name and/or ID, maybe hash, not sure), in addition to actual BPF subprogram to which the address belongs. (well, at least having an ability to extend API to provide this information seems prudent)

So overall, depending on what kind of "module" (basically, container of code) we work with, there could be various interesting pieces of information that would be useful to provide (and it feels like this would be useful beyond just pretty-printing). So one string might be too limiting.

At least on Rust side this looks appropriate. I agree that C API is trickier, but we can think about this separately.

@d-e-s-o
Copy link
Collaborator Author

d-e-s-o commented Jul 3, 2025

We discussed this offline a while back and the conclusion we came to was that an enum is fine, but we want to keep variants at a single member and not overload them too much. Everybody will be penalized if we put too much data into one of the variants.

Copy link

github-actions bot commented Aug 3, 2025

This pull request is considered stale because it has been open 30 days with no activity. Remove stale label or comment or it will be closed in 5 days.

@github-actions github-actions bot added the Stale label Aug 3, 2025
@danielocfb danielocfb removed the Stale label Aug 5, 2025
@d-e-s-o d-e-s-o mentioned this pull request Sep 2, 2025
6 tasks
Copy link

github-actions bot commented Sep 5, 2025

This pull request is considered stale because it has been open 30 days with no activity. Remove stale label or comment or it will be closed in 5 days.

@github-actions github-actions bot added the Stale label Sep 5, 2025
@d-e-s-o d-e-s-o removed the Stale label Sep 6, 2025
@d-e-s-o
Copy link
Collaborator Author

d-e-s-o commented Sep 16, 2025

I think we should just move ahead with what we have here. After looking into it more, I find the enum approach more and more questionable. Aside from the size increase, which we could probably brush off, it's not really clear what dimension to enumerize. I already alluded to the need to differentiate between path and name, but the "component" dimension as mentioned earlier is also a possibility (BPF, vDSO, kmod, ...).

I tried implementing the former, but it gets messy quickly and doesn't seem worth it. For the latter, we are now suddenly introducing a bunch of "unreachable" variants into the output. E.g., the input already differentiates between user space and kernel space, but we report all variants irrespectively. Also not a huge deal, but it adds up.

The other factor is that this really only applies to process/kernel symbolization, not symbolization from other "file sources". So that gets us back to having a different abstraction for symbolizing "container sources" of sorts to begin with. Perhaps we could have an "extension" member tagged onto the Sym type eventually. But that is out of scope for v0.2. Irrespectively, I am not comfortable with any of the enum based approaches given the various trade-offs. Let's roll with what we have to get closure here, as this change is already an improvement.

@d-e-s-o d-e-s-o marked this pull request as ready for review September 16, 2025 17:30
@d-e-s-o d-e-s-o force-pushed the topic/special-sauce-module branch 2 times, most recently from 87cf87c to 01c1087 Compare September 16, 2025 17:37
Copy link

codecov bot commented Sep 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.67%. Comparing base (57cc3a5) to head (01c1087).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1183      +/-   ##
==========================================
- Coverage   95.68%   95.67%   -0.01%     
==========================================
  Files          60       60              
  Lines       10782    10781       -1     
==========================================
- Hits        10317    10315       -2     
- Misses        465      466       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@d-e-s-o d-e-s-o merged commit 7a73def into libbpf:main Sep 16, 2025
44 checks passed
@d-e-s-o d-e-s-o deleted the topic/special-sauce-module branch September 16, 2025 17:49
@d-e-s-o d-e-s-o mentioned this pull request Sep 23, 2025
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants