btf: read all line info records at once in parseLineInfoRecords
#1657
base: main
Conversation
Force-pushed 9f82efc to fb809f7
Cool! Drive-by: whenever you optimize old code that doesn't change frequently, make sure to bolt it down with a test.
Signed-off-by: Paul Cacheux <[email protected]>
Force-pushed fb809f7 to d5f37dd
I added a small test for the allocations. I'm a bit sad about it because it checks against a magic number of 3.
Force-pushed d5f37dd to 4b76be2
Yea, I would have to agree with that, it seems fragile. I imagine it would be frustrating if we have to start updating magic values in our tests whenever we bump Go / stdlib versions. I don't know if a <10 check hits the mark either: if we go to 11, would that be cause for us to reconsider something? On the other hand, I understand the desire to curb allocations and memory use caused by the lib; I'm just not sure this is the best way to do so. Case in point, the failing test: on the latest Go version it was 7 allocs: https://github.com/cilium/ebpf/actions/runs/12991527462/job/36229383952?pr=1657
Discussed this offline with Dylan. I agree with all the arguments raised, but I think pinning it down to 7 allocs is the best we can do in the absence of a continuous benchmarking setup or a solution to track allocations of certain spans of code over time. These alternatives are (much) more expensive to run and not maintenance-free by any means.

Also, since this is a wire format defined in a (sort of) spec that doesn't change on a whim, my tendency would be to write a custom marshaler for these things, but I've had trouble convincing Lorenz of this in the past. If we really need this to not allocate, it's still the way to go IMO.
Ok, I can change the test to
`binary.Read` is able to directly read its result into a slice, and this is far more efficient from a CPU and memory allocation standpoint:

```
                        │   old.txt   │              new.txt               │
                        │   sec/op    │   sec/op     vs base               │
ParseLineInfoRecords-10   288.6µ ± 3%   141.3µ ± 4%  -51.02% (p=0.000 n=10)

                        │   old.txt    │              new.txt               │
                        │     B/op     │     B/op      vs base              │
ParseLineInfoRecords-10   128.0Ki ± 0%   128.0Ki ± 0%  ~ (p=1.000 n=10)

                        │    old.txt    │             new.txt               │
                        │   allocs/op   │  allocs/op   vs base              │
ParseLineInfoRecords-10   4098.000 ± 0%   3.000 ± 0%  -99.93% (p=0.000 n=10)
```

Signed-off-by: Paul Cacheux <[email protected]>
Force-pushed 4b76be2 to d01fe7c
When parsing line info records, we currently iterate and call `binary.Read` once per line info record. This is quite inefficient, and `binary.Read` already supports, out of the box, being called directly on a slice, reading multiple structures at once. This is exactly what this PR does.

Benchmark: see the benchstat output in the commit message above.