Limit compilation units cache size #626

mohamedelkony · 2025-09-07T13:15:24Z

Issue:

Iterated CUs (along with each CU's DIE cache) are kept in the cache for as long as the DWARF object is alive. While the cache can be useful for hitting DIEs shared between CUs and for other use cases, it also means that a simple loop over CUs and their DIEs keeps all DIEs of the ELF file in memory until the DWARF object is no longer referenced.

For a big ELF file (ours is ~100 MB), all DIEs of all CUs stay in memory even though we only need to parse the DIEs of the current CU. Once a CU has been processed we never use its DIEs again (except for the interleaved DIE case), so keeping them effectively acts as a memory leak.

This leads to excessive memory usage that reaches 7 GB. We parse 5 files in parallel, which leads to 35 GB of memory usage.

Fix:

Add the possibility to set a max cache size for cached CUs; the cache is purged in FIFO order on reaching the max size.
The default behavior is the same as before: the cache size is unlimited.
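To illustrate the idea, here is a minimal sketch of a FIFO-bounded cache. It is not the code in this patch: the class name `BoundedFIFOCache` and its methods are made up for the example, and only the "-1 means unlimited" convention is taken from the docstring in the diff.

```python
from collections import OrderedDict


class BoundedFIFOCache:
    """Hypothetical FIFO cache for illustration; not the pyelftools code."""

    def __init__(self, max_size=-1):
        # -1 means "unlimited", mirroring the convention in the patch docstring.
        self.max_size = max_size
        self._entries = OrderedDict()

    def add(self, key, value):
        if self.max_size == 0:
            return  # a limit of zero behaves like a no-cache mode
        if key in self._entries:
            # Re-assigning an existing key keeps its original FIFO position.
            self._entries[key] = value
            return
        self._entries[key] = value
        # Evict the oldest insertions until we are back under the limit.
        while self.max_size > 0 and len(self._entries) > self.max_size:
            self._entries.popitem(last=False)

    def get(self, key):
        return self._entries.get(key)

    def __len__(self):
        return len(self._entries)


# Cache three "CUs" keyed by offset with room for only two.
cache = BoundedFIFOCache(max_size=2)
for cu_offset in (0x0, 0x40, 0x80):
    cache.add(cu_offset, "CU@%#x" % cu_offset)
```

After the loop, the entry for offset `0x0` has been evicted and the two most recently inserted CUs remain.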

Results:

Reduced memory usage by 90%

Memory usage for our 100 MB ELF file:

Before: 4 GB

After: 350 MB

Used test case:

mohamedelkony · 2025-09-07T20:14:15Z

Hi @eliben, could you please have a look? Thanks!

eliben · 2025-09-08T13:12:11Z

elftools/dwarf/dwarfinfo.py

                keyword syntax.
+            
+            max_cu_cache_size:
+                Enforces a limit on CU cache size, For unlimitted  cache size set to -1


The size limit is on the number of entries/CUs? The comment should say this in more detail.

eliben · 2025-09-08T13:12:28Z

@sevaa WDYT?

sevaa · 2025-09-08T17:55:33Z

Only helps with the firehose parsing scenario with no storing of results, and only if you have many small CUs as opposed to several large ones. Also, keeping the cache alive (and the memory usage up) by accident is really easy; holding a reference to as little as one DIE in an evicted CU keeps the whole CU (with caches) in place.
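A toy sketch of that last point, using stand-in classes rather than pyelftools objects. `ToyCU` and `ToyDIE` are hypothetical; the only assumption carried over is that a DIE holds a reference back to its CU and the CU holds its DIE cache.

```python
import gc
import weakref


class ToyCU:
    """Hypothetical stand-in for a CU that owns a DIE cache."""

    def __init__(self):
        self.dies = []


class ToyDIE:
    """Hypothetical stand-in for a DIE that points back at its CU."""

    def __init__(self, cu):
        self.cu = cu
        cu.dies.append(self)


cu_cache = {}
cu = ToyCU()
cu_cache[0] = cu
kept_die = ToyDIE(cu)      # the one DIE the caller holds on to
for _ in range(3):
    ToyDIE(cu)             # other DIEs, reachable only through the CU

cu_alive = weakref.ref(cu)
del cu
cu_cache.clear()           # "evict" the CU from the cache
gc.collect()
# The held DIE still references the CU, which references every other DIE.
alive_while_die_held = cu_alive() is not None

del kept_die
gc.collect()               # the CU <-> DIE cycle is freed by the collector
alive_after_release = cu_alive() is not None
```

In this sketch the CU survives eviction for as long as `kept_die` exists, and is only reclaimed once that last reference is dropped.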

This patch may alleviate some OOM-on-large-binary scenarios that users are complaining about, but I can see OOM scenarios it won't address.

By the time the OOM condition gets to us, it's usually boiled down to a minimal example, but we don't get to see the users' real life code that crashes.

eliben · 2025-09-08T18:17:58Z

Thanks, @sevaa

@mohamedelkony in addition to the other comment, could you add a test that exercises this functionality?

sevaa · 2025-09-08T18:25:33Z

Also, I don't like adding extra parameters to the DWARFInfo constructor :) It's effectively a public API on the DWARF-not-in-ELF side of things - gratuitous breaking change and all that. As we (and the DWARF committee, and the compiler vendors) add more sections into the mix, an optional final parameter will break the constructor invocations in a way that's no fun to debug. A section object will end up taking the place of the cache limit, and they will get a bogus error of "cannot compare a section descriptor to a number" way downstream... Not cool.
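A toy sketch of the failure mode, with made-up classes. `ToySection`, `PositionalLimit`, `KeywordOnlyLimit` and the `.debug_extra` section name are all hypothetical and do not reflect the real DWARFInfo signature. The keyword-only variant is just one possible mitigation, shown for contrast.

```python
class ToySection:
    """Hypothetical stand-in for a debug section descriptor."""

    def __init__(self, name):
        self.name = name


class PositionalLimit:
    """Sections first, then an optional trailing cache limit."""

    def __init__(self, debug_info_sec, debug_str_sec, max_cu_cache_size=-1):
        self.max_cu_cache_size = max_cu_cache_size
        self._cu_cache = []

    def cache_cu(self, cu):
        self._cu_cache.append(cu)
        # The comparison fails here, far from the constructor call,
        # if the "limit" is actually a section object.
        if 0 <= self.max_cu_cache_size < len(self._cu_cache):
            self._cu_cache.pop(0)


class KeywordOnlyLimit(PositionalLimit):
    """Same thing, but the limit can only be passed by keyword."""

    def __init__(self, debug_info_sec, debug_str_sec, *, max_cu_cache_size=-1):
        super().__init__(debug_info_sec, debug_str_sec, max_cu_cache_size)


# A caller that passes three sections positionally.
sections = [ToySection(n) for n in (".debug_info", ".debug_str", ".debug_extra")]

info = PositionalLimit(*sections)   # accepted silently
try:
    info.cache_cu("cu0")
    late_error = None
except TypeError as exc:
    late_error = exc                # surfaces only once a CU is cached

try:
    KeywordOnlyLimit(*sections)
    early_error = None
except TypeError as exc:
    early_error = exc               # rejected at the call site
```

With the positional signature the third section silently lands in `max_cu_cache_size` and the error appears later; with the keyword-only signature the mismatch is reported at construction.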

Also, my preferred approach would be a plain no-cache mode rather than a limit on cache size. A limit needs to be pretty much hand-tuned to a specific binary or corpus - note how in the OP's own example, the cache limit is zero, effectively a no-cache mode anyway. The CU cache is a relatively recent addition - making its maintenance optional won't be too much of a lift IMHO. The DIE cache in the CU would be slightly trickier to tackle, and finally, DIEs have a way of storing references to one another (_parent and _terminator) that also ends up eating up memory if DIE references are long-lived.

Limit_CU_cache_size

66d346d

mohamedelkony marked this pull request as ready for review September 7, 2025 20:13

eliben reviewed Sep 8, 2025
