Speeds up module.add_debug_info and add_metadata by 2x #887
bslatkin wants to merge 11 commits into numba:main from
Conversation
Looks like I need to run flake8 and address some style issues. Will do that shortly.
Thanks for the PR! I'm curious about
Is there any public info about your compiler?
It's not public yet! Hopefully early next year I'll publish it for people to see. There are a couple of sneak peeks here: https://twitter.com/haxor/status/1580959518984241158

I will also share the private project with you, Graham, so you can take a look for context. The part that this PR specifically improves is how I assign

Thank you for looking and the attention on this PR!
Is there anything else you need from me on this PR? Just making sure. Thank you for the reviews!
Thanks for checking in - it's taking me a little while to review, as I'm unfamiliar with most of the parts changed in this PR - apologies for the delay. I have some review notes drafted, but I'd like to finish the review before I post them.
gmarkall left a comment
Many thanks for the PR, and apologies for the delay in reviewing.
The performance measurement reported by test_debug_info_caching is more than doubled (execution time is less than half) when using this patch on my machine, but I can see how it has even more performance benefit for real-world use cases (as you report in the description). I think this is a good change overall.
I have a few questions / comments, which are mainly on the diff.
In addition, I think the debug metadata performance measurement is valuable to have because it'll enable us to work out where a performance regression occurred (if we notice one) but I don't think we can keep it in the unit test suite itself - generally the test suite should not output anything, and the test more than doubles the testsuite execution time on my system. Could you please:
- Copy the code from `test_debug_info_caching` into a standalone example in the `examples` directory, complete with timing and output?
- Remove the timing and printing from `test_debug_info_metadata`, and have the test code only execute a single time (as opposed to 10) so that the test takes less time? I think it should be possible to simplify the test and make it much more "straight-line code", since separate functions won't be needed for `timeit` in the context of the test.
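For context, here is a minimal sketch of the two shapes being discussed. The function name and numbers are hypothetical stand-ins, not the actual test code; the point is only the split between a timed example script and a run-once unit test:

```python
import timeit

def build_debug_info():
    # Hypothetical stand-in for the metadata-heavy module construction
    # performed by the real test.
    return sum(range(1000))

# Example-script shape: timed repetitions with printed output, which
# belongs in a standalone script under examples/.
total_time = timeit.timeit(build_debug_info, number=10)
print('building debug info took', total_time, 'seconds')

# Unit-test shape: run once, assert on the result, print nothing.
result = build_debug_info()
assert result == 499500
```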
I hope the comments are clear and helpful - please let me know if any clarification or additional information would be helpful.
```diff
     def __init__(self):
         self._useset = set([''])
-        self._basenamemap = defaultdict(int)
+        self._basenamemap = {}
```
Is a defaultdict slower than a normal dict? It seems that making this change here requires you to then emulate a default dict below - is there a reason not to keep it as a defaultdict?
(Follow-up - I tried reverting the changes in this file and it seemed to make no performance difference in test_debug_info_caching)
```python
        try:
            ident = self._basenamemap[basename]
        except KeyError:
            ident = 0
```
This is the emulation of a defaultdict I mentioned in the comment above - could it just be

```diff
-        try:
-            ident = self._basenamemap[basename]
-        except KeyError:
-            ident = 0
+        ident = self._basenamemap[basename]
```

with _basenamemap remaining a defaultdict?
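To make the trade-off concrete, here is a small sketch (not llvmlite's actual code; function names are invented) showing that the plain-dict try/except and the defaultdict lookup hand out the same per-basename sequence numbers:

```python
from collections import defaultdict

def next_ident_plain(basenamemap, basename):
    # Plain dict: emulate defaultdict(int) by treating a miss as 0.
    try:
        ident = basenamemap[basename]
    except KeyError:
        ident = 0
    basenamemap[basename] = ident + 1
    return ident

def next_ident_default(basenamemap, basename):
    # defaultdict(int): a missing key silently materializes as 0.
    ident = basenamemap[basename]
    basenamemap[basename] = ident + 1
    return ident

plain, dd = {}, defaultdict(int)
for name in ['a', 'a', 'b']:
    assert next_ident_plain(plain, name) == next_ident_default(dd, name)
```

Whether one is measurably faster than the other depends on the hit/miss ratio, which is presumably why reverting this file showed no difference in test_debug_info_caching.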
```diff
             types.MetaDataType(),
             name=name)
         self.operands = tuple(values)
+        self.hash_cache = None
```
It feels like hash_cache is an implementation detail rather than an attribute we'd want to expose - should this be _hash_cache instead?
```python
        # Ensure that the hash is not cached between Python invocations
        # due to pickling or other serialization. The hash seed changes
        # which will cause these not to match.
        self.hash_cache = None
```
It seems a bit odd to mutate state inside __getstate__ - do you see any downside to removing the hash cache from a copy of the dict?
```python
        self.hash_cache = None
        return self.__dict__
```
e.g.

```diff
-        self.hash_cache = None
-        return self.__dict__
+        state = self.__dict__.copy()
+        state['hash_cache'] = None
+        return state
```
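A hypothetical sketch combining the two review suggestions - a leading underscore on the attribute name, and clearing the cache on a copy of `__dict__` so `__getstate__` never mutates the live object. `MDNode` here is an invented stand-in, not llvmlite's class:

```python
class MDNode:
    def __init__(self, operands):
        self.operands = tuple(operands)
        self._hash_cache = None  # computed lazily by __hash__

    def __hash__(self):
        if self._hash_cache is None:
            self._hash_cache = hash(self.operands)
        return self._hash_cache

    def __getstate__(self):
        # The hash seed changes between interpreter runs, so a serialized
        # cached hash would be stale; drop it from the copied state only,
        # leaving the in-memory object's cache intact.
        state = self.__dict__.copy()
        state['_hash_cache'] = None
        return state

node = MDNode(('a', 'b'))
hash(node)                            # populates the cache
state = node.__getstate__()
assert state['_hash_cache'] is None   # serialized state has no cache
assert node._hash_cache is not None   # live object is untouched
```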
```diff
         self.is_distinct = is_distinct
         self.kind = kind
         self.operands = tuple(operands)
+        self.hash_cache = None
```
Similar comment on the name to above, i.e. should it be _hash_cache?
```python
        # Ensure that the hash is not cached between Python invocations
        # due to pickling or other serialization. The hash seed changes
        # which will cause these not to match.
        self.hash_cache = None
```
Also should this invalidate the hash cache on a copy?
```python
        print('test_debug_info_performance took', total_time, 'to finish')
```
```python
        self.assertEqual(100004, len(mod._metadatacache))
```
I think the expected result could do with a comment explaining its derivation - it took me a while to work out that it's 100000 locations, plus one file, one compile unit, one subprogram, and [di_subprogram, di_subprogram].
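The derivation worked out in the comment above can be written as a quick sum (the variable names are just labels for the components the comment identifies):

```python
locations = 100_000   # one DILocation per generated source location
file = 1              # the DIFile node
compile_unit = 1      # the DICompileUnit node
subprogram = 1        # the DISubprogram node
retained = 1          # the [di_subprogram, di_subprogram] node
assert locations + file + compile_unit + subprogram + retained == 100004
```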
```python
        md = self._metadatacache[key]
        ...
        fixed_operands = self._fix_metadata_operands(operands)
        key = hash(tuple(fixed_operands))
```
In the PR description, I see:

> Only generates the key for module._metadatacache a single time instead of twice on cache miss
I'm not sure I can understand where it was generated twice before - am I missing that somewhere?
In any case, I do think the changes here are an improvement - reverting the specific changes to this file does slow down the performance test in test_debug_info_caching below.
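For illustration, here is a sketch of the caching pattern under discussion: build the (potentially expensive) cache key once and reuse it for both the lookup and the insert on a miss. The names mirror the diff, but this is an invented stand-in, not llvmlite's actual implementation:

```python
_metadatacache = {}

def _fix_metadata_operands(operands):
    # Hypothetical stand-in for the real normalization of operands.
    return [str(op) for op in operands]

def add_metadata(operands):
    fixed_operands = _fix_metadata_operands(operands)
    key = hash(tuple(fixed_operands))   # computed a single time
    try:
        md = _metadatacache[key]        # hit: nothing recomputed
    except KeyError:
        md = object()                   # stand-in for building an MDValue
        _metadatacache[key] = md
    return md

first = add_metadata([1, 2])
assert add_metadata([1, 2]) is first    # second call is a cache hit
```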
Hi @gmarkall! My apologies for the delay. This is great feedback. I'll start working through it and update the PR once I've addressed everything. Thank you
@bslatkin Hi! and thank you for your initial work on this. I am the release manager for 0.41.x and am checking on the status of this PR. I see that it has been in progress for quite some time. 0.41.x is now scheduled to be tagged during the week of 7th August 2023, which is in 4 weeks' time. What do you think, will this PR make it by then?
Thank you for the encouragement! I'm behind where I wanted to be on this, obviously :) I'll try to turn it around before then. But I'm not sure how much additional review it might take, especially during the summer. So perhaps I should aim for the release after that?

It's up to you, just let me know what you prefer. Reviewers are the most scarce resource in the Numba ecosystem, so I share your gut feeling that perhaps the 0.42 release is a more realistic goal. Here is the Gantt chart for the next few months, which shows the approximate tag dates for Numba 0.59 and 0.60 -- these will coincide with llvmlite releases, as the two libraries are released in lockstep: Effectively, the next releases will be in the Nov 2023 (0.42.x) and Feb 2024 (0.43.x) time-frame.
@bslatkin Hope you are well - it's been a while! I'm having a bit of a tidy-up, and checking in - do you see a future for this PR?
In my own compiler's codebase, this change reduces the time it takes for my debug info regression test to complete from 180 seconds down to 20 seconds. The more nesting of metadata there is, the larger the effect.