Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for cached_properties to slotted attrs classes. #1200

Conversation

diabolo-dan
Copy link
Contributor

@diabolo-dan diabolo-dan commented Nov 7, 2023

Summary

Proof of concept for detecting and adapting cached_property decorated functions for slotted classes. This addresses:

Caching dynamic properties when using slots=True. #164

Pull Request Check List

  • Do not open pull requests from your main branch – use a separate branch!
    • There's a ton of footguns waiting if you don't heed this warning. You can still go back to your project, create a branch from your main branch, push it, and open the pull request from the new branch.
    • This is not a pre-requisite for your your pull request to be accepted, but you have been warned.
  • Added tests for changed code.
    Our CI fails if coverage is not 100%.
  • New features have been added to our Hypothesis testing strategy.
  • Changes or additions to public APIs are reflected in our type stubs (files ending in .pyi).
    • ...and used in the stub test file tests/typing_example.py.
    • If they've been added to attr/__init__.pyi, they've also been re-imported in attrs/__init__.pyi.
  • Updated documentation for changed code.
    • New functions/classes have to be added to docs/api.rst by hand.
    • Changes to the signature of @attr.s() have to be added by hand too.
    • Changed/added classes/methods/functions have appropriate versionadded, versionchanged, or deprecated directives.
      The next version is the second number in the current release + 1.
      The first number represents the current year.
      So if the current version on PyPI is 22.2.0, the next version is gonna be 22.3.0.
      If the next version is the first in the new year, it'll be 23.1.0.
  • Documentation in .rst and .md files is written using semantic newlines.
  • Changes (and possible deprecations) have news fragments in changelog.d.
  • Consider granting push permissions to the PR branch, so maintainers can fix minor issues themselves without pestering you.

@diabolo-dan diabolo-dan force-pushed the support_for_cached_properties_in_slotted_classes branch from b1ea100 to 80be0a7 Compare November 7, 2023 17:33
@diabolo-dan diabolo-dan changed the title Add support for cached_properties to slotted attrs classes. DRAFT: Add support for cached_properties to slotted attrs classes. Nov 7, 2023
@diabolo-dan diabolo-dan changed the title DRAFT: Add support for cached_properties to slotted attrs classes. Add support for cached_properties to slotted attrs classes. Nov 8, 2023
@diabolo-dan diabolo-dan force-pushed the support_for_cached_properties_in_slotted_classes branch 2 times, most recently from 29e20e4 to aca9728 Compare November 9, 2023 09:25
@hynek
Copy link
Member

hynek commented Nov 11, 2023

Since cached_property is 3.8+, 3.7 is failing.

In my other projects I've dropped 3.7 long ago, but attrs has still the most downloads on 3.7 so I would like to let it live on a little longer. Would it be a lot of effort to make your gate this feature to 3.8?

Copy link
Member

@Tinche Tinche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM but some questions.

Do cached properties make sense on frozen classes? Philosophically and implementation-wise?

Second, since __getattr__ doesn't chain, will this scenario work:

@define
class A:
    @cached_property
    def a(self) -> int:
        return 1

@define
class B(A):
    @cached_property
    def b(self) -> int:
        return 2

B().a

We might need to crawl the MRO for the final __getattr__?

src/attr/_make.py Show resolved Hide resolved
@diabolo-dan
Copy link
Contributor Author

Since cached_property is 3.8+, 3.7 is failing.

In my other projects I've dropped 3.7 long ago, but attrs has still the most downloads on 3.7 so I would like to let it live on a little longer. Would it be a lot of effort to make your gate this feature to 3.8?

Ah, I'd forgotten that cached_proeperty was a "new" addition

I've added a commit that restricts the checks to 3.8+, and checked it locally, so it should be working now.

@diabolo-dan
Copy link
Contributor Author

Generally LGTM but some questions.

Do cached properties make sense on frozen classes? Philosophically and implementation-wise?

IMHO, they work well together philosophically. A cached property is persistent for the duration of an instance, so it's already "frozen", it's just a question of delaying the computation (potentially indefinitely, if it's never needed). That makes it roughly comparable to a lazy val in scala, which is compatible with immutability from an interface perspective.

And, I don't see any issues implementation wise -- my implementation already works with frozen classes.

Second, since __getattr__ doesn't chain, will this scenario work:
[...]
We might need to crawl the MRO for the final __getattr__?

Yeah, good point. I've played around a bit, and thought I had a working solution, but I think it breaks down on multiple inheritance. I'll try to work something out.

@diabolo-dan
Copy link
Contributor Author

I've pushed a fix to crawl the mro for __getattr__ (via super). I've also switched to using _make_method as that seems to be the more consistent/preferred approach, though I needed a wrapper to get the closure working properly.

To get the original_getattr closure picking up the new class, I put it on the class dictionary. I suppose if we want to avoid this, we could instead add an explicit update for the closure cells for that function.

@hynek
Copy link
Member

hynek commented Nov 16, 2023

Thanks for working on this – I'm really excited for this to land! I'll leave the actual reviewing to Tin, but 2 quick notes/questions:

  1. You're missing (branch) coverage which made the CI red:
Name                Stmts   Miss Branch BrPart  Cover   Missing
---------------------------------------------------------------
src/attr/_make.py     954      0    455      1    99%   923->929
---------------------------------------------------------------
TOTAL                1638      0    718      1    99%
  1. I think I know the answer, but just to double-check: this has no performance impact if I don't use cached_propertys, correct?

@diabolo-dan diabolo-dan force-pushed the support_for_cached_properties_in_slotted_classes branch from d447b1c to c1b3a9e Compare November 17, 2023 15:36
@diabolo-dan
Copy link
Contributor Author

diabolo-dan commented Nov 17, 2023

You're missing (branch) coverage which made the CI red:

Hmmm, It looks like coverage data is only being uploaded for 3.7,3.10, and 3.11.

The branch that coverage is complaining about only gets hit in 3.8/3.9 because __annotation__ are ~always set in 3.10+, and the code is flagged behind 3.8

I've pushed a commit to maintain comparable behaviour which I believe should resolve the coverage problem in any case.

I think I know the answer, but just to double-check: this has no performance impact if I don't use cached_propertys, correct?

There is some extra code at class creation (for slotted classes) to check for cached_properties:

if PY_3_8_PLUS:
    cached_properties = {
        name: cached_property.func
        for name, cached_property in cd.items()
        if isinstance(cached_property, functools.cached_property)
    }
[...]
if cached_properties:

Some basic numbers for reference (timing for 1000 attrs.frozen applications):

upstream/main:

With 0 methods, took [0.534622973005753, 0.5135590879945084, 0.5537665199954063, 0.543172550998861, 0.5421680030121934]s
With 1 methods, took [0.5519589369941968, 0.552411498996662, 0.5460843400214799, 0.5372608639881946, 0.5253033999761101]s
With 10 methods, took [0.5476001699862536, 0.5778162179922219, 0.5552783849998377, 0.5771896800142713, 0.5804753249976784]s
With 100 methods, took [0.6648160929908045, 0.6790099880017806, 0.655211570992833, 1.1942919419961981, 0.843360381986713]s
With 1000 methods, took [1.327303853991907, 1.2969695710053202, 1.3404755429946817, 1.2886629370041192, 1.3148314789868891]s
With 10000 methods, took [8.27522504600347, 9.543007285014028, 9.15497858900926, 8.78463426901726, 9.318197846994735]s

vs

New changes

With 0 methods, took [0.5188219360134099, 0.5184214509790763, 0.5280631629866548, 0.5325713030179031, 0.5457004369818605]s
With 1 methods, took [0.5420163899834733, 0.545073927991325, 0.5370112269883975, 0.5418282540049404, 0.5468027850147337]s
With 10 methods, took [0.5930820080102421, 0.5426902859762777, 0.552412015007576, 0.5950264070124831, 0.6470643689972349]s
With 100 methods, took [0.9766806430125143, 0.6893237390031572, 0.6373719449911732, 0.6113516429904848, 0.6429122459958307]s
With 1000 methods, took [1.4384041439916473, 1.735367571003735, 1.6085309459886048, 1.7646593420067802, 1.3692002349998802]s
With 10000 methods, took [9.481361449987162, 9.846626015001675, 9.705778863019077, 10.955383850989165, 11.169956739002373]s

So it does start to become noticeable at high method counts (or presumably other non-attribute fields), but for that to be meaningful it means dynamically creating many classes with many methods, which doesn't seem like a typical use case.

As far as object creation or usage, changes to the class only occur when there are cached_propertys, so it should be completely unimpacted.

@diabolo-dan diabolo-dan force-pushed the support_for_cached_properties_in_slotted_classes branch from c1b3a9e to b0d87e3 Compare November 17, 2023 15:47
@hynek
Copy link
Member

hynek commented Nov 18, 2023

You're missing (branch) coverage which made the CI red:

Hmmm, It looks like coverage data is only being uploaded for 3.7,3.10, and 3.11.

The branch that coverage is complaining about only gets hit in 3.8/3.9 because __annotation__ are ~always set in 3.10+, and the code is flagged behind 3.8

I've pushed a commit to maintain comparable behaviour which I believe should resolve the coverage problem in any case.

You can absolutely change coverage to be measured on 3.9 too btw – no need to do any crazy hacks around that (disclaimer: I haven't looked at what you've done – just stating).

So it does start to become noticeable at high method counts (or presumably other non-attribute fields), but for that to be meaningful it means dynamically creating many classes with many methods, which doesn't seem like a typical use case.

As far as object creation or usage, changes to the class only occur when there are cached_propertys, so it should be completely unimpacted.

Yeah this is absolutely OK, thank you for humoring me.

@Tinche Tinche self-requested a review November 25, 2023 18:50
" _setter(item, result)",
" return result",
" if '__attrs_original_getattr__' in vars(__class__):",
" return __class__.__attrs_original_getattr__(self, item)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have to do this check here? Can this be checked at class construction time and passed as a parameter to _make_cached_property_getattr? Might make the __class__ global unnecessary too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the check out to construction, and passed in the original_getattr to avoid the class lookup.

__class__ is still needed in the closure for the super() call to work.

I've removed __attrs_original_getattr__ on the class, as a separate MR, because it needed some other changes for it goes through the __class__ closure reference updating, which has it's own trade-off.

" return result",
" if '__attrs_original_getattr__' in vars(__class__):",
" return __class__.__attrs_original_getattr__(self, item)",
" if hasattr(super(), '__getattr__'):",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do this at construction, because a __getattr__ might exist only on a different super-class of a sub-class. We could optimise for the case that it exist with a try/except on the attribute access instead of the if, since I guess the AttributeError caused by a non-existant __getattr__ isn't the most critical thing to be fast.

For reference in case it influences the decision:

echo 'with f, hasattr'
python -m timeit --setup='\
class A():
    def f(self): ...
class B(A): ...
b = B()' '\
if hasattr(super(B, b), "f"):
    f = super(B, b).f
'


echo 'with f, try/except'
python -m timeit --setup='\
class A():
    def f(self): ...
class B(A): ...
b = B()' '\
try:
    f = super(B, b).f
except AttributeError:
    f = None'

echo 'without f, hasattr'
python -m timeit --setup='\
class A(): ...
class B(A): ...
b = B()' '\
if hasattr(super(B, b), "f"):
    f = super(B, b).f
f = None'


echo 'without f, try/except'
python -m timeit --setup='\
class A(): ...
class B(A): ...
b = B()' '\
try:
    f = super(B, b).f
except AttributeError:
    f = None'
with f, hasattr
1000000 loops, best of 5: 258 nsec per loop
with f, try/except
2000000 loops, best of 5: 122 nsec per loop
without f, hasattr
500000 loops, best of 5: 592 nsec per loop
without f, try/except
500000 loops, best of 5: 914 nsec per loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand the case we're guarding against here.

I don't think we can do this at construction, because a __getattr__ might exist only on a different super-class of a sub-class.

The situation you're describing, as I understand it, is this class sandwich:

@define
class A:
    @cached_property
    def a(self) -> int:
        return 1


class B(A):
    def __getattr__(self, item: str):
        if item == "b":
            return 1
        return super().__getattr__(item)


@define
class C(B):
    @cached_property
    def b(self) -> int:
        return 2

Let me know if any of my reasoning is faulty.

The __getattr__ on C needs to call super(), but we can tell at class construction time.

The __getattr__ on A doesn't need to call super(), and super() wouldn't go to B anyway, since B isn't a superclass of A.

So we should be able to do this check at class construction time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, should have been clearer, I was thinking of something along the lines of:

@attrs.frozen
class A:
    a: int
    @cached_property
    def b(self) -> int:
        return self.a + 1

class Mixin:
    __slots__ = ()
    def __getattr__(self, item: str) -> int:
        if item ==  'c':
            return 1
        return super().__getattr__(item)

@attrs.frozen
class C(A, Mixin):
    ...

C(1).c

the __getattr__ on A needs to call super so that it passes to the __getattr__ on Mixin, but can't know about it's existence at construction time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I suspected multiple inheritance might be the thing complicating our life here ;)

src/attr/_make.py Outdated Show resolved Hide resolved
@diabolo-dan
Copy link
Contributor Author

You can absolutely change coverage to be measured on 3.9 too btw – no need to do any crazy hacks around that (disclaimer: I haven't looked at what you've done – just stating).

Fair enough. I don't think it's a hack, just a slightly different approach, but the code change is in it's own commit, so if the prior option is preferred, it should be easy to revert (in which case I'll look into the github ci settings).

@diabolo-dan diabolo-dan force-pushed the support_for_cached_properties_in_slotted_classes branch from 9f6fd85 to bac209e Compare December 7, 2023 13:04
@diabolo-dan diabolo-dan requested a review from Tinche December 7, 2023 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants