Add cache_method decorator #895

Closed

Conversation

@chriseclectic (Collaborator) commented Aug 19, 2022

Summary

Adds a cache_method decorator that generalizes lru_cache for caching methods of class instances (lru_cache can reportedly leak memory when applied directly to methods, since its function-level cache holds a strong reference to self).

This is based on the solution suggested for caching experiment methods in this comment.

Details and comments

By default this decorator requires all method arg and kwarg values to be hashable, and they are included in the cache key for matching. However, setting cache_args=False on the decorator will ignore args and kwargs and match only on the method name. Alternatively, the decorator can be called with require_hashable=False, which allows non-hashable args while matching on all the hashable args and kwargs.
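
As a rough illustration of the intended behavior, a minimal sketch of such a decorator might look like the following (the key format and cache layout here are assumptions for illustration, not necessarily this PR's implementation; require_hashable is omitted for brevity):

import functools
from typing import Dict, Union

def cache_method(cache: Union[Dict, str] = "_cache", cache_args: bool = True):
    """Cache a bound method's return values without hashing ``self``."""

    def decorator(method):
        name = method.__name__

        @functools.wraps(method)
        def wrapped(self, *args, **kwargs):
            # Resolve the cache dict: either an attribute stored on the
            # instance (so it is freed together with the instance) or an
            # externally supplied dict.
            if isinstance(cache, str):
                if not hasattr(self, cache):
                    setattr(self, cache, {})
                method_cache = getattr(self, cache)
            else:
                method_cache = cache
            # Match on the method name alone, or also on the (hashable)
            # call arguments, depending on ``cache_args``.
            if cache_args:
                key = (name, args, tuple(sorted(kwargs.items())))
            else:
                key = name
            if key not in method_cache:
                method_cache[key] = method(self, *args, **kwargs)
            return method_cache[key]

        return wrapped

    return decorator

Applied as @cache_method() to a regular method, repeated calls with the same arguments return the stored value without re-running the body; because the cache dict lives on the instance rather than on the function object, it is garbage collected with the instance.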

@nkanazawa1989 (Collaborator) left a comment

Thanks Chris. This suggestion seems like a good direction, but we still need to allow for some flexibility. For example, the current framework lacks the capability to check experiment options (self.experiment_options) and to cache outcomes across instances. This goes beyond the requirements of the tomography fitter, but we should provide a flexible API so that #878 can update the mechanism based on its needs.



def cache_method(
    cache: Union[Dict, str] = "_cache", cache_args: bool = True, require_hashable: bool = True
):

Collaborator

The combination of two booleans is a bit hard to understand. Perhaps string-based modes would make the interface more intuitive, such as first_time, hash_all, only_hashable. This would also allow more flexibility in the hashing mechanism; for example, another option we may want is applying repr(arg) to make everything hashable, e.g. try_repr.

Collaborator Author

Personally I find the booleans easier to understand than string values. I don't think we should add more flexibility to the hashing mechanism. If anything, maybe require_hashable should be removed, so the only choice is between using all args (which must then be hashable) or none.

Collaborator

If you remove that option, perhaps this is only applicable to static methods? The self of an experiment instance is not hashable.

Collaborator Author

self not being hashed is by design, and is the main reason you need this decorator instead of lru_cache. So it should only be used with regular methods, not static methods. For a static method you should be able to use a regular lru_cache without issues, since a static method behaves like a plain function.
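
The pitfall can be demonstrated directly with the standard library (Leaky is a throwaway class, purely for illustration): lru_cache on a regular method keys the shared, function-level cache on self, so the cache keeps every instance alive:

import functools
import gc
import weakref

class Leaky:
    @functools.lru_cache(maxsize=None)
    def compute(self, x):
        return 2 * x

obj = Leaky()
obj.compute(1)            # the function-level cache now references ``obj``
ref = weakref.ref(obj)
del obj
gc.collect()
print(ref() is not None)  # True: the cache entry keeps the instance alive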

Collaborator Author

PS @nkanazawa1989 I will make a commit removing the require_hashable kwarg so as not to overcomplicate this, as you suggest.

@chriseclectic (Collaborator Author)

@nkanazawa1989 I think there is a bit of a misunderstanding: this decorator is intended as a replacement for lru_cache that works properly with methods, and it adds some flexibility over whether arg values are included when matching cached values. It has nothing directly to do with caching in BaseExperiment (though caching of transpiled circuits and the like in BaseExperiment could be implemented using this functionality, e.g. as I wrote in the other PR comment).

@nkanazawa1989 (Collaborator) commented Aug 22, 2022

If this is just a bug fix for Python's lru_cache, then I think this code should go to that repository. Since the arguments of your cache function differ from those of functools.lru_cache, I assumed you wanted some customization for our experiments. In that case nothing stops us from adding some flexibility for our experiments, as long as it doesn't hurt performance (e.g. expensive formatting and inspection must be avoided). I think the flexibility to check experiment options from the experiment instance (self) is a reasonable thing to support in our experiment module. Perhaps a boolean still works for this, e.g. check_experiment_options: bool = False.

@yaelbh (Collaborator) commented Aug 22, 2022

I'm not familiar with the tomography use case.

I'm aware of several use cases where we want to refrain from retranspiling. To this end, what we need is not a caching mechanism. Instead, it is sufficient to do the following (a sketch follows the list):

  1. Expose _transpile_circuits (i.e., rename it to transpile_circuits).
  2. Let run accept a transpiled_circuits input parameter.
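
A rough sketch of that interface, assuming illustrative method bodies and signatures rather than the existing BaseExperiment API:

class BaseExperiment:
    def transpile_circuits(self):
        """Transpile the experiment circuits (the renamed _transpile_circuits)."""
        ...

    def run(self, backend=None, transpiled_circuits=None, **run_options):
        # Callers that already hold transpiled circuits pass them in and
        # skip retranspilation entirely.
        if transpiled_circuits is None:
            transpiled_circuits = self.transpile_circuits()
        ...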

@nkanazawa1989 (Collaborator) commented Aug 22, 2022

Sounds like #878 can simply be closed in that case. I don't have any method in BaseExperiment or its subclasses that can be drastically improved with a cache. Exposing transpiled circuits is not in the scope of #878.

(edit)
This makes me think the PR is good to go as it is.

This supports caching regular methods of class instances, with optional support for including hashable arg values in the cache key.
Define the function that returns the method cache dict outside of the wrapped method so it doesn't need to be re-checked on every method call.

def _cache_fn(instance, method):
    # pylint: disable = unused-argument
    name = method.__name__

Collaborator Author

@nkanazawa1989 I wonder if this should be method.__qualname__ instead? __qualname__ includes the class name in the string, like <cls_name>.<method_name>, rather than just <method_name>.
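
For reference, the two attributes differ like this (Example is a throwaway class):

class Example:
    def fit(self):
        pass

print(Example.fit.__name__)      # fit
print(Example.fit.__qualname__)  # Example.fit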

Collaborator

I think qualname would be useful if we want to support a class-level cache in the future. It would also let you validate that the method is not a plain function.

@yaelbh (Collaborator) commented Aug 23, 2022

The proposal of #895 (comment) may require more thinking, because run does things before the transpilation that may be relevant to the transpilation.

Still, I'd like to revisit the caching mechanism that you're building here. You're putting effort into it, and we'll have to maintain it. What's the motivation? Is it justified? It seems to be beyond the scope of qiskit-experiments.

The reuse of transpiled circuits is a modest goal that can be met with a small solution, probably along the lines of #895 (comment), or something else of the same magnitude.

@nkanazawa1989 (Collaborator)

This is a cache mechanism to store some internal state. In QE, some helper functions (methods) may be called multiple times, so it is sometimes better to cache them for performance. For example, dedicated logic that caches only the transpiled circuits doesn't speed up the analysis classes, and we may need a huge amount of memory if a user generates circuits with multiple settings (if we cache in memory).

As Chris wrote in the PR comment, this is mainly to overcome a memory leak issue in the current Python LRU cache (I don't know the details). I think Chris is currently looking into a different approach.

@chriseclectic (Collaborator Author)

Closing this in favor of #997.
