Add a new marker to check for memory leaks #52
Conversation
We should document this in the readme somehow too.
I have added docs.
Just tried 466743b and I'm getting the warning:

```
$ pip freeze
....
pytest-memray @ git+https://github.com/pablogsal/pytest-memray@466743b4853261a35c3ff84d4d4654d6586918a1
...
```

Update: `$ pytest --markers` doesn't show the markers, but `$ pytest --memray --markers` does. Compared with hypothesis or other extensions, which show their markers regardless. This is separate from (or by design in) this PR.
@tonybaloney Yep, and we have #49 for that already.
Other than needing to provide
My tests fail because memray writes to stderr on macOS with a warning (see bloomberg/memray#254), but other than that, it seems to work when I tried it. I'm going to purposefully write a leaky test and verify that it catches it next.
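(A minimal pure-Python version of such a deliberately leaky test might look like the sketch below; the `check_leaks` marker name and the `"0B"` threshold are borrowed from how the marker is used later in this thread and may not match the final API.)

```python
import pytest

_survivors = []  # module-level list keeps the allocations alive past the test


@pytest.mark.check_leaks("0B")
def test_deliberate_leak():
    # Each appended bytearray outlives the test, so the marker should
    # flag this test as leaking roughly 1 MiB per run.
    _survivors.append(bytearray(1024 * 1024))
```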
Apart from that, does it work for your use case? Can you simulate a leak in your extension and try the marker?
I tried adding a leaking String object to the `Logger_repr` function of my extension type:

```cpp
PyObject* Logger_repr(Logger *self) {
    std::string level = _getLevelName(self->effective_level);
    PyObject* leaky_object = PyUnicode_FromString("Hello"); /* TEST: never decref'd, so it leaks */
    return PyUnicode_FromFormat("<Logger '%U' (%s)>", self->name, level.c_str());
}
```

Running memray by hand using a script that instantiates an instance of the extension type (100,000 times), I can see the leak in the graph.

This is what I'm currently doing. But using this marker doesn't seem to catch the same thing?

```python
@pytest.mark.parametrize("level", levels)
@pytest.mark.check_leaks("0B", warmups=3)
def test_logger_repr(level):
    logger = picologging.Logger("test", level)
    assert repr(logger) == f"<Logger 'test' ({level_names[levels.index(level)]})>"
```

This doesn't give me any information about leaks, and every test seems to leak at least something, even ones not testing my extension module.

I guess I could run each test and work out how much memory it should consume and then set the limit manually?
Well, that's because something is indeed being leaked, but it is not what you are hunting for. The marker will complain about anything created in the test that survives the test. In your flamegraph for leaks you can see many more objects, so it would not be surprising that a lot more stuff is being kept alive other than the string. That's precisely why we take the minimum memory allowed to leak, so you can filter out that stuff.
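(A sketch of what this means in practice, reusing the `check_leaks` spelling from the test above; the threshold value is illustrative, not a recommendation. Raising the per-location limit above `"0B"` lets small interpreter-internal allocations survive without failing the test, while a genuine leak from the extension would still trip it.)

```python
import pytest
import picologging  # the C extension under test in this thread


# Per-location leaks smaller than 2KB are ignored, filtering out
# interpreter caches and other incidental survivors.
@pytest.mark.check_leaks("2KB", warmups=3)
def test_logger_repr_with_threshold():
    logger = picologging.Logger("test", 10)  # 10 == DEBUG
    assert repr(logger).startswith("<Logger")
```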
What information are you expecting?
I was expecting it to report that the
Hmm, to me this looks right. It is telling you that your C extension type is leaking memory, so I don't see why you say that it is not giving you any information about the leak. By default the marker doesn't run in native mode, so that's why it doesn't show anything below that. It tells you the lowest Python frame that called something that leaked. As it cannot see below that, it cannot split it into the different calls that you see in the flamegraph, but it is telling you that it is leaking, which is the main purpose of the marker. To debug the leak you can use the profiler directly, which allows you to use more powerful visualisations.
The marker doesn't run in native mode (for performance reasons, as that's quite heavy for a test suite), so it won't show you any C code. Does that explain what's going on?
Ok, that makes sense then. Is there a way to run it in native mode? The settings I use on the CLI are:

```shell
$ PYTHONMALLOC=malloc memray run --trace-python-allocators -o .profiles/memray_logger.py.bin -f --native memray_logger.py
$ memray flamegraph --leaks -f .profiles/memray_logger.py.bin
```
The idea of the marker is that it will complain that there are leaks and then will show you the Python call that leaked. If you need to investigate with native code you can do it later with the profiler. Or at least that's the idea: the marker is not a substitute for the profiler, it is a way to ensure that you are not leaking. It will tell you something about the Python call that leaked, but then you should investigate with better reporters.
Ok, then yes, this will serve the requested use case in #45. I'll test it with more scenarios.
We could add a way to run in native mode, but that would make the happy path (the test passing) extra slow for no reason. As failures are the rare case, we are prioritising detection over exhaustive reports and leaving the manual execution of the profiler for later.
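(For illustration only, an opt-in flag like the one discussed could be exposed through standard pytest hooks; the `--memray-native` option name and the wiring below are hypothetical and not part of pytest-memray.)

```python
# conftest.py — hypothetical sketch; this flag is not part of the plugin
def pytest_addoption(parser):
    parser.addoption(
        "--memray-native",
        action="store_true",
        default=False,
        help="opt in to native (C-level) stack tracking when checking for leaks",
    )
```

The plugin would then read the flag with `config.getoption("--memray-native")` and only enable memray's expensive native trace collection when it is set, keeping the common, passing path fast.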
@pablogsal and I were going back and forth about this on Slack for quite a while today, so I'm feeling a bit vindicated 😛
I understand it would be really slow, but it would be handy as an optional flag.
@godlygeek This is ready for another round.
I found a few typos as I was reading this
Users have indicated that it will be very useful if the plugin exposes a way to detect memory leaks in tests. This is possible, but is a bit tricky as the interpreter can allocate memory for internal caches, as well as user functions. To make this more reliable, the new marker will take two parameters:

* The limit of memory per location to consider an allocation a leak. If the memory leaked by any allocation location in the test is higher than this value, the test will fail.
* An optional callable function that can be used to filter out locations. This will allow users to remove false positives.

Signed-off-by: Pablo Galindo <[email protected]>
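(As a hedged sketch of the second parameter, this is roughly how an allocation-location filter could be used to silence false positives; the `filter` keyword and the `filename` attribute on `location` are invented here, since the commit message does not spell out the exact signature.)

```python
import pytest


def ignore_import_machinery(location) -> bool:
    # Hypothetical filter: keep a location only if it does not come from
    # the import system, whose allocations routinely survive a test
    # without being real leaks. The "filename" attribute is an assumption.
    return "importlib" not in location.filename


# "filter" is an invented keyword name for the optional callable the
# commit message describes.
@pytest.mark.check_leaks("0B", filter=ignore_import_machinery)
def test_extension_does_not_leak():
    ...
```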
This looks a lot cleaner. I've taken a first pass at reviewing this and I still see some aspects of the API that I think we need to polish. I've focused only on how we describe failures, how we describe the functionality, and how users hook their own filtering into it so far.
I've pushed another fixup. It adds another Honestly, I think we should probably change that to
Signed-off-by: Matt Wozniski <[email protected]>
OK. I've added another fixup commit getting this into a state where I'm happy to land it. I changed a fair amount of stuff, but most of it is pretty minor. The biggest changes were to the documentation.
Sorry that this turned out to be a lot of changes, but I hope most of them will be pretty uncontroversial.
https://godlygeek.github.io/pytest-memray/usage.html and https://godlygeek.github.io/pytest-memray/api.html show the rendered documentation after my changes.
After my (extensive 😓) changes, this LGTM. If you're happy with it, feel free to squash and land. If not, let me know what I screwed up and I'll be happy to fix it.
Users have indicated that it will be very useful if the plugin exposes a
way to detect memory leaks in tests. This is possible, but is a bit
tricky as the interpreter can allocate memory for internal caches, as
well as user functions.

To make this more reliable, the new marker will take two parameters:

* The watermark of memory to ignore. If the memory leaked by the test is
  higher than this value, the test will fail; otherwise it will pass.
* The number of warmup runs. This allows running the test multiple times
  (assuming it passes) before actually checking for leaks, which warms up
  user and interpreter caches.

Closes: #45
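(To make the warmup parameter concrete, here is a small sketch, again using the `check_leaks` spelling from earlier in the thread: the first run of the test populates an `lru_cache`, which would otherwise be reported as leaked memory.)

```python
import functools

import pytest


@functools.lru_cache(maxsize=None)
def parse_level(name: str) -> int:
    # The cached entry survives the test on its first call.
    return {"DEBUG": 10, "INFO": 20}[name]


# One warmup run fills the cache before leak checking starts, so only
# genuinely new surviving allocations are reported.
@pytest.mark.check_leaks("0B", warmups=1)
def test_parse_level_does_not_leak():
    assert parse_level("DEBUG") == 10
```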