Run cudf-polars conda unit tests with more than 1 process #19980
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
/ok to test 59d144d
As mentioned in #19895 (comment), we'll want to update `cudf/python/cudf_polars/tests/test_config.py` (line 104 at be76be8) somewhere near the end of that test. Otherwise, there will be a memory pool holding 50% of GPU memory sitting around doing nothing. That should perhaps be a fixture so that we know it runs, even if the test fails somewhere after creating that memory resource. There might still be some risk in two tests running that at the same time, so that should perhaps rerun on failures a few times. I can push those changes here if you'd like.
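For reference, a minimal sketch of the kind of cleanup fixture suggested above, assuming the test swaps the current RMM device resource; the fixture name is hypothetical and the rerun-on-failure handling is left out:

```python
import pytest
import rmm.mr


@pytest.fixture
def restore_device_resource():
    # Remember whichever memory resource is active before the test runs.
    previous = rmm.mr.get_current_device_resource()
    yield
    # Restore it even if the test fails after creating a large pool, so the
    # pool doesn't hold a big chunk of GPU memory for the rest of the session.
    rmm.mr.set_current_device_resource(previous)
```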
/ok to test 29b3b41
That one passed. @mroeschke do you remember how consistently you saw the test suite OOM previously? |
Each run was OOMing even with just the in-memory executor (before limiting the initial pool size to 1GB). I'm going to rerun these tests in an "ideal" setup (8 processes, no limiting of initial pool size) to see if it still OOMs.
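For context on the 1GB limit mentioned above, capping the initial pool size with RMM might look roughly like the sketch below; the exact way cudf-polars configures its memory resource isn't shown here, so treat this as an assumption:

```python
import rmm.mr

# Managed-memory base resource with a pool whose *initial* size is 1 GiB
# rather than a large fraction of device memory; the pool can still grow.
base = rmm.mr.ManagedMemoryResource()
pool = rmm.mr.PoolMemoryResource(base, initial_pool_size=2**30)
rmm.mr.set_current_device_resource(pool)
```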
/ok to test 4c51a08
import pytest

@pytest.fixture(autouse=True)
I put this fixture in the experimental directory since currently these are the only tests (with the CI script) that purposefully test with the distributed executor. Is that OK @TomAugspurger given your thoughts on reorganizing the cudf_polars test suite in the future?
Yep, that sounds good to me.
Just documenting some runtime observations for a follow-up running the cudf-polars wheel tests with multiple processes.
So in the follow-up, I'll probably use 8 processes for these unit tests with conda (and wheels), and maybe disable running the distributed variant with multiple processes.
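For illustration, invoking the suite with 8 processes via pytest-xdist could look like the sketch below; the test path is an assumption, and the actual CI scripts may pass different options:

```python
import pytest

# Equivalent to running `pytest -n 8 python/cudf_polars/tests` with
# pytest-xdist installed: split the tests across 8 worker processes.
exit_code = pytest.main(["-n", "8", "python/cudf_polars/tests"])
```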
/merge
Run cudf-polars conda unit tests with more than 1 process (#19980)

Now that cudf-polars uses managed memory by default, the prior comment here should no longer be applicable and we should be able to run these tests with more than 1 process for a hopeful improvement in runtime. Probably depends on rapidsai#20042 so each xdist process doesn't set the `initial_pool_size` of the memory resource to 80% of the available device memory.

Authors:
- Matthew Roeschke (https://github.com/mroeschke)
- Tom Augspurger (https://github.com/TomAugspurger)

Approvers:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
- Bradley Dice (https://github.com/bdice)
- Tom Augspurger (https://github.com/TomAugspurger)

URL: rapidsai#19980
Description
Now that cudf-polars uses managed memory by default, the prior comment here should no longer be applicable and we should be able to run these tests with more than 1 process for a hopeful improvement in runtime.
Probably depends on #20042 so each xdist process doesn't set the `initial_pool_size` of the memory resource to 80% of the available device memory.
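As a rough sketch of the sizing concern, each xdist worker could take an equal slice of a session-wide budget instead of every process claiming 80% of the device on its own; this is only illustrative arithmetic, not necessarily what #20042 implements, and the budget value is hypothetical:

```python
import os

# pytest-xdist exports PYTEST_XDIST_WORKER_COUNT inside worker processes;
# fall back to 1 when the suite runs without -n.
workers = int(os.environ.get("PYTEST_XDIST_WORKER_COUNT", "1"))

# Hypothetical session-wide budget (bytes). Each worker takes an equal share
# for its initial pool instead of each one grabbing 80% of device memory.
session_budget = 32 * 2**30
initial_pool_size = session_budget // workers
```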