Add permutation test #726
base: main
Conversation
…nt with both TTest and WilcoxonTest and add seed
Cool! Let me know when you want me or Gregor to have a look, please.
@Zethson @grst based on the tests I ran locally, this version should now pass. However, there seems to be a docs issue unrelated to my code changes (see other recent PRs), because a dependency cannot be installed. So from my side, you can go ahead and take a look already and let me know if anything needs to be adapted. Another idea I had: it would be interesting to be able to compare any values in …
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff            @@
##              main     #726       +/-   ##
===========================================
+ Coverage    56.10%   72.03%   +15.92%
===========================================
  Files           47       48        +1
  Lines         6149     5507      -642
===========================================
+ Hits          3450     3967      +517
+ Misses        2699     1540     -1159
```
Cool!
So name-wise, there's some overlap with https://pertpy.readthedocs.io/en/stable/usage/tools/pertpy.tools.DistanceTest.html#pertpy.tools.DistanceTest, which also labels itself as a permutation test, but in a different way. We also have to resolve this naming clash: the DistanceTest is currently under the title "distances and permutation test", which I consider an issue. We should only use this label once, or be more specific.
```
fit_kwargs: Additional kwargs passed to the test function.
test_kwargs: Additional kwargs passed to the test function.
"""
if len(fit_kwargs):
```
I don't get the point of this. Is it required? Can this be fixed upstream, i.e. in the interface, by making this optional?
This is inherited from the base class and is not a new introduction of this PR. MethodBase is also used for linear models, which require the fit_kwargs. I will change the docstring to indicate that these are not used for the simple tests.
If you merge main into this, the RTD job will work again.
@maltekuehl, could the parallelization you implemented be pushed up to the abstract base class? Then wilcoxon and ttest would also benefit from it. In that case, would it even be necessary to re-implement the interfaces specifically for the permutation test, or could you just use an extension class?
@grst good idea to push this upstream. The reason I had to recreate the …
…tation arguments, passing others through kwargs
@Zethson what would you suggest naming-wise? The docs mention a …
…turning statistic from tests and fix bug where the permutation_test was not applied
```
elif permutation_test is None and cls.__name__ == "PermutationTest":
    logger.warning("No permutation test specified. Using WilcoxonTest as default.")

comparison_indices = [_get_idx(column, group_to_compare) for group_to_compare in groups_to_compare]
```
If I get this right, this implements parallelism only at the level of comparisons. This means that if there's only one comparison, there would be no benefit from parallelization. I think it would be more beneficial to implement parallelism at the level of variables, i.e. in `_compare_single_group` at `for var in tqdm(self.adata.var_names):`.
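A minimal sketch of what per-variable parallelism could look like, assuming joblib as the backend (an assumption; pertpy may prefer a different mechanism, and `_test`, the array shapes, and all names here are illustrative stand-ins):

```python
import numpy as np
from joblib import Parallel, delayed
from scipy import stats


def _test(x0: np.ndarray, x1: np.ndarray) -> float:
    # placeholder per-variable test, standing in for e.g. a Wilcoxon rank-sum test
    return stats.mannwhitneyu(x0, x1).pvalue


rng = np.random.default_rng(0)
X0 = rng.normal(size=(30, 200))  # samples x variables, baseline group
X1 = rng.normal(size=(25, 200))  # samples x variables, comparison group

# one task per variable (column) instead of one task per comparison
pvalues = Parallel(n_jobs=-1)(
    delayed(_test)(X0[:, j], X1[:, j]) for j in range(X0.shape[1])
)
```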
The permutation_test also has a `vectorized` argument which allows passing entire matrices to the test. This likely also speeds up testing quite a bit when whole blocks of data are processed together. Maybe we even get implicit parallelism through the BLAS backend of numpy when doing so.
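For reference, a minimal sketch of the vectorized call with `scipy.stats.permutation_test` (available in SciPy ≥ 1.8; the statistic and array shapes are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=(30, 200))  # samples x variables
x1 = rng.normal(0.3, 1.0, size=(25, 200))


def mean_diff(a, b, axis=0):
    # with vectorized=True, the statistic must accept an `axis` argument
    return np.mean(a, axis=axis) - np.mean(b, axis=axis)


res = stats.permutation_test(
    (x0, x1),
    mean_diff,
    vectorized=True,
    n_resamples=1000,
    axis=0,
    random_state=0,
)
# res.pvalue and res.statistic are arrays with one entry per variable
```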
It probably requires a bit of testing to find out what's faster: per-variable parallelism, passing the entire array into `stats.permutation_test(..., vectorized=True)`, or splitting the data into chunks and then passing those to `permutation_test(..., vectorized=True)`.
```
 if paired:
-    return scipy.stats.wilcoxon(x0, x1, **kwargs).pvalue
+    return scipy.stats.wilcoxon(x0, x1, **kwargs).__getattribute__(return_attribute)
```
I think there could be value in returning multiple attributes, not just the p-value. In particular for the permutation test, but also for the t-test, it would be nice to have the t-statistic alongside the p-value. I therefore suggest changing the function signature of `_test` to return a dictionary or named tuple instead.
This can then be included in https://github.com/scverse/pertpy/pull/726/files#diff-5892917e4e62a1165dda9ac148f802a12e3a95735a367b5e1bf771cb228fcd0dR86 as

```python
res.append({"variable": var, "log_fc": np.log2(mean_x1) - np.log2(mean_x0), **test_result})
```

(where `test_result` is the dictionary returned by `_test`).
"The `test` argument cannot be `PermutationTest`. Use a base test like `WilcoxonTest` or `TTest`." | ||
) | ||
|
||
def call_test(data_baseline, data_comparison, axis: int | None = None, **kwargs): |
Do you really need another test here? Essentially, the statistic we want to test is the fold change.
The advantage of permutation tests is their generalizability with (almost) no assumptions, as long as the test statistic is related to the null hypothesis we want to test. I would thus view the ability to use any statistic, either those already implemented in pertpy or any callable that accepts two NDArrays and **kwargs, as core functionality. With the latest update, this PR supports any statistic; e.g., it would be trivial to use a comparison of means, of medians, or any other function that you can implement in < 5 lines of numpy code with the PermutationTest. I opted for the Wilcoxon statistic as a default because the rank sum is fairly general and it's something that's already implemented in pertpy. Of course, we could also add an explicit collection of other statistics, but it could never cover all use cases, and defining the statistic should be part of the thought process when a user uses the permutation test, so I'm not convinced of the value and necessity of covering this as part of the library itself.
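For illustration, such a custom statistic could be as small as the sketch below; the signature (two arrays, an `axis` argument, and **kwargs) mirrors the `call_test` interface quoted above, but the exact contract is an assumption:

```python
import numpy as np


def median_diff(x0: np.ndarray, x1: np.ndarray, axis: int = 0, **kwargs) -> np.ndarray:
    # difference of group medians as a permutation statistic
    return np.median(x1, axis=axis) - np.median(x0, axis=axis)
```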
@grst is this resolved?
We changed this to the t-statistic after an in-person discussion, as permuting the Wilcoxon statistic would just reproduce the Wilcoxon test. From what I understood from that discussion, the rationale for keeping this function was agreed to, but perhaps @grst can confirm again.
I still don't get why you would want to use a permutation test with a test statistic. If I'm interested in the difference in means, I would use the difference in means as the statistic.
In my understanding, the whole point of using the Wilcoxon or t-test is that one can compare against a theoretical distribution, avoiding permutations in the first place.
In any case, I would prefer passing a simple lambda over accepting another pertpy test class. This could be as simple as

```python
test_statistic: Callable[[np.ndarray, np.ndarray], float] = lambda x, y: np.log2(np.mean(x)) - np.log2(np.mean(y))
```

or, if you really want a t-statistic,

```python
test_statistic = lambda x, y: scipy.stats.ttest_ind(x, y).statistic
```
A few more minor requests, please.
```
fit_kwargs: Mapping = MappingProxyType({}),
test_kwargs: Mapping = MappingProxyType({}),
) -> DataFrame:
    """Perform a comparison between groups.

    Args:
```
Please remove all typing from the docstring here. We only type the function signature, which is enough for Sphinx.
```
n_permutations (int): Number of permutations to perform if a permutation test is used.
permutation_test_statistic (type[SimpleComparisonBase] | None): Test to use after permutation if a
    permutation test is used.
fit_kwargs (Mapping): Not used for simple tests.
```
Not the most informative docstring. Can we make that clearer, please?
```
This function relies on another test (e.g. WilcoxonTest) to generate a test statistic for each permutation.

.. code-block:: python
```
Pretty sure that we're not using code blocks but a different syntax for the examples. Could you please harmonize it here? I also think that the examples always come last in the docstring.
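If the project follows the doctest-style `Examples:` section common in scverse docstrings (an assumption; the exact call shown is illustrative), the harmonized docstring could look like:

```python
def compare_groups(adata, column, baseline, groups_to_compare):
    """Perform a comparison between groups.

    This function relies on another test (e.g. WilcoxonTest) to generate
    a test statistic for each permutation.

    Examples:
        >>> res = PermutationTest.compare_groups(
        ...     adata, column="condition", baseline="control", groups_to_compare=["treated"]
        ... )
    """
```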
"The `test` argument cannot be `PermutationTest`. Use a base test like `WilcoxonTest` or `TTest`." | ||
) | ||
|
||
def call_test(data_baseline, data_comparison, axis: int | None = None, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grst is this resolved?
@grst @maltekuehl what's the status of this PR now?
I still need to give it a proper review, didn't have the time yet.
```
@@ -71,16 +73,16 @@ def _compare_single_group(
     x0 = x0.tocsc()
     x1 = x1.tocsc()

-    res = []
+    res: list[dict[str, float | np.ndarray]] = []
```
How could this be an ndarray? Doesn't `_test` always return a float?
```
n_permutations: int = 1000,
permutation_test_statistic: type["SimpleComparisonBase"] | None = None,
```
I don't think this should be in the function signature of the base class.
I remember that we had a discussion about this with respect to usability, i.e. you didn't want this to be hidden in `test_kwargs`. This is fair, but here are two alternative ways this could be achieved:

- Instead of `test_kwargs: Mapping`, pass a `**kwargs` to `compare_groups` which is passed on to the test function.
- Instead of using classmethods/staticmethods, go with an OOP approach. Test-specific parameters go into the constructor. The usage would become:

```diff
-res = TTest.compare_groups(adata, "A", "B")
+res = TTest().compare_groups(adata, "A", "B")
-res = PermutationTest.compare_groups(adata, "A", "B", n_permutations=1000)
+res = PermutationTest(n_permutations=1000).compare_groups(adata, "A", "B")
```
```
if permutation_test_statistic:
    test_kwargs = dict(test_kwargs)
    test_kwargs.update({"test_statistic": permutation_test_statistic, "n_permutations": n_permutations})
elif permutation_test_statistic is None and cls.__name__ == "PermutationTest":
    logger.warning("No permutation test statistic specified. Using TTest statistic as default.")
```
With either solution mentioned above, this wouldn't be necessary here.
"The `test` argument cannot be `PermutationTest`. Use a base test like `WilcoxonTest` or `TTest`." | ||
) | ||
|
||
def call_test(data_baseline, data_comparison, axis: int | None = None, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't get why you would want to use a permutation test with a test statistics. If I'm interested in the difference in means, I would use the difference in means as statistics.
In my understanding, the whole point of using Wilcoxon or T test is that one can compare against a theoretical distribution, avoiding permutations in the first place.
In any case, I would prefer passing a simple lambda over accepting another pertpy test class. This could be a simple
test_statistic: Callable[[np.ndarray, np.ndarray], float] = lambda x,y: np.log2(np.mean(x)) - np.log2(np.mean(y))
or if you really want to use a t-statistics
test_statistic = lambda x, y: scipy.stats.ttest_ind(x, y).statistic
```
@@ -33,7 +35,7 @@ def fdr_correction(
 class SimpleComparisonBase(MethodBase):
     @staticmethod
     @abstractmethod
-    def _test(x0: np.ndarray, x1: np.ndarray, paired: bool, **kwargs) -> float:
+    def _test(x0: np.ndarray, x1: np.ndarray, paired: bool, **kwargs) -> dict[str, float]:
         """Perform a statistical test between values in x0 and x1.

         If `paired` is True, x0 and x1 must be of the same length and ordered such that
```
Please document the return type, e.g.

```
Returns:
    A dictionary metric -> value.
```

This allows returning values for different metrics (e.g. p-value + test statistic).
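A minimal sketch of what such a `_test` implementation could look like, using the t-test variant as an example (the dictionary keys are an assumption, not the final API):

```python
import numpy as np
from scipy import stats


def _test(x0: np.ndarray, x1: np.ndarray, paired: bool, **kwargs) -> dict[str, float]:
    """Sketch: return the test statistic alongside the p-value."""
    res = stats.ttest_rel(x0, x1, **kwargs) if paired else stats.ttest_ind(x0, x1, **kwargs)
    return {"statistic": float(res.statistic), "p_value": float(res.pvalue)}
```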
PR Checklist

- docs is updated

Description of changes

Adds a `PermutationTest` to the tools submodule, similar to `TTest` and `WilcoxonTest`.

Usage:
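(The original usage snippet did not survive extraction; the following is a hedged reconstruction based on the parameters discussed in this thread, `n_permutations` and `permutation_test_statistic`, and may not match the final API exactly.)

```python
import pertpy as pt

# hypothetical call, following the explicit parameters discussed above
res = pt.tools.PermutationTest.compare_groups(
    adata,
    column="condition",
    baseline="control",
    groups_to_compare=["treated"],
    n_permutations=1000,
    permutation_test_statistic=pt.tools.WilcoxonTest,
)
```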
Technical details

- `compare_groups` now takes the number of permutations and the test to use after permutation as explicit parameters, and parallelizes processing.
- The default test statistic is `WilcoxonTest`.

Additional context

Part of the scverse x owkin hackathon.