Add permutation test #726


Open · wants to merge 15 commits into main

Conversation

@maltekuehl commented Mar 18, 2025

PR Checklist

  • Referenced issue is linked
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated

Description of changes

Adds a PermutationTest to the tools submodule, similar to TTest and WilcoxonTest.

Usage:

result = PermutationTest.compare_groups(
    pdata,
    column="group",
    baseline="A",
    groups_to_compare=["B"],
    test=pertpy.tools.WilcoxonTest, # optional, defaults to WilcoxonTest
    n_permutations=100, # optional, defaults to 100
    seed=42, # optional, defaults to 0
)

Technical details

  • Currently, no reference p-values exist for other permutation tests to compare against, and the standard Wilcoxon values from R deviate from the results of the PermutationTest. However, there is full agreement on which genes are significant, and I have adapted the test to check for this.
  • I needed to reimplement compare_groups to expose the number of permutations and the test to use after permutation as explicit parameters, and to parallelize processing.
  • Users who don't want the default WilcoxonTest need to provide the test to use after permutation themselves.

Additional context

Part of the scverse x owkin hackathon.

@Zethson (Member) commented Mar 18, 2025

Cool! Let me know when you want me or Gregor to have a look, please.

@maltekuehl (Author) commented Mar 18, 2025

@Zethson @grst based on the tests I ran locally, this version should now pass. However, there seems to be a docs issue unrelated to my code changes (see other recent PRs), because a dependency cannot be installed. So from my side, you can go ahead and take a look already and let me know if anything needs to be adapted.

Another idea I had: it would be interesting to be able to compare any values in obs and obsm as well. One use case: if you have a spatial transcriptomics image for each sample within a group, you can calculate Moran's I at the sample level (for each gene or for a single gene of interest). You may want to store this data not in its own pdata but rather in the metadata, so making compare_groups flexible enough to not be restricted to pdata.var_names for the compared variables would be a nice addition.
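Until compare_groups supports this, per-sample metadata stored in .obs can of course be compared directly; a minimal stand-in sketch (the "morans_i" column is hypothetical and used purely for illustration):

import scipy.stats

# Compare a hypothetical per-sample metadata column between the two groups.
mask_a = pdata.obs["group"] == "A"
mask_b = pdata.obs["group"] == "B"
res = scipy.stats.mannwhitneyu(
    pdata.obs.loc[mask_a, "morans_i"],
    pdata.obs.loc[mask_b, "morans_i"],
)
print(res.pvalue)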

@codecov-commenter commented Mar 18, 2025

Codecov Report

Attention: Patch coverage is 90.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 72.03%. Comparing base (3087d4e) to head (8ae69ce).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
...ols/_differential_gene_expression/_simple_tests.py 88.88% 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #726       +/-   ##
===========================================
+ Coverage   56.10%   72.03%   +15.92%     
===========================================
  Files          47       48        +1     
  Lines        6149     5507      -642     
===========================================
+ Hits         3450     3967      +517     
+ Misses       2699     1540     -1159     
Files with missing lines Coverage Δ
pertpy/tools/__init__.py 86.11% <100.00%> (-11.04%) ⬇️
...py/tools/_differential_gene_expression/__init__.py 100.00% <100.00%> (ø)
...y/tools/_differential_gene_expression/_pydeseq2.py 92.30% <100.00%> (ø)
...ols/_differential_gene_expression/_simple_tests.py 93.51% <88.88%> (-2.87%) ⬇️

... and 20 files with indirect coverage changes


@Zethson (Member) left a comment

Cool!

So, name-wise there's some overlap with https://pertpy.readthedocs.io/en/stable/usage/tools/pertpy.tools.DistanceTest.html#pertpy.tools.DistanceTest, which also labels itself as a permutation test, but in a different way. We also have to resolve this naming clash: DistanceTest currently sits under the title "distances and permutation test", which I consider an issue. We should only use this label once, or be more specific.

fit_kwargs: Additional kwargs passed to the test function.
test_kwargs: Additional kwargs passed to the test function.
"""
if len(fit_kwargs):
@Zethson (Member):

I don't get the point of this. Is it required? Can this be fixed upstream, i.e. in the interface, by making it optional?

@maltekuehl (Author):

This is inherited from the base class and is not a new introduction of this PR. MethodBase is also used for linear models, which require the fit_kwargs. I will change the docstring to indicate that these are not used for the simple tests.

@Zethson (Member) commented Mar 18, 2025

If you merge main into this, the RTD job will work again.

@grst (Collaborator) commented Mar 18, 2025

@maltekuehl, could the parallelization you implemented be pushed up to the abstract base class? Then wilcoxon and ttest would also benefit from it. In that case, would it even be necessary to re-implement the interfaces specifically for the permutation test, or could you just use an extension class?

@maltekuehl (Author):
@grst good idea to push this upstream. The reason I had to recreate the compare_groups function was that I wanted to explicitly expose the seed, test and n_permutations parameters, as these are key to how the permutation test works and should imo not just be passed through test_kwargs without further documentation. We could, however, add these as unused parameters on the base class for the other classes, or we could move the functionality of compare_groups into a helper function and then just overwrite the call to that helper with an update of the test_kwargs. The latter would mean having essentially the same parameters on both the function and the helper, perhaps leading to unnecessary code duplication. What are your thoughts?
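A minimal sketch of that helper idea, assuming a shared _run_comparison helper (the name and signatures are hypothetical, not the PR's actual code):

from collections.abc import Mapping
from types import MappingProxyType


class SimpleComparisonBase:
    @classmethod
    def _run_comparison(cls, adata, column, baseline, groups_to_compare, test_kwargs):
        # Shared per-group comparison logic would live here (stubbed in this sketch).
        raise NotImplementedError

    @classmethod
    def compare_groups(cls, adata, column, baseline, groups_to_compare, *,
                       test_kwargs: Mapping = MappingProxyType({})):
        # Default implementation just forwards to the shared helper.
        return cls._run_comparison(adata, column, baseline, groups_to_compare, test_kwargs)


class PermutationTest(SimpleComparisonBase):
    @classmethod
    def compare_groups(cls, adata, column, baseline, groups_to_compare, *,
                       n_permutations: int = 100, seed: int = 0,
                       test_kwargs: Mapping = MappingProxyType({})):
        # Expose permutation-specific parameters explicitly, then merge them
        # into test_kwargs before delegating to the shared helper.
        test_kwargs = {**test_kwargs, "n_permutations": n_permutations, "seed": seed}
        return cls._run_comparison(adata, column, baseline, groups_to_compare, test_kwargs)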

@maltekuehl (Author):
@Zethson what would you suggest name-wise? The docs mention a pertpy.tools.PermutationTest, but this does not actually seem to be implemented in the code, hence I went with this name. The docs should be updated, but that relates to the distance functionality, which you are more familiar with. The docstring for DistanceTest also mentions permutation tests, but that could perhaps be rephrased to "Monte Carlo simulation" or be specified in some other way. For now, however, I think that outside of the non-existent (or only now created) function in the docs, there is little potential for confusion.

@maltekuehl requested a review from Zethson March 18, 2025 16:34
…turning statistic from tests and fix bug where the permutation_test was not applied
elif permutation_test is None and cls.__name__ == "PermutationTest":
logger.warning("No permutation test specified. Using WilcoxonTest as default.")

comparison_indices = [_get_idx(column, group_to_compare) for group_to_compare in groups_to_compare]
@grst (Collaborator):

If I get this right, this implements parallelism only at the level of comparisons. This means that if there's only one comparison, there is no benefit from parallelization. I think it would be more beneficial to implement parallelism at the level of variables, i.e. in _compare_single_group at for var in tqdm(self.adata.var_names):
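A minimal sketch of what per-variable parallelism could look like with joblib (the compare_all_vars helper is hypothetical; test is any per-variable function returning a dict):

import numpy as np
from joblib import Parallel, delayed


def compare_all_vars(x0: np.ndarray, x1: np.ndarray, var_names, test, n_jobs: int = -1):
    # Run `test` for every variable (column) of x0/x1 in parallel.
    results = Parallel(n_jobs=n_jobs)(
        delayed(test)(x0[:, i], x1[:, i]) for i in range(len(var_names))
    )
    return [{"variable": var, **res} for var, res in zip(var_names, results)]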

@grst (Collaborator):

The permutation_test also has a vectorized argument which allows passing entire matrices to the test. This likely speeds up testing quite a bit when whole blocks of data are processed together. Maybe we even get implicit parallelism through numpy's BLAS backend when doing so.
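For illustration, a minimal sketch of the vectorized call with scipy, on synthetic data; one call tests all genes at once:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x0 = rng.normal(size=(30, 200))  # baseline group: 30 samples x 200 genes
x1 = rng.normal(size=(25, 200))  # comparison group: 25 samples x 200 genes


def mean_diff(a, b, axis):
    # The statistic must accept an `axis` argument when vectorized=True.
    return np.mean(a, axis=axis) - np.mean(b, axis=axis)


res = stats.permutation_test(
    (x0, x1),
    mean_diff,
    vectorized=True,
    permutation_type="independent",
    n_resamples=1000,
    axis=0,
)
print(res.pvalue.shape)  # one p-value per gene: (200,)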

@grst (Collaborator):

It probably requires a bit of testing to find out what's faster: per-variable parallelism, just passing the entire array into stats.permutation_test(..., vectorized=True), or splitting the data into chunks and then passing them to permutation_test(..., vectorized=True).
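For the chunked variant, a sketch along these lines could be benchmarked against the other two options (chunked_permutation_test is hypothetical):

import numpy as np
from joblib import Parallel, delayed
from scipy import stats


def chunked_permutation_test(x0, x1, statistic, n_chunks=8, n_resamples=1000, n_jobs=-1):
    # Split the gene axis into blocks and run the vectorized test per block,
    # optionally in parallel across blocks.
    chunks0 = np.array_split(x0, n_chunks, axis=1)
    chunks1 = np.array_split(x1, n_chunks, axis=1)
    results = Parallel(n_jobs=n_jobs)(
        delayed(stats.permutation_test)(
            (c0, c1), statistic, vectorized=True,
            permutation_type="independent", n_resamples=n_resamples, axis=0,
        )
        for c0, c1 in zip(chunks0, chunks1)
    )
    return np.concatenate([r.pvalue for r in results])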

if paired:
return scipy.stats.wilcoxon(x0, x1, **kwargs).pvalue
return scipy.stats.wilcoxon(x0, x1, **kwargs).__getattribute__(return_attribute)
@grst (Collaborator):

I think there could be value in returning multiple attributes, not just the p-value. In particular for the permutation test, but also for the t-test, it would be nice to have the t-statistic alongside the p-value. I therefore suggest changing the signature of _test to return a dictionary or named tuple instead.

This can then be included in https://github.com/scverse/pertpy/pull/726/files#diff-5892917e4e62a1165dda9ac148f802a12e3a95735a367b5e1bf771cb228fcd0dR86 as

res.append({"variable": var, "log_fc": np.log2(mean_x1) - np.log2(mean_x0), **test_result})
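A minimal sketch of such a _test for the t-test (the key names "p_value" and "statistic" are illustrative, not the PR's final schema):

import numpy as np
import scipy.stats


class TTest:
    @staticmethod
    def _test(x0: np.ndarray, x1: np.ndarray, paired: bool, **kwargs) -> dict[str, float]:
        # Return both the p-value and the test statistic instead of a bare float.
        test_fun = scipy.stats.ttest_rel if paired else scipy.stats.ttest_ind
        res = test_fun(x0, x1, **kwargs)
        return {"p_value": res.pvalue, "statistic": res.statistic}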

"The `test` argument cannot be `PermutationTest`. Use a base test like `WilcoxonTest` or `TTest`."
)

def call_test(data_baseline, data_comparison, axis: int | None = None, **kwargs):
@grst (Collaborator):

Do you really need another test here? Essentially, the statistic we want to test is the fold change.

@maltekuehl (Author):

The advantage of permutation tests is their generalizability without (almost) any assumptions, as long as the test statistic is related to the null hypothesis we want to test. I would thus view the ability to use any statistic, either those already implemented in pertpy or any callable that accepts two NDArrays and **kwargs, as core functionality. With the latest update, this PR supports any statistic; e.g., it would be trivial to use a comparison of means, of medians, or any other function that you can implement in < 5 lines of numpy code with the PermutationTest. I opted for the Wilcoxon statistic as a default because the rank sum is fairly general and it's already implemented in pertpy. Of course, we could also add an explicit collection of other statistics, but it could never cover all use cases, and defining this statistic should be part of the thought process when a user uses the permutation test, so I'm not convinced of the value and necessity of covering this as part of the library itself.
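For illustration, a custom statistic of the kind described, with a hypothetical call (the exact keyword may differ in the final API):

import numpy as np


def median_diff(x0: np.ndarray, x1: np.ndarray, **kwargs) -> float:
    # Any callable taking two arrays (plus **kwargs) can serve as the statistic.
    return float(np.median(x1) - np.median(x0))


# Hypothetical usage; the statistic keyword is illustrative:
# result = PermutationTest.compare_groups(
#     pdata, column="group", baseline="A", groups_to_compare=["B"],
#     permutation_test_statistic=median_diff, n_permutations=1000,
# )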

@Zethson (Member):

@grst is this resolved?

@maltekuehl (Author):

We changed this to the t-statistic after an in-person discussion, as the Wilcoxon statistic would just be the Wilcoxon test. From what I understood of that discussion, the rationale for keeping this function was agreed on, but perhaps @grst can confirm again.

@grst (Collaborator):

I still don't get why you would want to use a permutation test with a test statistic. If I'm interested in the difference in means, I would use the difference in means as the statistic.

In my understanding, the whole point of using the Wilcoxon or t-test is that one can compare against a theoretical distribution, avoiding permutations in the first place.

In any case, I would prefer passing a simple lambda over accepting another pertpy test class. This could be a simple

test_statistic: Callable[[np.ndarray, np.ndarray], float] = lambda x, y: np.log2(np.mean(x)) - np.log2(np.mean(y))

or, if you really want to use a t-statistic,

test_statistic = lambda x, y: scipy.stats.ttest_ind(x, y).statistic

@Zethson (Member) left a comment

A few more minor requests, please.

fit_kwargs: Mapping = MappingProxyType({}),
test_kwargs: Mapping = MappingProxyType({}),
) -> DataFrame:
"""Perform a comparison between groups.

Args:
@Zethson (Member):

Please remove all typing from the docstring here. We only type the function signature, which is enough for Sphinx.

n_permutations (int): Number of permutations to perform if a permutation test is used.
permutation_test_statistic (type[SimpleComparisonBase] | None): Test to use after permutation if a
permutation test is used.
fit_kwargs (Mapping): Not used for simple tests.
@Zethson (Member):

Not the most informative docstring. Can we make that clearer, please?


This function relies on another test (e.g. WilcoxonTest) to generate a test statistic for each permutation.

.. code-block:: python
@Zethson (Member):

Pretty sure that we're not using code blocks but a different syntax for the examples. Could you please harmonize it here? I also think that the examples always come last in the docstring.
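For reference, a doctest-style Examples section (assuming that is the convention used elsewhere in pertpy) might look like this:

def compare_groups(pdata, column, baseline, groups_to_compare):
    """Perform a permutation test.

    This function relies on a test statistic (e.g. the t-statistic) computed
    for each permutation.

    Examples:
        >>> import pertpy as pt
        >>> result = pt.tools.PermutationTest.compare_groups(
        ...     pdata, column="group", baseline="A", groups_to_compare=["B"]
        ... )
    """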

"The `test` argument cannot be `PermutationTest`. Use a base test like `WilcoxonTest` or `TTest`."
)

def call_test(data_baseline, data_comparison, axis: int | None = None, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grst is this resolved?

@grst self-requested a review April 12, 2025 19:00
@Zethson (Member) commented Apr 28, 2025

@grst @maltekuehl what's the status of this PR now?

@grst (Collaborator) commented Apr 28, 2025

I still need to give it a proper review; I haven't had the time yet.

@maltekuehl (Author):
@Zethson from my side, all of your comments are addressed, but I'm waiting on comments from @grst. I hadn't checked back because I'm on holiday, but I can make any necessary changes next week.

@@ -71,16 +73,16 @@ def _compare_single_group(
x0 = x0.tocsc()
x1 = x1.tocsc()

res = []
res: list[dict[str, float | np.ndarray]] = []
@grst (Collaborator):

How could this be an ndarray? Doesn't _test always return a float?

Comment on lines +99 to +100
n_permutations: int = 1000,
permutation_test_statistic: type["SimpleComparisonBase"] | None = None,
@grst (Collaborator):

I don't think this should be in the function signature of the base class.

I remember that we had a discussion about this with respect to usability, i.e. you didn't want this to be hidden in test_kwargs. This is fair, but here are two alternative ways it could be achieved (see the sketch after this list):

  • Instead of test_kwargs: Mapping, pass a **kwargs to compare_groups which will be passed on to the test function.
  • Instead of using classmethods/staticmethods, go with an OOP approach. Test-specific parameters go into the constructor. The usage would become
    -res = TTest.compare_groups(adata, "A", "B")
    +res = TTest().compare_groups(adata, "A", "B")

    -res = PermutationTest.compare_groups(adata, "A", "B", n_permutations=1000)
    +res = PermutationTest(n_permutations=1000).compare_groups(adata, "A", "B")
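A minimal sketch of that OOP variant (illustrative only; class and parameter names follow the discussion above, not the PR's actual code):

import numpy as np


class SimpleComparisonBase:
    def compare_groups(self, adata, column: str, baseline: str, groups_to_compare):
        # Shared comparison logic would live here (stubbed in this sketch).
        raise NotImplementedError


class PermutationTest(SimpleComparisonBase):
    def __init__(self, n_permutations: int = 1000, seed: int = 0):
        # Permutation-specific settings are explicit constructor arguments,
        # so they are documented and typed instead of hidden in test_kwargs.
        self.n_permutations = n_permutations
        self.rng = np.random.default_rng(seed)


# Usage:
# res = PermutationTest(n_permutations=1000).compare_groups(adata, "group", "A", ["B"])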

Comment on lines +149 to +154
if permutation_test_statistic:
test_kwargs = dict(test_kwargs)
test_kwargs.update({"test_statistic": permutation_test_statistic, "n_permutations": n_permutations})
elif permutation_test_statistic is None and cls.__name__ == "PermutationTest":
logger.warning("No permutation test statistic specified. Using TTest statistic as default.")

@grst (Collaborator):

With either solution mentioned above, this wouldn't be necessary here.

"The `test` argument cannot be `PermutationTest`. Use a base test like `WilcoxonTest` or `TTest`."
)

def call_test(data_baseline, data_comparison, axis: int | None = None, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't get why you would want to use a permutation test with a test statistics. If I'm interested in the difference in means, I would use the difference in means as statistics.

In my understanding, the whole point of using Wilcoxon or T test is that one can compare against a theoretical distribution, avoiding permutations in the first place.

In any case, I would prefer passing a simple lambda over accepting another pertpy test class. This could be a simple

test_statistic: Callable[[np.ndarray, np.ndarray], float] = lambda x,y: np.log2(np.mean(x)) - np.log2(np.mean(y))

or if you really want to use a t-statistics

test_statistic = lambda x, y: scipy.stats.ttest_ind(x, y).statistic

@@ -33,7 +35,7 @@ def fdr_correction(
class SimpleComparisonBase(MethodBase):
@staticmethod
@abstractmethod
def _test(x0: np.ndarray, x1: np.ndarray, paired: bool, **kwargs) -> float:
def _test(x0: np.ndarray, x1: np.ndarray, paired: bool, **kwargs) -> dict[str, float]:
"""Perform a statistical test between values in x0 and x1.

If `paired` is True, x0 and x1 must be of the same length and ordered such that
@grst (Collaborator):

Please document the return type, e.g.

Returns:
    A dictionary metric -> value.
    This allows returning values for different metrics (e.g. p-value + test statistic).
