Conversation

@rhazn (Contributor) commented Jan 20, 2025

Closes #456

I had a look at providing a PR, but I am unsure whether I broke more than I assumed or am misunderstanding something. If I rerun the examples from the documentation, I get somewhat different p-values (I updated the docs with my values in this PR), and I am seeing a test failure with slightly mismatched values in tests/test_pairwise.py:435:

E       Max absolute difference among violations: 0.021
E       Max relative difference among violations: 0.04347826
E        ACTUAL: array([0.912, 0.96 , 0.462])
E        DESIRED: array([0.91 , 0.951, 0.483])

I will be able to spend some more time on this in a few days, but until then I was wondering whether this might be caused by a setup issue on my side that you can immediately spot and help me resolve?

codecov bot commented Jan 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.54%. Comparing base (1c1c8bb) to head (8727773).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #457   +/-   ##
=======================================
  Coverage   98.54%   98.54%           
=======================================
  Files          19       19           
  Lines        3360     3360           
  Branches      492      492           
=======================================
  Hits         3311     3311           
  Misses         26       26           
  Partials       23       23           


@remrama (Contributor) commented Jan 20, 2025

Hi @rhazn, are you sure this is meant to close #45? Maybe that was a typo, because #45 deals with ANOVAs and was closed years ago. If I'm right, please edit the original post to reference the relevant issue.

As for the failing tests, when I make the same single change at L493 as you did, I get the same output for all docs/tests as was originally there, i.e. it doesn't require the small changes you added. But you passed the 3.9 tests (only), so I guess those changes helped with that? I'm not fully sure what the deal is, but just letting you know what I found after a quick check. I'm using Python 3.11 on Windows, for what that's worth.

@rhazn (Contributor, Author) commented Jan 21, 2025

Ah sorry, you are correct. I think I accidentally deleted a number; this PR is in reference to #456.

Sadly, I am not well versed in Python development, so I might struggle with this more than average. I am working on macOS; when I run the docs examples with Python 3.11 in my REPL, I get the following output:

Python 3.11.11 (main, Jan 14 2025, 23:36:41) [Clang 19.1.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import pingouin as pg
>>> x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
>>> y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])
>>> pg.wilcoxon(x, y, alternative='two-sided')
          W_val alternative     p_val       RBC      CLES
Wilcoxon   20.5   two-sided  0.288086 -0.378788  0.395833
>>> pg.wilcoxon(x - y)
          W_val alternative     p_val       RBC  CLES
Wilcoxon   20.5   two-sided  0.288086 -0.378788   NaN

Though with 3.9:

Python 3.9.21 (main, Jan 14 2025, 23:33:15) 
[Clang 19.1.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import pingouin as pg
>>> x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
>>> y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])
>>> pg.wilcoxon(x, y, alternative='two-sided')
          W_val alternative     p_val       RBC      CLES
Wilcoxon   20.5   two-sided  0.285765 -0.378788  0.395833
>>> pg.wilcoxon(x - y)
          W_val alternative     p_val       RBC  CLES
Wilcoxon   20.5   two-sided  0.285765 -0.378788   NaN
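
A minimal check I can paste into both REPLs (a sketch, assuming the difference comes from the installed library versions rather than from my changes) to see which interpreter and which NumPy/SciPy/pingouin versions are actually in use, since the p-value reported above is computed by scipy.stats.wilcoxon:

# Sketch: print the active interpreter and library versions in each REPL.
import sys
import numpy, scipy, pingouin

print(sys.executable)            # active interpreter / venv
print("numpy   ", numpy.__version__)
print("scipy   ", scipy.__version__)
print("pingouin", pingouin.__version__)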

Maybe this is something that is better discussed in a Q&A than in a PR, but it feels like I am making a mistake with my Python venv that is preventing me from providing a good PR 😅.

@remrama (Contributor) commented Jan 22, 2025

Hmm, I'm not sure what is going on between the Python versions either... 🤔

@raphaelvallat (Owner) left a comment

Thanks for the PR! I've fixed (= disabled) the unrelated failing tests in #460. I believe they were caused by differences in the p-value calculation of the scipy.stats.wilcoxon function starting from SciPy 1.14+. Merging this PR now.
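
For reference, a minimal sketch (not code from this PR, just reusing the arrays from the REPL sessions above) that calls scipy.stats.wilcoxon directly, which is where the version-dependent p-value comes from:

# Sketch: run the same Wilcoxon test directly through SciPy. The default
# method="auto" may resolve differently across SciPy releases when the paired
# differences contain zeros or ties; method="approx" pins the normal
# approximation for comparison.
import numpy as np
import scipy
from scipy.stats import wilcoxon

x = np.array([20, 22, 19, 20, 22, 18, 24, 20, 19, 24, 26, 13])
y = np.array([38, 37, 33, 29, 14, 12, 20, 22, 17, 25, 26, 16])

print("SciPy", scipy.__version__)
print(wilcoxon(x, y, alternative="two-sided"))                   # method="auto" (default)
print(wilcoxon(x, y, alternative="two-sided", method="approx"))  # normal approximation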

@raphaelvallat raphaelvallat merged commit 65ae5e4 into raphaelvallat:main Mar 14, 2025
12 checks passed

Development

Successfully merging this pull request may close these issues.

Matched-pairs rank biserial correlation for the wilcoxon signed-rank test
