Conversation

jorisvandenbossche
Member

@jorisvandenbossche jorisvandenbossche commented Oct 19, 2025

Follow-up on #62727, resolving this for the remaining functions in the top-level namespace (and for some of the submodules where there were only a few to address, i.e. pandas.testing and pandas.api.extensions).

xref #55178

@rhshadrach
Member

@jorisvandenbossche - I've addressed the remaining functions.

Member

@rhshadrach rhshadrach left a comment

lgtm (assuming green)

@jorisvandenbossche
Member Author

We get a doctest failure for one of the plotting functions touched here:

______________________ [doctest] pandas.plotting.lag_plot ______________________
EXAMPLE LOCATION UNKNOWN, not showing all tests of that example
??? >>> pd.plotting.lag_plot(s, lag=1)
Expected:
    <Axes: xlabel='y(t)', ylabel='y(t + 1)'>
Got:
    <Axes: title={'center': 'width'}, xlabel='y(t)', ylabel='y(t + 1)'>

This is not happening on main, and I don't see any change in the environment either. But I also don't directly see how this change is related.

@rhshadrach
Member

@jorisvandenbossche - this happens when we are ignoring some validation check. We need to update the path to the new one that is used in the __module__ attribute. I can take a look later today.

@jorisvandenbossche
Member Author

Ah yes. But I also don't see any skips for doctests, either for plotting in general or for lag_plot specifically?

@rhshadrach
Member

rhshadrach commented Oct 20, 2025

Indeed, this is very weird. Setting the __module__ to pandas.plotting seems to add title={'center': 'width'} to the repr of plt.gca(). Setting it to pandas.foo does not modify the repr, whereas setting it to any other submodule of pandas (e.g. pandas.core) skips the test entirely. However, this only seems to have an impact in the doctests; I am not seeing the repr change at all, either when running from a .py file or when running plain pytest.

I'm guessing there is something odd going on with either the pytest or the matplotlib setup. The change in the repr seems harmless and doesn't happen during normal runtime. Maybe just update the docstring?

@rhshadrach
Member

rhshadrach commented Oct 20, 2025

Alright, this is on our end. The only thing set_module is doing is reordering the doctests. I'm finding that running the hist_frame doctest prior to lag_plot is the culprit.

Reproducer (whether the __module__ is set or not):

import numpy as np
import pandas as pd

# Body of the hist_frame doctest, which runs first
data = {
    "length": [1.5, 0.5, 1.2, 0.9, 3],
    "width": [0.7, 0.2, 0.15, 0.2, 1.1],
}
index = ["pig", "rabbit", "duck", "chicken", "horse"]
df = pd.DataFrame(data, index=index)
hist = df.hist(bins=3)

# Body of the lag_plot doctest, which runs second
np.random.seed(5)
x = np.cumsum(np.random.normal(loc=1, scale=5, size=50))
s = pd.Series(x)
print(repr(pd.plotting.lag_plot(s, lag=1)))
# <Axes: title={'center': 'width'}, xlabel='y(t)', ylabel='y(t + 1)'>

@rhshadrach
Member

rhshadrach commented Oct 20, 2025

A bit more clarity: it's the doctest of table that is calling fig, ax = plt.subplots(). This appears to clear the axis cache that matplotlib is using.
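
For anyone following along, here is a minimal sketch of the pyplot behaviour that seems to be at play. It uses plain matplotlib rather than the actual doctests: the two-panel plt.subplots(1, 2) call is only a stand-in for the figure that df.hist leaves open, and the sketch assumes the plotting function falls back to plt.gca() when no ax is passed.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no window is needed
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean pyplot state

# Stand-in for the earlier doctest: a figure with two titled panels stays open.
fig, axes = plt.subplots(1, 2)
axes[0].set_title("length")
axes[1].set_title("width")

# A later call that asks for the "current" axes gets the leftover panel.
title_after_hist = plt.gca().get_title()
print(repr(title_after_hist))  # 'width'

# Creating a new figure makes its fresh axes current instead.
fig2, ax2 = plt.subplots()
title_after_subplots = plt.gca().get_title()
print(repr(title_after_subplots))  # ''

# Closing everything also resets the state; gca() then creates new axes.
plt.close("all")
title_after_close = plt.gca().get_title()
print(repr(title_after_close))  # ''
plt.close("all")
```

This would be consistent with the failure only showing up when no figure-creating doctest runs between the two, though I have not verified that against the actual collection order.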

@rhshadrach
Member

@jorisvandenbossche - fingers crossed this should be good. The issue was that :context: close-figs is only utilized via the sphinx extension and not used during doctests. I believe the resolution is to autouse our cleanup fixture (but only for doctests).

@rhshadrach rhshadrach added this to the 3.0 milestone Oct 21, 2025
Comment on lines 1954 to 1955
@pytest.fixture(autouse=True)
def mpl_cleanup(doctest_namespace):
Member

Could we just change that matplotlib example instead (or add cleanup code in the doctest)? I don't think this fixture is particularly cheap to run for every test.

Member

Ah, I was thinking with doctest_namespace this would only be autoused for doctests. Thanks for catching this. One alternative is to have mpl_cleanup_doctest that is set to autouse=True with the first lines:

    if not isinstance(request.node, pytest.DoctestItem):
        return

This would at least make it cheap (~95ns on my machine), though still called for every test. And we'd still need mpl_cleanup that doesn't have autouse=True.

But I'm more in favor of reverting this and just suppressing the output for this one test. It isn't great that one doctest can impact another (it took a few hours to figure out what was going on here), but apparently it is quite rare for this to have any negative side effects.
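
To illustrate what suppressing the output buys us, here is a toy sketch using only the standard-library doctest module. Everything in it is hypothetical: FakeAxes and the _state dict stand in for matplotlib's global state, and assigning the result to _ is just one possible way to suppress the output, not necessarily the one used in the docstring.

```python
import doctest

# Stand-in for global pyplot state that an earlier doctest may have modified.
_state = {"title": None}


class FakeAxes:
    def __repr__(self):
        if _state["title"] is None:
            return "<Axes: xlabel='y(t)'>"
        return f"<Axes: title={_state['title']!r}, xlabel='y(t)'>"


def plot():
    return FakeAxes()


CHECKED = """
>>> plot()
<Axes: xlabel='y(t)'>
"""

SUPPRESSED = """
>>> _ = plot()
"""


def failures(docstring):
    """Run one docstring as a doctest and return the number of failed examples."""
    test = doctest.DocTestParser().get_doctest(
        docstring, {"plot": plot}, "example", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Swallow the failure report; only the count matters here.
    return runner.run(test, out=lambda text: None).failed


clean_checked = failures(CHECKED)  # passes while the state is clean

_state["title"] = "width"  # simulate a leftover figure from another doctest
dirty_checked = failures(CHECKED)  # the repr is compared, so this now fails
dirty_suppressed = failures(SUPPRESSED)  # nothing is compared, so it passes

print(clean_checked, dirty_checked, dirty_suppressed)  # 0 1 0
```

The trade-off is that the suppressed example no longer checks the return value at all, which seems acceptable for a plotting function whose repr depends on global state.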

Member

But I'm more in favor of reverting this and just suppressing the output for this one test

Yes I would be in favor of this too.

Member Author

An alternative could also be to put (a copy of) that fixture in a conftest.py file in the pandas/plotting directory, so it will (I think?) only be auto-used for tests (and thus only doctests) in that directory?

Member

I like that @jorisvandenbossche - a few things come to mind.

  • This would not cover all docstrings with plotting, but I believe the majority of them.
  • pandas.plotting is public, so maybe we move _core.py and _misc.py into a _base (or some other name) subdirectory. This way the conftest submodule doesn't appear as public.
  • Would prefer adding mpl_cleanup to pandas._testing and then calling this function from the two fixtures.

return InteractiveShell(config=c)


@pytest.fixture
Member

Looks like this was accidentally deleted

@jorisvandenbossche
Member Author

@rhshadrach thanks a lot for diving into the doctest failure! I'm going to merge this now as is. I think we should still consider using the fixture: as you mentioned, this is a very nasty issue to debug when it happens, even if that will be rare, so it is worth having a way to avoid that lost time in the future.

@jorisvandenbossche jorisvandenbossche merged commit d81171b into pandas-dev:main Oct 22, 2025
42 checks passed
@jorisvandenbossche jorisvandenbossche deleted the set-module-remaining-functions branch October 22, 2025 07:58

3 participants