Skip to content

Conversation

ogrisel
Copy link
Contributor

@ogrisel ogrisel commented Sep 3, 2025

Follow-up on #13684.
Partially addresses #7022.

EDIT: in addition to using a 1 s timeout on CI (instead of 0.1 s), there was a problem with truncated stdout and stderr of the pytest subprocess. Flusing those files in Pytester.run seem to fix the problem.

EDIT 2: unfortunately, flushing is not enough. The test can still fail with truncated outputs on ubuntu-py314.

@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 3, 2025

The failure on ubuntu-py314 looks like a real bug:

______________________________ test_timeout[True] ______________________________

pytester = <Pytester PosixPath('/tmp/pytest-of-runner/pytest-0/test_timeout0')>
enabled = True

    @pytest.mark.keep_ci_var
    @pytest.mark.parametrize("enabled", [True, False])
    def test_timeout(pytester: Pytester, enabled: bool) -> None:
        """Test option to dump tracebacks after a certain timeout.
    
        If faulthandler is disabled, no traceback will be dumped.
        """
        pytester.makepyfile(
            """
        import os, time
        def test_timeout():
            time.sleep(5 if "CI" in os.environ else 0.1)
        """
        )
        pytester.makeini(
            """
            [pytest]
            faulthandler_timeout = 0.01
            """
        )
        args = ["-p", "no:faulthandler"] if not enabled else []
    
        result = pytester.runpytest_subprocess(*args)
        tb_output = "most recent call first"
        if enabled:
            result.stderr.fnmatch_lines([f"*{tb_output}*"])
        else:
            assert tb_output not in result.stderr.str()
>       result.stdout.fnmatch_lines(["*1 passed*"])
E       Failed: nomatch: '*1 passed*'
E           and: '============================= test session starts =============================='
E           and: 'platform linux -- Python 3.14.0rc2, pytest-8.5.0.dev98+g16337bc94, pluggy-1.6.0'
E           and: 'rootdir: /tmp/pytest-of-runner/pytest-0/test_timeout0'
E           and: 'configfile: tox.ini'
E           and: 'collected 1 item'
E           and: ''
E           and: 'test_timeout.py '
E       remains unmatched: '*1 passed*'

/home/runner/work/pytest/pytest/testing/test_faulthandler.py:102: Failed
----------------------------- Captured stdout call -----------------------------
running: /home/runner/work/pytest/pytest/.tox/py314-coverage/bin/python -mpytest --basetemp=/tmp/pytest-of-runner/pytest-0/test_timeout0/runpytest-0
     in: /tmp/pytest-of-runner/pytest-0/test_timeout0
============================= test session starts ==============================
platform linux -- Python 3.14.0rc2, pytest-8.5.0.dev98+g16337bc94, pluggy-1.6.0
rootdir: /tmp/pytest-of-runner/pytest-0/test_timeout0
configfile: tox.ini
collected 1 item

test_timeout.py 
----------------------------- Captured stderr call -----------------------------
Timeout (0:00:00.010000)!
Thread 0x00007f9ce2ef0b80 [python] (most recent call first):
  File "<string>"
=================================== XPASSES ====================================
=========================== short test summary info ============================
FAILED testing/test_faulthandler.py::test_timeout[True] - Failed: nomatch: '*1 passed*'
    and: '============================= test session starts =============================='
    and: 'platform linux -- Python 3.14.0rc2, pytest-8.5.0.dev98+g16337bc94, pluggy-1.6.0'
    and: 'rootdir: /tmp/pytest-of-runner/pytest-0/test_timeout0'
    and: 'configfile: tox.ini'
    and: 'collected 1 item'
    and: ''
    and: 'test_timeout.py '
remains unmatched: '*1 passed*'
XPASS testing/_py/test_local.py::TestLocalPath::test_make_numbered_dir_multiprocess_safe - #11603
= 1 failed, 3792 passed, 118 skipped, 12 xfailed, 1 xpassed in 595.25s (0:09:55) =
py314-coverage: exit 1 (600.60 seconds) /home/runner/work/pytest/pytest> coverage run -m pytest pid=2433
  py314-coverage: FAIL code 1 (622.47=setup[21.87]+cmd[600.60] seconds)
  evaluation failed :( (622.54 seconds)
Error: Process completed with exit code 1.

Both the stdout and the stderr contents look truncated, and this causes the lack of expected match for "1 passed".

@ogrisel ogrisel changed the title TST increase faulthandler timeout value in test on CI TST attempt to fix faulthandler timeout tests on CI Sep 3, 2025
@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 3, 2025

It seems that c248f8e fixed the root cause. Let's re-run the test timeout tests a few times to check that this is stable.

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided (automation) changelog entry is part of PR label Sep 3, 2025
@ogrisel ogrisel changed the title TST attempt to fix faulthandler timeout tests on CI TST fix faulthandler timeout tests on CI Sep 3, 2025
@ogrisel ogrisel marked this pull request as ready for review September 3, 2025 10:17
@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 3, 2025

Oh no! It failed again... The flush is not enough.

@ogrisel ogrisel marked this pull request as draft September 3, 2025 10:34
@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 3, 2025

EDIT: not closing fds on linux and mac as done in bb4335d (and letting the external context manager handle it) does not fix the problem either.

I tried to reproduce locally by running the tests under Linux via docker and using python 3.14.0rc2, but I cannot reproduce the truncation.

I am running out of ideas to debug/fix that remaining failure on ubuntu-py314.

@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 8, 2025

I could update this PR to only skip the tests on the linux + py314 combo since the truncation problem was never seen on other platforms.

@nicoddemus
Copy link
Member

I'm out of ideas too.

I could update this PR to only skip the tests on the linux + py314 combo since the truncation problem was never seen on other platforms.

Yes please do. We can always investigate more later.

@ogrisel ogrisel marked this pull request as ready for review September 8, 2025 15:07
@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 8, 2025

@nicoddemus I think this is ready for review.

@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 8, 2025

Let me just push an empty commit to check that the CI is stable.

@ogrisel
Copy link
Contributor Author

ogrisel commented Sep 9, 2025

@nicoddemus I think the CI is stable in this PR.

Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ogrisel for all the hard work on this!

@nicoddemus nicoddemus merged commit 6ab228b into pytest-dev:main Sep 9, 2025
36 checks passed
@ogrisel ogrisel deleted the faulthandler-increase-timeout-in-test branch September 9, 2025 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided (automation) changelog entry is part of PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants