@majiayu000 majiayu000 commented Dec 26, 2025

Fix AdvancedProfiler ValueError when stopping non-started actions

Closes #9136

Summary

The AdvancedProfiler raised a ValueError when attempting to stop profiling an action that was never started. This commonly occurred when using multiple Trainers with a shared profiler instance (e.g., during grid search) where only one trainer runs the test phase. The fix changes the stop() method to gracefully handle this case by logging a debug message and returning early instead of raising an exception.

Changes Made

  • src/lightning/pytorch/profilers/advanced.py:

    • Modified the stop() method to log a debug message and return gracefully when attempting to stop an action that was never started, instead of raising ValueError
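  • tests/tests_pytorch/profilers/test_profiler.py:

    • Added the three new test cases listed under Testing, covering multiple Trainers sharing one profiler, a reused Trainer running the test phase, and stopping an action that was never started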
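
For readers who want to see the shape of the change, here is a minimal sketch of the early-return behavior described above. It is not the code from advanced.py: `SketchProfiler`, its timing-based bookkeeping, and the attribute names are hypothetical stand-ins, and only the idea (log at debug level and return instead of raising ValueError) comes from this PR.

```python
import logging
import time

log = logging.getLogger(__name__)


class SketchProfiler:
    """Hypothetical stand-in for a profiler with start/stop hooks (not Lightning's class)."""

    def __init__(self) -> None:
        self._started: dict[str, float] = {}  # actions currently being recorded
        self.durations: dict[str, list[float]] = {}  # finished recordings per action

    def start(self, action_name: str) -> None:
        self._started[action_name] = time.perf_counter()

    def stop(self, action_name: str) -> None:
        started_at = self._started.pop(action_name, None)
        if started_at is None:
            # Before the fix this situation raised ValueError; here it is a logged no-op.
            log.debug(
                "Attempting to stop recording an action (%s) which was never started.",
                action_name,
            )
            return
        elapsed = time.perf_counter() - started_at
        self.durations.setdefault(action_name, []).append(elapsed)


# One profiler shared across runs, where the test phase never started in this run.
shared = SketchProfiler()
shared.start("run_training_epoch")
shared.stop("run_training_epoch")
shared.stop("run_test_evaluation")  # never started: returns quietly
```

The trade-off in this design is that a mismatched stop() call no longer fails loudly, so the debug log line is the only trace of it.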

Testing

The fix can be verified by:

  1. Running the new test cases:

    pytest tests/tests_pytorch/profilers/test_profiler.py::test_advanced_profiler_multiple_trainers_test_only_one
    pytest tests/tests_pytorch/profilers/test_profiler.py::test_advanced_profiler_reused_trainer_test
    pytest tests/tests_pytorch/profilers/test_profiler.py::test_advanced_profiler_stop_nonexistent_action_no_error
  2. Running the full profiler test suite:

    pytest tests/tests_pytorch/profilers/test_profiler.py -v

Checklist

  • Tests pass locally
  • Code follows project style guidelines
  • No breaking changes

📚 Documentation preview 📚: https://pytorch-lightning--21451.org.readthedocs.build/en/21451/

@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Dec 26, 2025
@bhimrazy bhimrazy marked this pull request as draft January 14, 2026 08:54
Development

Successfully merging this pull request may close these issues.

AdvancedProfiler: ValueError: Attempting to stop recording an action (run_test_evaluation) which was never started.
