fix: AdvancedProfiler: ValueError: Attempting to stop recording a #21451

majiayu000 · 2025-12-26T13:56:01Z

Fix AdvancedProfiler ValueError when stopping non-started actions

Summary

The AdvancedProfiler raised a ValueError when attempting to stop profiling an action that was never started. This commonly occurred when using multiple Trainers with a shared profiler instance (e.g., during grid search) where only one trainer runs the test phase. The fix changes the stop() method to gracefully handle this case by logging a debug message and returning early instead of raising an exception.
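The behavior described above can be illustrated with a minimal, self-contained sketch. It is not the Lightning implementation: `SketchProfiler` is a hypothetical stand-in, the per-action dictionary of `cProfile.Profile` objects is an assumption made for illustration, and the debug message simply reuses the wording of the original error.

```python
import cProfile
import logging

log = logging.getLogger(__name__)


class SketchProfiler:
    """Hypothetical stand-in for AdvancedProfiler; not the Lightning class."""

    def __init__(self) -> None:
        # Assumed layout: one cProfile.Profile per action name, created on start().
        self.profiled_actions = {}

    def start(self, action_name: str) -> None:
        if action_name not in self.profiled_actions:
            self.profiled_actions[action_name] = cProfile.Profile()
        self.profiled_actions[action_name].enable()

    def stop(self, action_name: str) -> None:
        pr = self.profiled_actions.get(action_name)
        if pr is None:
            # Old behavior: raise ValueError. New behavior: log and return early.
            log.debug(
                "Attempting to stop recording an action (%s) which was never started.",
                action_name,
            )
            return
        pr.disable()


shared = SketchProfiler()
shared.stop("run_test_evaluation")  # never started: returns without raising
shared.start("run_training_epoch")
shared.stop("run_training_epoch")  # the normal start/stop path still works
```

Returning early keeps a shared profiler usable across several trainers, at the cost of no longer surfacing a genuinely mismatched start/stop pair unless debug logging is enabled.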

Changes Made

src/lightning/pytorch/profilers/advanced.py:
- Modified the stop() method to log a debug message and return early when asked to stop an action that was never started, instead of raising ValueError

tests/tests_pytorch/profilers/test_profiler.py:
- Added test_advanced_profiler_multiple_trainers_test_only_one: Reproduces the exact bug scenario from issue #9136 ("AdvancedProfiler: ValueError: Attempting to stop recording an action (run_test_evaluation) which was never started."), in which multiple trainers share a profiler and only one of them runs test
- Added test_advanced_profiler_reused_trainer_test: Tests reusing a trainer for multiple test calls with profiling
- Added test_advanced_profiler_stop_nonexistent_action_no_error: Verifies that stopping a non-existent action doesn't raise an error and that the profiler remains functional

Testing

The fix can be verified by:

Running the new test cases:

pytest tests/tests_pytorch/profilers/test_profiler.py::test_advanced_profiler_multiple_trainers_test_only_one
pytest tests/tests_pytorch/profilers/test_profiler.py::test_advanced_profiler_reused_trainer_test
pytest tests/tests_pytorch/profilers/test_profiler.py::test_advanced_profiler_stop_nonexistent_action_no_error

Running the full profiler test suite:

pytest tests/tests_pytorch/profilers/test_profiler.py -v

Checklist

Tests pass locally
Code follows project style guidelines
No breaking changes

📚 Documentation preview 📚: https://pytorch-lightning--21451.org.readthedocs.build/en/21451/

fix: AdvancedProfiler: ValueError: Attempting to stop r...

5907d42

Fixes Lightning-AI#9136 Signed-off-by: majiayu000 <[email protected]>

majiayu000 requested review from ethanwharris, justusschock, lantiga and tchaton as code owners December 26, 2025 13:56

github-actions bot added the pl Generic label for PyTorch Lightning package label Dec 26, 2025

[pre-commit.ci] auto fixes from pre-commit.com hooks

6b9e77b

for more information, see https://pre-commit.ci

bhimrazy marked this pull request as draft January 14, 2026 08:54
