Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions primus/backends/megatron/training/mlflow_artifacts.py
Original file line number Diff line number Diff line change
Expand Up @@ -252,6 +252,12 @@ def _ensure_tracelens_installed() -> bool:
except subprocess.CalledProcessError as e:
warning_rank_0(f"[TraceLens] Failed to install TraceLens: {e}")
return False
except (PermissionError, OSError) as e:
warning_rank_0(f"[TraceLens] Failed to install TraceLens due to system error: {e}")
return False
except Exception as e:
warning_rank_0(f"[TraceLens] Failed to install TraceLens due to unexpected error: {e}")
return False
Comment on lines +255 to +260
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception handling order has a logical issue. Since subprocess.CalledProcessError inherits from Exception, and OSError also inherits from Exception, the current order works. However, there's a subtle problem: subprocess.check_call() typically only raises subprocess.CalledProcessError for pip installation failures. File system and permission issues during the subprocess execution are usually wrapped into CalledProcessError. The OSError and PermissionError would only be raised if there are issues with the Python interpreter path itself or the subprocess module invocation, which are extremely rare cases. Consider whether these additional handlers provide value, or if they make the code unnecessarily complex. The catch-all Exception handler at the end provides the safety net for truly unexpected errors.

Copilot uses AI. Check for mistakes.


def _extract_rank_from_filename(filename: str) -> Optional[int]:
Expand Down
Loading