Releases: AI-Hypercomputer/ml-goodput-measurement
Releases · AI-Hypercomputer/ml-goodput-measurement
Release: ml-goodput-measurement v0.0.15
Changes:
- Refactor the GoodputMonitor to use multiprocessing instead of multithreading.
- Add more context with structured logging for metric uploads to GCM Monitoring.
Full Changelog: v0.0.14...v0.0.15
Release ml-goodput-measurement v0.0.13
PyPi Package can be installed from here
Changelog:
[0.0.13] - 2025-07-25
- Fix occasional gaps in cumulative Goodput Monitor dashboard.
[0.0.12] - 2025-06-25
- Support monitoring disruption badput due to infrastructure recovery.
- Support monitoring ideal step time.
- Code clean up and bug fixes.
[0.0.11] - 2025-06-06
- Support for monitoring performance degradations.
- Support for monitoring rolling window Goodput.
- Force upload of final metrics and safe exit.
[0.0.10] - 2025-04-28
- Support for custom badput events which are synchronous and training-overlapped.
- Handling of edge case caching scenario.
[0.0.9] - SKIPPED
- Used for external testing. Please upgrade to 0.0.10.
[0.0.8] - 2025-04-03
- Fix computation of ideal step time when step_times is empty.
[0.0.7] - 2025-03-24
- Cache updates to Other/Unknown Badput.
- Exclude monitoring asynchronous Badput types in GCM.
- Total and last step updates with hidden events.
- Interval Query Monitoring in GCM.
[0.0.6] - 2025-03-17
- Updates to data loading Badput buckets (Separated into Async & Sync).
- Short term fix to Pathways SuspendResume anomalous step time detection.
- Updates to account for Pathways Elastic Training.
- Automatic asynchronous upload of goodput, badput and step time deviation metrics to GCM.
Release ml-goodput-measurement v0.0.5
[0.0.5] - 2025-02-03
- Goodput Cache and library improvements.
- Query and Monitor API support for checkpoint save and restore.
- Interval Query API support.
- Query and Monitor API support for step time deviation.
PyPi Package can be installed from here
Initial release of ML Goodput Measurement Library
- Bug Fixes
- Fixes a typing mismatch in total step time calculation.
- Code cleanup
Full Changelog: v0.0.1...v0.0.2
Initial release of ML Goodput Measurement PyPi package
- Feature: Contains the Goodput module which allows logging and retrieval of training job's overall productive Goodput