Bert training update #785
Triggered via pull request
February 12, 2025 13:16
Status: Failure
Total duration: 6h 3m 16s
Artifacts: none
test_trainium_distributed.yml
on: pull_request
Run distributed tests on Trainium 1 (6h 0m)
Annotations
2 errors
Run distributed tests on Trainium 1
The job running on runner aws-trn1-32xlarge-use1-public-80-hsr8z-runner-fbmvz has exceeded the maximum execution time of 360 minutes.
Run distributed tests on Trainium 1
The operation was canceled.
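Note on the 360-minute limit: 360 is the default value of `timeout-minutes` for a GitHub Actions job, so the cancellation above is what happens when a job hangs or overruns and no shorter limit is set. A minimal sketch of how explicit limits might be declared follows. The job id, runner label, step name, test command, and the chosen minute values are all assumptions for illustration and are not taken from test_trainium_distributed.yml.

```yaml
# Hypothetical excerpt, not the actual contents of test_trainium_distributed.yml.
name: Run distributed tests on Trainium 1
on: pull_request

jobs:
  distributed-tests:                  # hypothetical job id
    name: Run distributed tests on Trainium 1
    runs-on: self-hosted              # assumption: the real runner label is not shown in this run summary
    timeout-minutes: 120              # assumed budget; cancels the job well before the 360-minute default
    steps:
      - uses: actions/checkout@v4
      - name: Run distributed tests   # hypothetical step
        timeout-minutes: 90           # assumed per-step limit, so a hung test fails this step first
        run: pytest tests/distributed # hypothetical test command and path
```

With a job-level limit like this, a stalled distributed test would likely surface as a failure within two hours rather than holding the Trainium runner for the full six.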