slow macOS - "##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes." #1883

Open
3 tasks
jeffschwMSFT opened this issue Jan 24, 2024 · 7 comments

Comments

@jeffschwMSFT
Member

jeffschwMSFT commented Jan 24, 2024

Build

https://dnceng.visualstudio.com/internal/_build/results?buildId=2360768&view=results

Error

##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134

Build leg reported

vsos

Pull Request

No response

Known issue core information

Fill out the known issue JSON section by following the step by step documentation on how to create a known issue

 {
    "ErrorMessage" : "",
    "BuildRetry": false,
    "ErrorPattern": "The job running on agent Azure Pipelines .+ ran longer than the maximum time of .+ minutes.",
    "ExcludeConsoleLog": false
 }
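As a rough illustration of how the `ErrorPattern` above can be checked against a log line, here is a minimal Python sketch. It assumes the pattern is applied as an unanchored regular-expression search; the actual known-issue tooling is not shown in this issue, so the helper name `matches_known_issue` and the "unrelated" sample line are hypothetical. Note that the trailing `.` in the pattern is unescaped, so it matches any character rather than only a literal period.

```python
import re

# ErrorPattern copied verbatim from the known issue JSON above.
ERROR_PATTERN = (
    r"The job running on agent Azure Pipelines .+ "
    r"ran longer than the maximum time of .+ minutes."
)


def matches_known_issue(log_line: str) -> bool:
    """Return True if the log line contains the known-issue pattern.

    Assumption: the pattern is searched anywhere in the line (not anchored).
    """
    return re.search(ERROR_PATTERN, log_line) is not None


# The error message reported in this issue.
sample = (
    "##[error]The job running on agent Azure Pipelines 9 ran longer than "
    "the maximum time of 60 minutes. For more information, see "
    "https://go.microsoft.com/fwlink/?linkid=2077134"
)

# A hypothetical, unrelated error line for contrast.
unrelated = "##[error]Process completed with exit code 1."

print(matches_known_issue(sample))
print(matches_known_issue(unrelated))
```

Because the agent number and the timeout value are both matched with `.+`, the same pattern would also catch timeouts on other hosted agents and with other time limits, which is consistent with the variety of pipelines listed in the report below.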

@dotnet/dnceng

Release Note Category

  • Feature changes/additions
  • Bug fixes
  • Internal Infrastructure Improvements

Release Note Description

Additional information about the issue reported

No response

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng/internal/_build/results?buildId=2360768
Error message validated: [The job running on agent Azure Pipelines .+ ran longer than the maximum time of .+ minutes.]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 2/7/2024 1:03:13 AM UTC

Report

Build Definition Step Name Console log Pull Request
2674490 dotnet-runtime Performance ios_scenarios iOSMono JIT iOSLlvmBuild iOSStripSymbols osx x64 perfiphone12mini net10.0 Log
983276 dotnet/runtime maccatalyst-arm64 Release AllSubsets_Mono Log dotnet/runtime#113313
996607 dotnet/performance Performance micro windows 22H2 x86 Open 8.0 Log
2673106 dotnet-performance Performance scenarios ubuntu 2204 arm64 Ampere main Log
995094 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47869
995086 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47744
995080 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47873
995064 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47924
995050 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47923
995041 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47888
995040 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47847
995004 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47880
995003 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47879
994818 dotnet/performance Performance akadeindexedset ubuntu 2204 x64 Open 8.0 Log
994506 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47873
994501 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47913
994372 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47767
994394 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47911
994392 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47910
2672330 dotnet-performance Performance nativeaot_scenarios windows Win11 x64 Tiger 8.0 Log
994360 dotnet/roslyn Test_macOS_Debug Log dotnet/roslyn#77835
994369 dotnet/xharness Helix Tests Build_Debug Log
994374 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47869
994353 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47880
994351 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47879
994339 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47909
994337 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47908
994279 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47907
994274 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47833
994148 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47702
994242 dotnet/xharness Helix Tests Build_Debug Log dotnet/xharness#1368
994244 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47874
994233 dotnet/xharness Helix Tests Build_Debug Log dotnet/xharness#1384
994053 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47901
994050 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47902
994047 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47899
994041 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47897
994026 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47896
994025 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47895
994023 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47894
994014 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47893
994008 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47892
993355 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47877
993456 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47879
994000 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47891
994180 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47888
994178 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47847
994049 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47900
993837 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47869
993687 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47883
994089 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47903
993459 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47880
994042 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47898
994030 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47853
994031 dotnet/sdk Darwin_AoT_Tests Build_Release Log dotnet/sdk#47826
993875 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47886
993845 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47885
993840 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47611
993831 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47867
993819 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#46218
993535 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47800
993772 dotnet/runtime osx-arm64 Release NativeAOT_Libraries Log dotnet/runtime#113713
993750 dotnet/runtime osx-arm64 Release NativeAOT_Libraries Log dotnet/runtime#113905
993747 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47873
993731 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47833
993717 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47875
993705 dotnet/runtime osx-arm64 Release NativeAOT_Libraries Log dotnet/runtime#113903
993701 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47744
993698 dotnet/sdk Darwin_AoT_Tests Build_Release Log dotnet/sdk#47884
993691 dotnet/runtime osx-arm64 Release NativeAOT_Libraries Log dotnet/runtime#113190
993357 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47819
993635 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47846
993621 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#46611
993585 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47881
993432 dotnet/runtime osx-arm64 Release NativeAOT Log dotnet/runtime#113893
993393 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47869
993367 dotnet/runtime osx-arm64 Release NativeAOT Log dotnet/runtime#113892
993388 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47702
993377 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47818
993364 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47878
993359 dotnet/performance Performance bepuphysics ubuntu 2204 x64 Open main Log dotnet/performance#4792
993286 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47876
993281 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47833
993259 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47875
993017 dotnet/performance Performance powershell windows 22H2 x64 Open 9.0 Log
2671642 dotnet-performance Performance maui_scenarios_ios CoreCLR osx 14 x64 iPhoneMini12 8.0 Log
993014 dotnet/performance Performance bepuphysics ubuntu 2204 x64 Open 9.0 Log
2671581 dotnet-performance Performance maui_scenarios_android Mono osx 14 x64 Pixel main Log
992884 dotnet/performance Performance micro ubuntu 2204 x64 Open 9.0 Log dotnet/performance#4770
992719 dotnet/performance Performance mlnet ubuntu 2204 x64 Open 8.0 Log
992342 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47855
992341 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47856
992324 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47853
992323 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47852
992319 dotnet/sdk AoT: macOS (x64) Log dotnet/sdk#47851
992300 dotnet/sdk TestBuild: macOS (x64) Log dotnet/sdk#47850
2670980 dotnet-runtime Libraries SuperPMI collection libraries_tests Checked coreclr osx arm64 Release Log
2670838 dotnet-runtime Libraries SuperPMI collection libraries_tests_no_tiered_compilation Checked coreclr osx arm64 Release Log
991473 dotnet/performance Performance micro windows 22H2 x86 Open main Log dotnet/performance#4792
990486 dotnet/runtime osx-x64 Release NativeAOT Log dotnet/runtime#113808
Displaying 100 of 235 results

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
0 | 106 | 235
@lewing
Member

lewing commented Feb 9, 2024

🤔

@nagilson
Member

nagilson commented Oct 3, 2024

@dotnet/dnceng @dougbu This has impacted a lot of PRs recently. Could you please take a look at expanding this Mac resource?

@ivanpovazan
Member

@dotnet/dnceng we are hitting this again, and there seems to be a problem communicating with the Helix machines.

More context on the timeouts in https://dev.azure.com/dnceng-public/public/_build/results?buildId=930014&view=results :

  • On success, the "Run tests in Helix" step reports:
  Waiting for completion of job fbec1164-55c5-4fa8-b492-e1ba1b413119 on osx.1200.amd64.open (Details: https://helix.dot.net/api/jobs/fbec1164-55c5-4fa8-b492-e1ba1b413119/details?api-version=2019-06-17 )
  Job 7fdc2720-3d98-4bf3-8113-274dacd69c91 on osx.1200.arm64.open is completed with 6 finished work items.
  Job fbec1164-55c5-4fa8-b492-e1ba1b413119 on osx.1200.amd64.open is completed with 6 finished work items.
  Stopping Azure Pipelines Test Run Helix Tests Build_Debugosx.1200.amd64.open (Results: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923987&view=ms.vss-test-web.build-test-results-tab )
  Stopping Azure Pipelines Test Run Helix Tests Build_Debugosx.1200.arm64.open (Results: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923987&view=ms.vss-test-web.build-test-results-tab )

Build succeeded.

SENDHELIXJOB : warning : Helix queue osx.1200.amd64.open was set for estimated removal date of 2025-01-01. In most cases the queue will be removed permanently due to end-of-life; please contact dnceng for any questions or concerns, and we can help you decide how to proceed and discuss other options. [/home/vsts/work/1/s/tests/integration-tests/Apple/Simulator.Tests.proj]
SENDHELIXJOB : warning : Helix queue osx.1200.arm64.open was set for estimated removal date of 2025-01-01. In most cases the queue will be removed permanently due to end-of-life; please contact dnceng for any questions or concerns, and we can help you decide how to proceed and discuss other options. [/home/vsts/work/1/s/tests/integration-tests/Apple/Simulator.Tests.proj]
    2 Warning(s)
    0 Error(s)

Time Elapsed 00:03:14.97
Killing running build processes...

Finishing: Run tests in Helix

ref: https://dev.azure.com/dnceng-public/public/_build/results?buildId=923987&view=logs&j=ccc97bb6-1a23-5e71-fdfa-3cdca4a74749&t=27fc7eb2-ead9-59e1-6679-a637855d40c5

  • On failure (timeout), the same step gets stuck at:
  Waiting for completion of job 967d92a2-ec10-4332-927f-d28a6563f367 on osx.1200.arm64.open (Details: https://helix.dot.net/api/jobs/967d92a2-ec10-4332-927f-d28a6563f367/details?api-version=2019-06-17 )
  Job 5427159b-500d-49f1-aac0-ec148a492bbe on osx.1200.amd64.open is completed with 6 finished work items.

ref: https://dev.azure.com/dnceng-public/public/_build/results?buildId=930014&view=logs&s=c58bc33c-b825-5bca-90ca-50f6e9293dd8&j=e6966639-fe40-5068-d9ae-681cccecafdf

NOTE: All the tests passed successfully on Helix, but communication with the Helix machines appears to have been lost.
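To make the difference between the two log excerpts easier to spot, here is a small Python sketch that scans a "Run tests in Helix" step log and lists the jobs that were waited on but never reported as completed. It only relies on the two message shapes visible in the excerpts above; the function name `pending_jobs` and the regular expressions are my own assumptions, not part of the Helix SDK.

```python
import re

# Message shapes taken from the log excerpts above (assumed to be stable).
WAITING = re.compile(
    r"Waiting for completion of job (?P<job>[0-9a-f-]{36}) on (?P<queue>\S+)"
)
COMPLETED = re.compile(
    r"Job (?P<job>[0-9a-f-]{36}) on (?P<queue>\S+) is completed"
)


def pending_jobs(log_text: str) -> dict:
    """Map job id -> queue for jobs that were waited on but never completed."""
    waiting = {}
    completed = set()
    for line in log_text.splitlines():
        match = WAITING.search(line)
        if match:
            waiting[match["job"]] = match["queue"]
        match = COMPLETED.search(line)
        if match:
            completed.add(match["job"])
    return {job: queue for job, queue in waiting.items() if job not in completed}


# Lines copied from the failing build's log excerpt (Details URL omitted).
failure_log = """\
Waiting for completion of job 967d92a2-ec10-4332-927f-d28a6563f367 on osx.1200.arm64.open
Job 5427159b-500d-49f1-aac0-ec148a492bbe on osx.1200.amd64.open is completed with 6 finished work items.
"""

# Lines copied from the successful build's log excerpt (Details URL omitted).
success_log = """\
Waiting for completion of job fbec1164-55c5-4fa8-b492-e1ba1b413119 on osx.1200.amd64.open
Job 7fdc2720-3d98-4bf3-8113-274dacd69c91 on osx.1200.arm64.open is completed with 6 finished work items.
Job fbec1164-55c5-4fa8-b492-e1ba1b413119 on osx.1200.amd64.open is completed with 6 finished work items.
"""

print(pending_jobs(failure_log))
print(pending_jobs(success_log))
```

For the failing excerpt this reports the osx.1200.arm64.open job as still pending, while the successful excerpt leaves nothing pending. That only shows which job the step was stuck on; it does not by itself tell whether the job was queued, running, or finished on the Helix side.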

@garath
Member

garath commented Jan 27, 2025

Looks like the timeout happened because the osx.1200.arm64.open queue was very busy while the job was running. Right now, I do not think there are any problems with the infrastructure. I will investigate a bit more to see what caused such a back-up.

@garath garath self-assigned this Jan 27, 2025
@garath
Member

garath commented Jan 28, 2025

Ah, the queue was consumed with updates and patching. The patching jobs did run longer than necessary and we've communicated with our partner team about the issue. Future jobs will be much shorter and should not overly impact jobs.

@garath garath removed their assignment Jan 29, 2025
@ivanpovazan
Member

ivanpovazan commented Feb 13, 2025

> Looks like the timeout happened because the osx.1200.arm64.open queue was very busy while the job was running. Right now, I do not think there are any problems with the infrastructure. I will investigate a bit more to see what caused such a back-up.

We are still experiencing the problem.

Should we try to change the queue to osx.13.arm64?

@dotnet/dnceng

@ilyas1974
Contributor

Looking at the failing builds, I notice that the jobs timing out in the hosted pool (Azure Pipelines) are still using the older hardware. I would recommend moving the workloads to the mac-latest-internal or mac-14-arm64 agent specifications, which have the latest Mac hardware associated with them.


6 participants