[orchagent] Honor createSwitchTimeout from sai.profile for all platforms #27761
Open
selvipal wants to merge 1 commit into
Collaborator
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
spilkey-cisco
approved these changes
Jun 8, 2026
orchagent.sh only consulted createSwitchTimeout (passed to orchagent as -t) inside a single vendor-specific branch. Make it platform-agnostic: read createSwitchTimeout from the hwsku sai.profile for every platform and pass it to orchagent when present.
A vendor opts in simply by defining createSwitchTimeout in its sai.profile; when the key is absent, behavior is unchanged (orchagent default timeout). Any platform that previously defined the key in its sai.profile continues to work exactly as before.
This is needed on platforms whose SAI create_switch exceeds orchagent's default 60s timeout (for example, ASICs that perform HBM DRAM training during initialization), where the timeout otherwise causes SAI_STATUS_FAILURE and an orchagent abort (swss crash) on boot.
Signed-off-by: selvipal <selvipal@cisco.com>
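The behavior described in this commit message can be sketched roughly as follows. This is an illustration, not the PR's diff: the function names read_create_switch_timeout and append_timeout_arg are hypothetical, and the sai.profile path in the usage comment is an assumption about where the hwsku directory is mounted.

```shell
# Sketch only: function names and argument layout are illustrative and are
# not taken from the actual orchagent.sh change.

# Print the numeric value of createSwitchTimeout from a sai.profile file.
# Prints nothing if the file is unreadable or the key is absent.
read_create_switch_timeout() {
    local profile="$1"
    [ -r "$profile" ] || return 0
    # Match "createSwitchTimeout=<digits>", tolerate surrounding whitespace,
    # and keep the last occurrence if the key appears more than once.
    sed -n 's/^[[:space:]]*createSwitchTimeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$profile" | tail -n 1
}

# Print the orchagent argument string, with "-t <seconds>" appended only
# when the key is defined in the given sai.profile.
append_timeout_arg() {
    local args="$1" profile="$2" timeout
    timeout="$(read_create_switch_timeout "$profile")"
    if [ -n "$timeout" ]; then
        args="$args -t $timeout"
    fi
    printf '%s\n' "$args"
}

# Hypothetical usage (path is an assumption):
#   ORCHAGENT_ARGS="$(append_timeout_arg "$ORCHAGENT_ARGS" /usr/share/sonic/hwsku/sai.profile)"
```

Because the flag is appended only when the key is found, a profile without createSwitchTimeout leaves the argument string untouched, which matches the opt-in behavior described above.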
e44f52e to
ce74474
Compare
Collaborator
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
Why I did it
On platforms with ASICs that have a long SAI create_switch time, create_switch can take longer than orchagent's default 60s create-switch timeout. When that happens, create_switch returns SAI_STATUS_FAILURE, orchagent aborts, and swss crash-loops on boot.
SONiC already supports a per-hwsku createSwitchTimeout knob in sai.profile, but orchagent.sh only consulted it inside a single vendor-specific branch. This change makes the knob platform-agnostic so any vendor whose hardware needs a longer initialization window can opt in, without adding a per-platform check to common code.
Work item tracking
How I did it
Reworked dockers/docker-orchagent/orchagent.sh to read createSwitchTimeout from the hwsku sai.profile for all platforms and pass it to orchagent as -t <seconds>. Removed the vendor-specific block and added a single platform-agnostic block after the per-platform MAC handling.
A vendor opts in purely by defining createSwitchTimeout in its sai.profile; if the key is absent, behavior is unchanged (orchagent default timeout). Any platform that previously defined the key in its sai.profile continues to work exactly as before.
How to verify it
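The process-argument check in this section could be scripted along these lines. This is a sketch under assumptions: the helper names args_have_timeout and check_running_orchagent are hypothetical, and the second helper assumes a ps that supports -C (for example procps) and is run where the orchagent process is visible.

```shell
# Sketch only: helper names are illustrative.

# Succeed if an orchagent command line contains "-t <expected_seconds>"
# as a separate flag/value pair.
args_have_timeout() {
    local cmdline="$1" expected="$2"
    case " $cmdline " in
        *" -t $expected "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Apply the same check to the running orchagent process.
check_running_orchagent() {
    local expected="$1" cmdline
    cmdline="$(ps -o args= -C orchagent)" || return 1
    args_have_timeout "$cmdline" "$expected"
}

# Hypothetical usage after restarting swss with createSwitchTimeout=120:
#   check_running_orchagent 120 && echo "timeout flag present"
```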
Add createSwitchTimeout=<seconds> to the hwsku sai.profile and restart swss (config reload / systemctl restart swss). Verify that ps -o args= -C orchagent shows ... -t <seconds> ... and that the message orchagent: setRedisExtensionAttribute: set response timeout to <ms> ms appears.
If the key is not present, orchagent starts without -t (default), confirming the change is opt-in and non-intrusive for other platforms.
Verified on hardware whose SAI create_switch exceeds the default 60s timeout: createSwitchTimeout 120 SAI_STATUS_FAILURE -> orchagent abort/crash
Which release branch to backport (provide reason below if selected)
Description for the changelog
[orchagent] Honor createSwitchTimeout from sai.profile for all platforms (not just a single vendor branch), so platforms with long SAI create_switch times can extend orchagent's create-switch timeout via sai.profile.
Link to config_db schema for YANG module changes
N/A
A picture of a cute animal (not mandatory but encouraged)