Allow engine recovery to be executed in parallel with generation and wait for aliveness #942

Open
fzyzcjy wants to merge 3 commits into rollout_ft/24 from rollout_ft/25
Conversation

Collaborator

fzyzcjy commented Apr 7, 2026

No description provided.

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces a mechanism to ensure all engines are ready before proceeding with weight updates. Key feedback includes adding a null check for the server object to prevent a potential AttributeError, implementing a timeout for the engine readiness loop to avoid infinite blocking, and enhancing the readiness check to verify active connections rather than just internal state machine status.
