fix: 0.5.0 by reverting Dockerfile.vllm, build.sh, and run.sh (cherry pick 82bae247b56258a08e26bb6dd305e69981be98b0 from main) #2907
…rs (#2892) Signed-off-by: Keiven Chang <[email protected]>
Caution: Review failed. Failed to post review comments.
Walkthrough
This PR introduces namespace scoping across components, restructures graceful shutdown in the runtime, updates vLLM/sglang deployments to direct python invocations, adjusts planner input handling and logging, adds dev tooling changes (a new Dockerfile stage and a run.sh entrypoint override), overhauls the Helm/cloud docs, and adds perf test manifests and example updates.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant User
participant Frontend
participant Runtime
participant Endpoint
participant Tracker as GracefulShutdownTracker
User->>Frontend: SIGTERM / shutdown request
Frontend->>Runtime: shutdown()
Note over Runtime: Phase 1: Stop accepting new work
Runtime->>Runtime: cancel endpoint_shutdown_token
Runtime->>Tracker: wait_for_completion()
par Active endpoints finishing
loop each active endpoint
Runtime->>Endpoint: cancellation_token triggered
Endpoint-->>Tracker: unregister_endpoint()
end
and Wait for completion
Tracker-->>Runtime: all endpoints complete
end
Note over Runtime: Phase 2: System shutdown
Runtime->>Runtime: cancel main token (NATS/ETCD)
Runtime-->>Frontend: shutdown complete
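The two-phase shutdown in the diagram above can be sketched in a few lines of asyncio. This is only an illustration of the ordering (stop endpoints, wait for them to drain, then cancel the main token); the `Runtime` class, `serve_endpoint`, `register_endpoint`, and the use of `asyncio.Event` as a stand-in for cancellation tokens are assumptions made for this sketch, not the project's actual API.

```python
import asyncio


class GracefulShutdownTracker:
    """Hypothetical tracker: counts active endpoints, signals when none remain."""

    def __init__(self) -> None:
        self._active = 0
        self._idle = asyncio.Event()
        self._idle.set()  # nothing registered yet, so we start idle

    def register_endpoint(self) -> None:
        self._active += 1
        self._idle.clear()

    def unregister_endpoint(self) -> None:
        self._active -= 1
        if self._active == 0:
            self._idle.set()

    async def wait_for_completion(self) -> None:
        await self._idle.wait()


class Runtime:
    """Hypothetical runtime holding one token per shutdown phase."""

    def __init__(self) -> None:
        self.endpoint_shutdown = asyncio.Event()  # phase 1: stop accepting work
        self.main_shutdown = asyncio.Event()      # phase 2: tear down NATS/etcd
        self.tracker = GracefulShutdownTracker()
        self.log: list[str] = []

    async def serve_endpoint(self, name: str) -> None:
        self.tracker.register_endpoint()
        try:
            await self.endpoint_shutdown.wait()  # run until phase 1 fires
            await asyncio.sleep(0.01)            # simulate draining in-flight work
            self.log.append(f"{name} drained")
        finally:
            self.tracker.unregister_endpoint()

    async def shutdown(self) -> None:
        self.endpoint_shutdown.set()              # phase 1
        await self.tracker.wait_for_completion()  # block until endpoints finish
        self.main_shutdown.set()                  # phase 2
        self.log.append("main token cancelled")


async def demo() -> list[str]:
    rt = Runtime()
    tasks = [asyncio.create_task(rt.serve_endpoint(n)) for n in ("a", "b")]
    await asyncio.sleep(0)  # let both endpoints register before shutting down
    await rt.shutdown()
    await asyncio.gather(*tasks)
    return rt.log
```

Running `asyncio.run(demo())` records both endpoints draining before the main token is cancelled, which is the property the diagram's Phase 1 / Phase 2 split is meant to guarantee.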
sequenceDiagram
autonumber
participant Env as Env (DYN_NAMESPACE)
participant Frontend
participant PyBind as EntrypointArgs (Py/Rust)
participant LLM as LocalModelBuilder
participant Watcher as ModelWatcher
participant Store as etcd
Env-->>Frontend: namespace (optional)
Frontend->>PyBind: EntrypointArgs(namespace)
PyBind->>LLM: .namespace(namespace)
Frontend->>Store: subscribe to model events
Store-->>Watcher: WatchEvent stream
Frontend->>Watcher: watch(events, target_namespace)
alt target_namespace specified
Watcher->>Watcher: filter events by namespace
else global/None
Watcher->>Watcher: accept all namespaces
end
Watcher-->>Frontend: model updates (scoped)
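The namespace filtering step in the diagram above amounts to a simple predicate over the event stream. The sketch below is an assumption-laden illustration: `ModelEvent` and this `watch` function are hypothetical stand-ins, and it treats `None` as the "accept all namespaces" case. How the real ModelWatcher represents the global case is not shown in this conversation.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class ModelEvent:
    """Hypothetical stand-in for a model event read from the store."""
    namespace: str
    model: str


def watch(
    events: Iterable[ModelEvent],
    target_namespace: Optional[str] = None,
) -> Iterator[ModelEvent]:
    """Yield events scoped to target_namespace; None means accept everything."""
    for event in events:
        if target_namespace is None or event.namespace == target_namespace:
            yield event


# Example stream with two namespaces (names are made up for illustration).
EVENTS = [
    ModelEvent("team-a", "model-1"),
    ModelEvent("team-b", "model-2"),
    ModelEvent("team-a", "model-3"),
]

scoped = [e.model for e in watch(EVENTS, "team-a")]
unscoped = [e.model for e in watch(EVENTS)]
```

With a target namespace set, only matching events reach the frontend; with no namespace, the stream passes through unchanged.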
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Overview:
This PR reverts Dockerfile.vllm and the build/run scripts (build.sh and run.sh) so that build and run behavior matches what it was before the August 28 commit 82bae24. The goal is to maintain backward compatibility.
Details:
local-dev: For VS Code/Cursor Dev Container plugin use only
dev: For command-line development with the run.sh script
Where should the reviewer start?
Related Issues:
BUG-5501463
Summary by CodeRabbit