fix: update git version on pre1 #3140
Signed-off-by: Harrison King Saturley-Hall <[email protected]>
Caution: Review failed
Failed to post review comments

Walkthrough
Adds multi-frontend SGLang orchestration (nginx + frontends), profiling hooks, and new benchmarking utilities (vLLM and SGLang). Expands TRT-LLM performance sweep scripts with a disaggregated baseline flow and a revised bench runner. Introduces new Dockerfiles/build steps, updates a submodule, adjusts a TRT-LLM handler, and bumps project versions.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor User
participant Submitter as submit_job_script.py
participant Slurm as SLURM
participant NginxNode as nginx worker
participant Master as frontend (master)
participant Frontends as additional frontends
participant Prefill as prefill workers
participant Decode as decode workers
participant etcd as etcd/NATS
User->>Submitter: CLI args (--enable-multiple-frontends, profiler, etc.)
Submitter->>Slurm: sbatch job (with templated vars)
Slurm->>NginxNode: srun worker_setup.py --worker_type nginx
Slurm->>Master: srun worker_setup.py --worker_type frontend
Slurm->>Frontends: srun worker_setup.py --worker_type frontend (others)
Slurm->>Prefill: srun worker_setup.py --worker_type prefill
Slurm->>Decode: srun worker_setup.py --worker_type decode
Note over etcd,Master: Coordination endpoints (etcd/NATS)
Prefill-->>etcd: register/ready
Decode-->>etcd: register/ready
Frontends-->>Master: join/frontend ready
NginxNode-->>User: proxy http://<nginx>:8000
sequenceDiagram
autonumber
participant Bench as bench.sh / sglang_bench_serving.sh
participant Utils as benchmark_utils.sh
participant Warm as warmup_model()
participant Wait as wait_for_model()
participant Py as benchmark_serving.py
participant Backend as LLM Serving API
Bench->>Utils: source
Bench->>Wait: poll /health or /v1/models
Wait-->>Bench: ready
Bench->>Warm: warmup via sglang.bench_serving
Warm-->>Bench: done
loop for each concurrency
Bench->>Py: run with args (rate, prompts, max-concurrency)
Py->>Backend: async streaming requests
Backend-->>Py: tokens/chunks
Py-->>Bench: metrics JSON
end
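The benchmark flow in the second diagram (poll for readiness, then run once per concurrency level) can be sketched in Python as follows. This is only an illustration under stated assumptions: the real `wait_for_model()` and `warmup_model()` live in `benchmark_utils.sh` and are not shown in this PR page, so `http_probe`, `run_sweep`, the timeout values, and the base URL are hypothetical stand-ins. The only details taken from the diagram are the `/health` and `/v1/models` endpoints and the per-concurrency loop.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_probe(base_url: str) -> bool:
    """Return True if either readiness endpoint from the diagram answers 200."""
    for path in ("/health", "/v1/models"):
        try:
            with urllib.request.urlopen(base_url + path, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not up yet (connection refused, timeout, HTTP error).
            continue
    return False


def wait_for_model(
    probe: Callable[[], bool],
    timeout_s: float = 600.0,   # assumed value, not from the PR
    interval_s: float = 5.0,    # assumed value, not from the PR
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it succeeds or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def run_sweep(
    concurrencies: Iterable[int],
    run_one: Callable[[int], dict],
) -> Dict[int, dict]:
    """Run one benchmark per concurrency level and collect its metrics."""
    results: Dict[int, dict] = {}
    for concurrency in concurrencies:
        results[concurrency] = run_one(concurrency)
    return results
```

A caller would pass something like `lambda: http_probe("http://localhost:8000")` as the probe (the URL is a placeholder) and a `run_one` callable that invokes the actual benchmark script and returns its metrics JSON. Injecting `probe`, `sleep`, and `clock` keeps the polling logic testable without a live server.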
Estimated code review effort
🎯 5 (Critical) | ⏱️ ~120+ minutes
Pre-merge checks
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Overview:
The Git version installed in the built container must be updated to address CVE-2025-48384.
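One way to confirm that a built image carries a fixed Git is to compare the output of `git --version` against the patched maintenance releases. The sketch below is not part of this PR. The version table reflects my reading of the upstream Git advisory for CVE-2025-48384 (fixes in v2.43.7, v2.44.4, v2.45.4, v2.46.4, v2.47.3, v2.48.2, v2.49.1, and v2.50.1) and should be verified against that advisory before being relied on. The function names are hypothetical.

```python
import re
import subprocess
from typing import Tuple

# Minimum patch level per release series, per the upstream advisory
# (assumption: verify against the official Git security announcement).
PATCHED_MIN_PATCH = {
    (2, 43): 7,
    (2, 44): 4,
    (2, 45): 4,
    (2, 46): 4,
    (2, 47): 3,
    (2, 48): 2,
    (2, 49): 1,
    (2, 50): 1,
}
NEWEST_LISTED_SERIES = max(PATCHED_MIN_PATCH)


def parse_git_version(text: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from output such as 'git version 2.50.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_patched(version: Tuple[int, int, int]) -> bool:
    """Return True if the upstream version includes the CVE-2025-48384 fix."""
    major, minor, patch = version
    if (major, minor) > NEWEST_LISTED_SERIES:
        # Series released after the fix are assumed to include it.
        return True
    needed = PATCHED_MIN_PATCH.get((major, minor))
    if needed is None:
        # Series older than 2.43 received no upstream maintenance release.
        return False
    return patch >= needed


def installed_git_version() -> Tuple[int, int, int]:
    """Ask the local git binary for its version."""
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return parse_git_version(result.stdout)
```

Running `is_patched(installed_git_version())` inside the container gives a quick pass/fail signal. Note that distribution packages sometimes backport security fixes without changing the upstream version string, so a `False` here means "check the package changelog", not necessarily "vulnerable".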