
Commit bd7c943: feat: add benchmarking guide (#2620)

Authored by hhzhang16, committed by Jason Zhou
Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Jason Zhou <[email protected]>
1 parent: ccb8288


41 files changed: +2733 −921 lines

README.md

Lines changed: 7 additions & 0 deletions

@@ -151,6 +151,13 @@ Rerun with `curl -N` and change `stream` in the request to `true` to get the res
 - Check out [Backends](components/backends) to deploy various workflow configurations (e.g. SGLang with router, vLLM with disaggregated serving, etc.)
 - Run some [Examples](examples) to learn about building components in Dynamo and exploring various integrations.
 
+### Benchmarking Dynamo
+
+Dynamo provides comprehensive benchmarking tools to evaluate and optimize your deployments:
+
+* **[Benchmarking Guide](docs/benchmarks/benchmarking.md)** – Compare deployment topologies (aggregated vs. disaggregated vs. vanilla vLLM) using GenAI-Perf
+* **[Pre-Deployment Profiling](docs/benchmarks/pre_deployment_profiling.md)** – Optimize configurations before deployment to meet SLA requirements
+
 # Engines
 
 Dynamo is designed to be inference engine agnostic. To use any engine with Dynamo, NATS and etcd need to be installed, along with a Dynamo frontend (`python -m dynamo.frontend [--interactive]`).
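The prerequisites named in that last context line can be sketched as a local bring-up. This is a hedged sketch, not part of the commit: the `nats-server` and `etcd` invocations use their default flags, and only `python -m dynamo.frontend [--interactive]` comes from the README text itself.

```shell
# Illustrative local bring-up; flags and ports are defaults, not from this commit.
nats-server -js &            # NATS message bus with JetStream enabled
etcd &                       # etcd key-value store on its default listen addresses
python -m dynamo.frontend    # Dynamo frontend (--interactive is optional, per the README)
```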

benchmarks/README.md

Lines changed: 58 additions & 5 deletions

@@ -15,19 +15,72 @@
 
 # Benchmarks
 
-This directory contains benchmarking scripts and tools for performance evaluation.
+This directory contains benchmarking scripts and tools for performance evaluation of Dynamo deployments. The benchmarking framework is a wrapper around genai-perf that makes it easy to benchmark DynamoGraphDeployments and compare them with external endpoints.
+
+## Quick Start
+
+### Benchmark an Existing Endpoint
+```bash
+./benchmark.sh --namespace my-namespace --input my-endpoint=http://your-endpoint:8000
+```
+
+### Benchmark Dynamo Deployments
+```bash
+# Benchmark disaggregated vLLM with a custom label
+./benchmark.sh --namespace my-namespace --input vllm-disagg=components/backends/vllm/deploy/disagg.yaml
+
+# Benchmark a TensorRT-LLM disaggregated deployment
+./benchmark.sh --namespace my-namespace --input trtllm-disagg=components/backends/trtllm/deploy/disagg.yaml
+
+# Compare multiple Dynamo deployments
+./benchmark.sh --namespace my-namespace \
+  --input agg=components/backends/vllm/deploy/agg.yaml \
+  --input disagg=components/backends/vllm/deploy/disagg.yaml
+
+# Compare Dynamo vs. an external endpoint
+./benchmark.sh --namespace my-namespace \
+  --input dynamo=components/backends/vllm/deploy/disagg.yaml \
+  --input external=http://localhost:8000
+```
+
+**Note**:
+- The sample manifests may reference private registry images. Update the `image:` fields to use accessible images from [Dynamo NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo/artifacts) or your own registry before running.
+- Only DynamoGraphDeployment manifests are supported for automatic deployment. To benchmark non-Dynamo backends (vLLM, TensorRT-LLM, SGLang, etc.), deploy them manually using their Kubernetes guides and use the endpoint option.
+
+## Features
+
+The benchmarking framework supports:
+
+**Two Benchmarking Modes:**
+- **Endpoint Benchmarking**: Test existing HTTP endpoints without deployment overhead
+- **Deployment Benchmarking**: Deploy, test, and clean up DynamoGraphDeployments automatically
+
+**Flexible Configuration:**
+- User-defined labels for each input using the `--input label=value` format
+- Support for multiple inputs to enable comparisons
+- Customizable concurrency levels (via the `CONCURRENCIES` env var), sequence lengths, and models
+- Automated performance plot generation with custom labels
+
+**Supported Backends:**
+- DynamoGraphDeployments
+- External HTTP endpoints (for comparison with non-Dynamo backends)
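The `CONCURRENCIES` sweep mentioned under Flexible Configuration can be pictured as a simple loop. Only the variable name comes from the text above; the comma-separated default and the loop body are illustrative stand-ins, not the framework's actual implementation.

```shell
# Hypothetical sketch of a CONCURRENCIES-style sweep; the echo is a
# stand-in for one genai-perf run at each concurrency level.
CONCURRENCIES="${CONCURRENCIES:-1,2,4}"        # comma-separated levels, overridable via env
IFS=',' read -ra levels <<< "$CONCURRENCIES"   # split into a bash array
for c in "${levels[@]}"; do
  echo "benchmarking at concurrency $c"
done
```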
 
 ## Installation
 
-This is already included as part of the dynamo vllm image. To install locally or standalone, run:
+This is already included as part of the Dynamo container images. To install locally or standalone:
 
 ```bash
 pip install -e .
 ```
 
-Currently, this will install lightweight tools for:
+## Data Generation Tools
+
+This directory also includes lightweight tools for:
 - Analyzing prefix-structured data (`datagen analyze`)
 - Synthesizing structured data customizable for testing purposes (`datagen synthesize`)
-Detailed information are provided in the `prefix_data_generator` directory.
 
-The benchmarking scripts for the core dynamo components are to come soon (e.g. routing, disagg, Planner).
+Detailed information is provided in the `prefix_data_generator` directory.
+
+## Comprehensive Guide
+
+For detailed documentation, configuration options, and advanced usage, see the [complete benchmarking guide](../docs/benchmarks/benchmarking.md).
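Since the README describes the framework as a wrapper around genai-perf, a direct invocation against an already-running endpoint might look like the sketch below. This is not taken from the commit: `my-model` and the localhost URL are placeholders, and flag names can vary across genai-perf versions, so check `genai-perf profile --help` before relying on them.

```shell
# Sketch of a direct genai-perf run against an existing OpenAI-compatible endpoint;
# model name and URL are placeholders.
genai-perf profile \
  -m my-model \
  --url http://localhost:8000 \
  --endpoint-type chat \
  --streaming \
  --concurrency 10
```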
