|
15 | 15 |
|
16 | 16 | # Benchmarks |
17 | 17 |
|
18 | | -This directory contains benchmarking scripts and tools for performance evaluation. |
| 18 | +This directory contains benchmarking scripts and tools for performance evaluation of Dynamo deployments. The benchmarking framework is a wrapper around genai-perf that makes it easy to benchmark DynamoGraphDeployments and compare them with external endpoints. |
| 19 | + |
| 20 | +## Quick Start |
| 21 | + |
| 22 | +### Benchmark an Existing Endpoint |
| 23 | +```bash |
| 24 | +./benchmark.sh --namespace my-namespace --input my-endpoint=http://your-endpoint:8000 |
| 25 | +``` |
| 26 | + |
| 27 | +### Benchmark Dynamo Deployments |
| 28 | +```bash |
| 29 | +# Benchmark disaggregated vLLM with custom label |
| 30 | +./benchmark.sh --namespace my-namespace --input vllm-disagg=components/backends/vllm/deploy/disagg.yaml |
| 31 | + |
| 32 | +# Benchmark TensorRT-LLM disaggregated deployment |
| 33 | +./benchmark.sh --namespace my-namespace --input trtllm-disagg=components/backends/trtllm/deploy/disagg.yaml |
| 34 | + |
| 35 | +# Compare multiple Dynamo deployments |
| 36 | +./benchmark.sh --namespace my-namespace \ |
| 37 | + --input agg=components/backends/vllm/deploy/agg.yaml \ |
| 38 | + --input disagg=components/backends/vllm/deploy/disagg.yaml |
| 39 | + |
| 40 | +# Compare Dynamo vs external endpoint |
| 41 | +./benchmark.sh --namespace my-namespace \ |
| 42 | + --input dynamo=components/backends/vllm/deploy/disagg.yaml \ |
| 43 | + --input external=http://localhost:8000 |
| 44 | +``` |
| 45 | + |
| 46 | +**Note**: |
| 47 | +- The sample manifests may reference private registry images. Update the `image:` fields to use accessible images from [Dynamo NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo/artifacts) or your own registry before running. |
| 48 | +- Only DynamoGraphDeployment manifests are supported for automatic deployment. To benchmark non-Dynamo backends (vLLM, TensorRT-LLM, SGLang, etc.), deploy them manually using their Kubernetes guides and use the endpoint option. |
| 49 | + |
| 50 | +## Features |
| 51 | + |
| 52 | +The benchmarking framework supports: |
| 53 | + |
| 54 | +**Two Benchmarking Modes:** |
| 55 | +- **Endpoint Benchmarking**: Test existing HTTP endpoints without deployment overhead |
| 56 | +- **Deployment Benchmarking**: Deploy, test, and clean up DynamoGraphDeployments automatically |
| 57 | + |
| 58 | +**Flexible Configuration:** |
| 59 | +- User-defined labels for each input using `--input label=value` format |
| 60 | +- Support for multiple inputs to enable comparisons |
| 61 | +- Customizable concurrency levels (set via the `CONCURRENCIES` environment variable), sequence lengths, and models |
| 62 | +- Automated performance plot generation with custom labels |
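As a rough illustration of the `CONCURRENCIES` setting, the sketch below reads the variable with a fallback and validates each level. It is not taken from `benchmark.sh`: the comma-separated format, the default `1,2,4`, and the function name `parse_concurrencies` are all assumptions made for illustration, so check the benchmarking guide for the format the script actually accepts.

```bash
# Hypothetical helper, not part of benchmark.sh.
parse_concurrencies() {
  # Assumed default and assumed comma-separated format.
  local raw="${CONCURRENCIES:-1,2,4}"
  local level
  for level in $(printf '%s' "$raw" | tr ',' ' '); do
    # Accept only positive integers without leading zeros.
    case "$level" in
      ''|*[!0-9]*|0*)
        echo "invalid concurrency level: $level" >&2
        return 1
        ;;
    esac
    echo "$level"
  done
}

# Prints one validated level per line: 1, 8, 32.
CONCURRENCIES="1,8,32" parse_concurrencies
```

Validating the list up front means a typo in the variable fails fast instead of surfacing partway through a long benchmark run.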
| 63 | + |
| 64 | +**Supported Backends:** |
| 65 | +- DynamoGraphDeployments |
| 66 | +- External HTTP endpoints (for comparison with non-Dynamo backends) |
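The `--input label=value` format and the two benchmarking modes can be pictured with a short shell sketch. It is not the logic used by `benchmark.sh`: the function name `classify_input` and the rule that values starting with `http://` or `https://` are endpoints while anything else is a manifest path are assumptions for illustration.

```bash
# Hypothetical helper, not part of benchmark.sh: split one `label=value`
# argument and guess whether the value is an endpoint URL or a manifest path.
classify_input() {
  local arg="$1"
  # Require a non-empty label and a non-empty value around the first '='.
  case "$arg" in
    ?*=?*) ;;
    *)
      echo "invalid input (expected label=value): $arg" >&2
      return 1
      ;;
  esac
  local label="${arg%%=*}"   # text before the first '='
  local value="${arg#*=}"    # text after the first '='
  # Assumption: URLs are existing endpoints, anything else is a manifest path.
  case "$value" in
    http://*|https://*) echo "$label endpoint $value" ;;
    *)                  echo "$label manifest $value" ;;
  esac
}

classify_input "external=http://localhost:8000"
classify_input "disagg=components/backends/vllm/deploy/disagg.yaml"
```

Splitting on the first `=` only keeps values that themselves contain `=` (for example a URL with a query string) intact.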
19 | 67 |
|
20 | 68 | ## Installation |
21 | 69 |
|
22 | | -This is already included as part of the dynamo vllm image. To install locally or standalone, run: |
| 70 | +The benchmarking tools are already included in the Dynamo container images. To install them locally or standalone: |
23 | 71 |
|
24 | 72 | ```bash |
25 | 73 | pip install -e . |
26 | 74 | ``` |
27 | 75 |
|
28 | | -Currently, this will install lightweight tools for: |
| 76 | +## Data Generation Tools |
| 77 | + |
| 78 | +This directory also includes lightweight tools for: |
29 | 79 | - Analyzing prefix-structured data (`datagen analyze`) |
30 | 80 | - Synthesizing customizable structured data for testing (`datagen synthesize`)
31 | | -Detailed information are provided in the `prefix_data_generator` directory. |
32 | 81 |
|
33 | | -The benchmarking scripts for the core dynamo components are to come soon (e.g. routing, disagg, Planner). |
| 82 | +Detailed information is provided in the `prefix_data_generator` directory. |
| 83 | + |
| 84 | +## Comprehensive Guide |
| 85 | + |
| 86 | +For detailed documentation, configuration options, and advanced usage, see the [complete benchmarking guide](../docs/benchmarks/benchmarking.md). |