In disaggregated serving architectures, KV cache must be transferred between prefill and decode workers.

## Default Method: UCX
By default, TensorRT-LLM uses UCX (Unified Communication X) for KV cache transfer between prefill and decode workers. UCX provides high-performance communication optimized for GPU-to-GPU transfers.
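Since UCX is the default, no extra configuration is needed to use it. As a minimal sketch only, assuming the launch environment exposes an explicit toggle for the UCX backend (the variable name `TRTLLM_USE_UCX_KVCACHE` is an assumption, not confirmed by this document), the default could be pinned like this:

```bash
# Hypothetical toggle: pin the default UCX backend for KV cache transfer.
# TRTLLM_USE_UCX_KVCACHE is an assumed variable name, not confirmed by this doc.
export TRTLLM_USE_UCX_KVCACHE=1
```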

## Beta Method: NIXL
TensorRT-LLM also supports using **NIXL** (NVIDIA Inference Xfer Library) for KV cache transfer. [NIXL](https://github.com/ai-dynamo/nixl) is NVIDIA's high-performance communication library designed for efficient data transfer in distributed GPU environments.

**Note:** NIXL support in TensorRT-LLM is currently in beta and may have some sharp edges.
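
As a minimal sketch, assuming the backend is selected through environment variables before the workers are launched (both variable names here, `TRTLLM_USE_NIXL_KVCACHE` and `TRTLLM_USE_UCX_KVCACHE`, are assumptions rather than confirmed by this document), switching from UCX to NIXL might look like:

```bash
# Hypothetical: select NIXL instead of the default UCX for KV cache transfer.
# Variable names are assumptions; follow the steps below for the supported path.
export TRTLLM_USE_NIXL_KVCACHE=1
unset TRTLLM_USE_UCX_KVCACHE
```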

## Using NIXL for KV Cache Transfer

To enable NIXL for KV cache transfer in disaggregated serving:
4. **Send the request:**
   See the [client](./README.md#client) section to learn how to send a request to the deployment; a sketch of such a request follows below.
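
As an illustrative sketch only (the host, port, endpoint path, and model name are assumptions; the [client](./README.md#client) section is authoritative), a request to an OpenAI-compatible frontend might look like:

```bash
# Hypothetical request to the deployment's OpenAI-compatible endpoint.
# Host, port, and model name are assumptions; adjust to your deployment.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    "messages": [{"role": "user", "content": "Explain KV cache transfer in one sentence."}],
    "max_tokens": 64
  }'
```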

**Important:** Ensure that ETCD and NATS services are running before starting the service.
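
As a minimal sketch for a local setup (image tags and flags are assumptions; if your deployment ships a compose file for these services, prefer it), both services can be started with Docker:

```bash
# Start NATS with JetStream enabled.
docker run -d --name nats -p 4222:4222 nats:latest --jetstream

# Start a single-node etcd (authentication disabled for local testing only).
docker run -d --name etcd -p 2379:2379 \
  -e ALLOW_NONE_AUTHENTICATION=yes \
  bitnami/etcd:latest
```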