
Commit b277824

Revise blog on Tensor R-Fork, add docs
1 parent 213994d commit b277824

1 file changed: +2 -1 lines changed


blog/2025-12-03-rfork.md

Lines changed: 2 additions & 1 deletion
@@ -37,7 +37,7 @@ To address this challenge, we have developed **a novel weight-loading framework
 
 ## Design
 
-The core concept of Tensor R-Fork is to **leverage GPU-Direct RDMA for constructing a peer-to-peer (P2P) weight storage architecture.**
+The core concept of Tensor R-Fork[0] is to **leverage GPU-Direct RDMA for constructing a peer-to-peer (P2P) weight storage architecture.**
 
 The performance of data transfer using the traditional method is low because there is always a bottleneck somewhere along the path whose bandwidth is much smaller than that of InfiniBand.
 From the data flow analysis, we observe that weight tensors are stored on each GPU and can be transmitted directly between nodes via GPU-Direct RDMA.
@@ -144,6 +144,7 @@ The practice of R-Fork opens up more imaginative possibilities: the key concept
 
 ## Reference
 
+[0] Tensor R-Fork Documentation: <a href=https://github.com/sgl-project/sglang/blob/main/docs/advanced_features/rfork.md>Documentation</a>
 [1] Tensor R-Fork with NCCL backend: <a href=https://github.com/sgl-project/sglang/pull/8215>PR#8215</a>
 [2] Tensor R-Fork with TransferEngine backend: <a href=https://github.com/sgl-project/sglang/pull/13125>PR#13125</a>
 [3] Concurrent weights loading from disk: <a href=https://github.com/sgl-project/sglang/pull/7943>PR#7943</a>
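The design paragraph in the first hunk describes moving weights GPU-to-GPU over GPU-Direct RDMA instead of reloading them from disk. Below is a minimal sketch of that idea using plain PyTorch with the NCCL backend; `sync_weights` and the placeholder model are illustrative assumptions, not SGLang's actual Tensor R-Fork API (see the documentation and PRs in the references for the real implementation).

```python
# Minimal sketch (assumption, not SGLang's implementation): rank 0 is an
# instance whose weights are already resident on GPU; every other rank is a
# freshly started instance that receives the weights directly GPU-to-GPU.
# With the NCCL backend, the transfer uses NVLink / GPU-Direct RDMA when the
# hardware supports it, so the weights never touch disk or host memory.
import torch
import torch.distributed as dist


def sync_weights(model: torch.nn.Module) -> None:
    # broadcast() is symmetric: rank 0 sends each parameter's data and all
    # other ranks receive it in place into their same-shaped tensors.
    for param in model.parameters():
        dist.broadcast(param.data, src=0)


if __name__ == "__main__":
    # Launch with e.g. `torchrun --nproc_per_node=2 sketch.py`.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    # Placeholder model; a real server would construct the full model here
    # and skip its usual disk-based weight loading on non-zero ranks.
    model = torch.nn.Linear(4096, 4096, device="cuda")

    sync_weights(model)
    dist.barrier()
    dist.destroy_process_group()
```

The actual Tensor R-Fork path generalizes this to a full model's weights across nodes and, per reference [2], can also use a TransferEngine backend in place of NCCL.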
