
Commit b277824

Revise blog on Tensor R-Fork, add docs
1 parent 213994d commit b277824

1 file changed: +2 -1 lines changed


blog/2025-12-03-rfork.md

Lines changed: 2 additions & 1 deletion
@@ -37,7 +37,7 @@ To address this challenge, we have developed **a novel weight-loading framework
 
 ## Design
 
-The core concept of Tensor R-Fork is to **leverage GPU-Direct RDMA for constructing a peer-to-peer (P2P) weight storage architecture.**
+The core concept of Tensor R-Fork[0] is to **leverage GPU-Direct RDMA for constructing a peer-to-peer (P2P) weight storage architecture.**
 
 The performance of data transfer using the traditional method is low because there is always a bottleneck somewhere along the path whose bandwidth is much smaller than that of InfiniBand.
 From the data flow analysis, we observe that weight tensors are stored on each GPU and can be transmitted directly between nodes via GPU-Direct RDMA.
@@ -144,6 +144,7 @@ The practice of R-Fork opens up more imaginative possibilities: the key concept
 
 ## Reference
 
+[0] Tensor R-Fork Documentation: <a href=https://github.com/sgl-project/sglang/blob/main/docs/advanced_features/rfork.md>Documentation</a>
 [1] Tensor R-Fork with NCCL backend: <a href=https://github.com/sgl-project/sglang/pull/8215>PR#8215</a>
 [2] Tensor R-Fork with TransferEngine backend: <a href=https://github.com/sgl-project/sglang/pull/13125>PR#13125</a>
 [3] Concurrent weights loading from disk: <a href=https://github.com/sgl-project/sglang/pull/7943>PR#7943</a>
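The design paragraph in the first hunk describes moving weights GPU-to-GPU over GPU-Direct RDMA instead of reloading them from disk. Below is a minimal sketch of that idea using plain PyTorch with the NCCL backend; `sync_weights` and the placeholder model are illustrative assumptions, not SGLang's actual Tensor R-Fork API (see the documentation and PRs in the references for the real implementation).

```python
# Minimal sketch (assumption, not SGLang's implementation): rank 0 is an
# instance whose weights are already resident on GPU; every other rank is a
# freshly started instance that receives the weights directly GPU-to-GPU.
# With the NCCL backend, the transfer uses NVLink / GPU-Direct RDMA when the
# hardware supports it, so the weights never touch disk or host memory.
import torch
import torch.distributed as dist


def sync_weights(model: torch.nn.Module) -> None:
    # broadcast() is symmetric: rank 0 sends each parameter's data and all
    # other ranks receive it in place into their same-shaped tensors.
    for param in model.parameters():
        dist.broadcast(param.data, src=0)


if __name__ == "__main__":
    # Launch with e.g. `torchrun --nproc_per_node=2 sketch.py`.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    # Placeholder model; a real server would construct the full model here
    # and skip its usual disk-based weight loading on non-zero ranks.
    model = torch.nn.Linear(4096, 4096, device="cuda")

    sync_weights(model)
    dist.barrier()
    dist.destroy_process_group()
```

The actual Tensor R-Fork path generalizes this to a full model's weights across nodes and, per reference [2], can also use a TransferEngine backend in place of NCCL.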
