This issue is for discussing how to refactor the design of Mooncake Transfer Engine.
We acknowledge @doujiang24, @yuan-luo, and @alogfans (the list may be incomplete) for contributing PRs #161, #134, and #147 to enhance Mooncake Transfer Engine. However, these patches modify a large amount of code, which makes review and merging harder. The interface has also changed significantly, so backward compatibility may be affected.
So, before merging into the mainline, maybe we can implement Transfer Engine V2 separately (we also want a short abbreviation, similar to nixl). When the second version is ready, it can replace the first one.
In this stage, we expect to do the following:
Use a Status-based error reporting mechanism instead of error numbers. @yuan-luo
Update the metadata format so that a buffer can be registered by multiple transports. For example, a memory region can use both the shared memory and RDMA transports. @alogfans
Add a shared memory transport and fix the problems of the nvmeof transport. (environment needed)
When transferring between local processes, use shared memory if possible.
Try to eliminate the unsafe HTTP-based handshake.
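To make the first, second, and fourth items more concrete, here is a minimal sketch of how they could fit together. Everything in it is hypothetical and meant only for discussion: `Status`, `StatusCode`, `TransportKind`, `BufferDesc`, and `SelectTransport` are illustrative names, not the existing Transfer Engine API, and comparing host names is only a stand-in for however locality ends up being detected.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// NOTE: all names below are hypothetical, for discussion only.
enum class StatusCode { kOk, kNotFound };

// Status-style error reporting: a code plus a readable message,
// instead of a bare negative error number.
class Status {
public:
    static Status OK() { return Status(StatusCode::kOk, ""); }
    static Status NotFound(std::string msg) {
        return Status(StatusCode::kNotFound, std::move(msg));
    }
    bool ok() const { return code_ == StatusCode::kOk; }
    StatusCode code() const { return code_; }
    const std::string &message() const { return message_; }

private:
    Status(StatusCode code, std::string msg)
        : code_(code), message_(std::move(msg)) {}
    StatusCode code_;
    std::string message_;
};

enum class TransportKind { kSharedMemory, kRdma };

// Metadata entry for one registered buffer. The point of the proposed
// format change is that `transports` can hold more than one entry.
struct BufferDesc {
    std::string name;
    std::string host;  // assumption: locality is decided by host name
    std::vector<TransportKind> transports;
};

// Prefer shared memory when the peer is on the same host and the buffer
// was registered for it; otherwise fall back to RDMA if available.
Status SelectTransport(const BufferDesc &buf, const std::string &local_host,
                       TransportKind *out) {
    auto supports = [&buf](TransportKind kind) {
        return std::find(buf.transports.begin(), buf.transports.end(),
                         kind) != buf.transports.end();
    };
    if (buf.host == local_host && supports(TransportKind::kSharedMemory)) {
        *out = TransportKind::kSharedMemory;
        return Status::OK();
    }
    if (supports(TransportKind::kRdma)) {
        *out = TransportKind::kRdma;
        return Status::OK();
    }
    return Status::NotFound("no usable transport for buffer " + buf.name);
}
```

With a layout like this, a caller would check `status.ok()` and log `status.message()` on failure rather than decoding an integer. Whether the transport choice belongs in the metadata layer or inside each transport is still open for discussion.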
What are everyone's opinions and suggestions?
@alogfans Sure. I'm fine with the plan. I also saw that TransferEngine is on sglang's roadmap: sgl-project/sglang#4655
Could you please shed some light on the details?