DeepseekMoE Architecture Explained with Code Implementation - Zhang #214
Replies: 1 comment
So you put the future work figure here? I was wondering why MoE came before attention. It got me told off.
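DeepseekMoE Architecture Explained with Code Implementation - Zhang
Works on LLM inference deployment, vision algorithm development, model compression and deployment, and algorithm SDK development; a committed lifelong learner. Transformer, DeepseekMoE: computation flow, structural breakdown, and code implementation.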
https://www.armcvai.cn/2025-02-12/deepseek-moe-code.html