[Bug] Error when converting the GLM5 model to Megatron #1820

@llp1992

Bug Description

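The currently provided image and Megatron Core version cannot convert the GLM5 model from HF format to torch_dist, because the GLM5 architecture is not yet supported. Why, then, does this project provide a training script for GLM5?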

Steps to Reproduce

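Using the currently provided image and Megatron Core version, run the HF-to-torch_dist conversion on the GLM5 model. The conversion fails because the GLM5 architecture is not yet supported.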

Expected Behavior

The GLM5 model converts successfully.

Actual Behavior

The conversion fails with an error.

Environment

  • slime version:
  • Python version:
  • PyTorch version:
  • CUDA/ROCm version:
  • GPU type and count:
  • OS:
  • SGLang version (if relevant):
  • Megatron-LM version (if relevant):

Logs

Additional Context

No response

Pre-submission Checklist

  • I have read the CONTRIBUTING.md and understand the collaboration scope.
  • I have read the documentation and my issue is not addressed there.
  • I have searched for existing issues and this is not a duplicate.
  • I have provided a minimal, reproducible example.
