Feature[add model mistralai/ministral3] #4293
Open
learncat163 wants to merge 9 commits into PaddlePaddle:develop from
Thanks for your contribution!
Codecov Report
❌ Your patch status has failed because the patch coverage (61.56%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## develop #4293 +/- ##
==========================================
Coverage ? 35.00%
==========================================
Files ? 478
Lines ? 89900
Branches ? 0
==========================================
Hits ? 31467
Misses ? 58433
Partials ? 0
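This PR adds the ministral3 model from the mistralai series.
Weights
Two versions are currently supported, including weight conversion: Ministral-3-3B-Instruct-2512 and Ministral-3-8B-Instruct-2512.
The code can load the original weights from HF directly, and it can also load Paddle-format weights directly.
Precision alignment
The TestMistral3DiffAlignment class in tests/transformers/ministral3/test_modeling.py implements the precision-alignment assertions (top-10 tokens and logits diff).
Top-10 token alignment
Prompt used: 'Hello, how are you today?'
Output token ids
PyTorch generated text: " I'm fine, thank you. How about you"
Paddle generated text: " I'm fine, thank you. How about you"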
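As a rough illustration of what a top-10 token alignment check compares, here is a minimal sketch in plain Python. It is not the code in TestMistral3DiffAlignment: the function names `top_k_ids` and `top_k_match` and the sample logits are hypothetical, and it works on plain lists rather than PyTorch or Paddle tensors.

```python
def top_k_ids(logits, k=10):
    """Return the indices of the k largest logits, highest first."""
    # Stable sort: equal logits keep their original index order.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]


def top_k_match(ref_logits, test_logits, k=10):
    """True when both logit vectors rank the same k token ids in the same order."""
    return top_k_ids(ref_logits, k) == top_k_ids(test_logits, k)


# Hypothetical last-position logits over a 12-token vocabulary.
ref_logits = [0.5, 3.0, -1.2, 2.1, 0.0, 4.4, 1.7, -0.3, 2.9, 0.9, 1.1, -2.0]
# Simulate a small numerical drift between two frameworks.
test_logits = [x + 1e-5 for x in ref_logits]

aligned = top_k_match(ref_logits, test_logits, k=10)
```

Comparing ranked ids rather than raw values tolerates tiny numerical drift while still catching any change in which tokens the model prefers.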
Logits diff of the final-layer output
logits max_diff: 5.912781e-05 (threshold: 0.01)
logits mean_diff: 3.532475e-06
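The two numbers above can be read as the maximum and mean absolute element-wise difference between the reference and the ported logits. The sketch below shows that computation under this assumption; `logits_diff` and `within_threshold` are hypothetical helper names, the sample values are made up, and only the 0.01 threshold comes from the report above.

```python
THRESHOLD = 0.01  # threshold quoted in the PR description


def logits_diff(ref_logits, test_logits):
    """Return (max_diff, mean_diff) of the absolute element-wise differences."""
    if len(ref_logits) != len(test_logits):
        raise ValueError("logit vectors must have the same length")
    diffs = [abs(r - t) for r, t in zip(ref_logits, test_logits)]
    return max(diffs), sum(diffs) / len(diffs)


def within_threshold(ref_logits, test_logits, threshold=THRESHOLD):
    """True when the largest absolute difference stays below the threshold."""
    max_diff, _ = logits_diff(ref_logits, test_logits)
    return max_diff < threshold


# Hypothetical values, not taken from the actual test run.
ref = [1.0, 2.0, 3.0]
close = [1.0, 2.00005, 3.0]
far = [1.0, 2.5, 3.0]

max_diff, mean_diff = logits_diff(ref, far)
```

Asserting on the maximum difference is the stricter of the two checks; the mean is useful mainly as a summary of overall drift.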
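Fine-tuning loss comparison
Paddle uses the config tests/config/ci/ministral3_sft.yaml
ms-swift needs special care: install the following dependency first, otherwise it may not work correctly
pip install "mistral-common>=1.8.6" -U
The ms-swift configuration is as follows:
Template registration: my_register.py
Launch command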