suggestion: optimize train.py launch arguments #14

Open
yuanzhoulvpi2017 opened this issue Dec 19, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@yuanzhoulvpi2017

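The Megatron-LM style of configuration is really hard to follow. Unless you are already familiar with the model_args, data_args, train_args pattern, it is difficult to work out what is going on.
I took a look at your code: there is a create_config.py step, and then the config.json file is loaded at training time?

The current approach is indeed quite nice, but having several steps is still a bit cumbersome.

You could follow the Hugging Face (hf) code style:

  1. Define all the relevant parameters with dataclasses: this lets you document clearly what each variable is for.
  2. Enforce the related constraints in __post_init__: the parameters affect and adjust one another, and this is a very convenient place to handle those adjustments.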

    # code copied from https://github.com/huggingface/transformers/blob/66ab300aaff9ef509f8736cf186ab9b6a0ef4f3b/examples/pytorch/language-modeling/run_clm.py#L242-L249

    parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
    if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
        # If we pass only one argument to the script and it's the path to a json file,
        # let's parse it to get our arguments.
        model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
    else:
        model_args, data_args, training_args = parser.parse_args_into_dataclasses()
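
To make the two points above concrete, here is a minimal sketch using only the standard library (`dataclasses`, `argparse`, `json`) rather than `HfArgumentParser`. Everything in it is an assumption for illustration: `TrainArguments`, `parse_train_args`, and field names such as `tp_size` or `micro_batch_size` are hypothetical and are not taken from this repository's train.py or create_config.py.

```python
import argparse
import json
from dataclasses import dataclass, field, fields
from typing import Optional, Sequence


@dataclass
class TrainArguments:
    """Hypothetical training arguments; names are illustrative only."""

    tp_size: int = field(default=1, metadata={"help": "Tensor-parallel degree."})
    pp_size: int = field(default=1, metadata={"help": "Pipeline-parallel degree."})
    dp_size: int = field(default=1, metadata={"help": "Data-parallel degree."})
    world_size: Optional[int] = field(
        default=None,
        metadata={"help": "Total process count; derived as tp*pp*dp if omitted."},
    )
    global_batch_size: int = field(default=32, metadata={"help": "Samples per optimizer step."})
    micro_batch_size: int = field(default=4, metadata={"help": "Samples per forward pass per rank."})
    # Derived in __post_init__, so it is not a constructor or CLI argument.
    grad_acc_steps: int = field(init=False)

    def __post_init__(self) -> None:
        # Constraint 1: the parallel degrees must multiply to the world size.
        expected = self.tp_size * self.pp_size * self.dp_size
        if self.world_size is None:
            self.world_size = expected
        elif self.world_size != expected:
            raise ValueError(
                f"world_size={self.world_size}, but tp*pp*dp={expected}"
            )
        # Constraint 2: the global batch must split evenly across ranks.
        per_step = self.micro_batch_size * self.dp_size
        if self.global_batch_size % per_step != 0:
            raise ValueError(
                f"global_batch_size={self.global_batch_size} is not divisible "
                f"by micro_batch_size*dp_size={per_step}"
            )
        self.grad_acc_steps = self.global_batch_size // per_step


def parse_train_args(argv: Sequence[str]) -> TrainArguments:
    """Build TrainArguments from a single .json path or from --flag values."""
    if len(argv) == 1 and argv[0].endswith(".json"):
        with open(argv[0], encoding="utf-8") as fh:
            return TrainArguments(**json.load(fh))
    parser = argparse.ArgumentParser()
    for f in fields(TrainArguments):
        if f.init:  # skip derived fields such as grad_acc_steps
            parser.add_argument(
                f"--{f.name}", type=int, default=f.default, help=f.metadata.get("help")
            )
    return TrainArguments(**vars(parser.parse_args(list(argv))))
```

With something like this, a script could call `parse_train_args(sys.argv[1:])` and accept either a single JSON file path or individual `--tp_size 2 --dp_size 4` style flags, with no separate config-generation step. Inconsistent combinations fail immediately with a `ValueError` when the dataclass is constructed, rather than later in training.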

Looking forward to seeing this repository get better and better~

3outeille added the enhancement label Dec 19, 2024
@zzhhjjj
Collaborator

zzhhjjj commented Dec 19, 2024

Thanks for your advice! I agree that we can make it clearer and easier to use


3 participants