From 4306a910f1c24636158f934f1eba1f17a0a8809a Mon Sep 17 00:00:00 2001
From: Ezi Ozoani
Date: Thu, 23 Nov 2023 23:08:59 +0000
Subject: [PATCH 1/4] Update model-card-annotated.md

Updates to address [this issue](https://github.com/huggingface/hub-docs/issues/1125)

- Addition of training regime in the annotated model card to keep this doc and the template in sync.
- Defined training_regime, along with examples
---
 docs/hub/model-card-annotated.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/hub/model-card-annotated.md b/docs/hub/model-card-annotated.md
index f52e260d6..805a3f496 100644
--- a/docs/hub/model-card-annotated.md
+++ b/docs/hub/model-card-annotated.md
@@ -158,6 +158,9 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas
 
 ## Training Procedure [optional]
 
+_When you want to know what hardware you'll need to fine-tune a model, consider the following factors: the number of parameters in the model and the training regime you plan to use._
+
+_e.g For instance, a model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Ampere or higher hardware. Mixed fp16 requires at least 54GB of GPU memory._
 
 ### Preprocessing
 
@@ -166,6 +169,13 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas
 
 _Detail tokenization, resizing/rewriting (depending on the modality), etc._
 
+### Training Hyperparameters
+
+
+* **Training regime:** `training_regime`
+
+_Detail the model training process, specifically the type of precision used - whether it is **fp32/fp16/bf16** - and whether it is **mixed or non-mixed precision**?_
+
 
 ### Speeds, Sizes, Times
 

From 23ba9926583cac1d968f3a3e877d5d483216e548 Mon Sep 17 00:00:00 2001
From: Ezi Ozoani
Date: Thu, 23 Nov 2023 23:10:35 +0000
Subject: [PATCH 2/4] Update model-card-annotated.md

Sentence rephrasing
---
 docs/hub/model-card-annotated.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hub/model-card-annotated.md b/docs/hub/model-card-annotated.md
index 805a3f496..2efc0afea 100644
--- a/docs/hub/model-card-annotated.md
+++ b/docs/hub/model-card-annotated.md
@@ -160,7 +160,7 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas
 
 _When you want to know what hardware you'll need to fine-tune a model, consider the following factors: the number of parameters in the model and the training regime you plan to use._
 
-_e.g For instance, a model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Ampere or higher hardware. Mixed fp16 requires at least 54GB of GPU memory._
+_e.g. A model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Ampere or higher hardware. Mixed fp16 requires at least 54GB of GPU memory._
 
 ### Preprocessing
 

From 44c32a289a466666d3ace15b79747916988a623d Mon Sep 17 00:00:00 2001
From: Ezi Ozoani
Date: Fri, 24 Nov 2023 12:24:34 +0000
Subject: [PATCH 3/4] Update docs/hub/model-card-annotated.md

Co-authored-by: Pedro Cuenca
---
 docs/hub/model-card-annotated.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hub/model-card-annotated.md b/docs/hub/model-card-annotated.md
index 2efc0afea..3bcd73236 100644
--- a/docs/hub/model-card-annotated.md
+++ b/docs/hub/model-card-annotated.md
@@ -174,7 +174,7 @@ _Detail tokenization, resizing/rewriting (depending on the modality), etc._
 
 * **Training regime:** `training_regime`
 
-_Detail the model training process, specifically the type of precision used - whether it is **fp32/fp16/bf16** - and whether it is **mixed or non-mixed precision**?_
+_Detail the model training process, specifically the type of precision used - whether it is **fp32/fp16/bf16** - and whether it is **mixed or non-mixed precision**_
 
 ### Speeds, Sizes, Times
 

From 66f5f603047bca0a8cad77b078f702635548bf58 Mon Sep 17 00:00:00 2001
From: Ezi Ozoani
Date: Fri, 24 Nov 2023 12:24:43 +0000
Subject: [PATCH 4/4] Update docs/hub/model-card-annotated.md

Co-authored-by: Pedro Cuenca
---
 docs/hub/model-card-annotated.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hub/model-card-annotated.md b/docs/hub/model-card-annotated.md
index 3bcd73236..7b612a182 100644
--- a/docs/hub/model-card-annotated.md
+++ b/docs/hub/model-card-annotated.md
@@ -158,7 +158,7 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas
 
 ## Training Procedure [optional]
 
-_When you want to know what hardware you'll need to fine-tune a model, consider the following factors: the number of parameters in the model and the training regime you plan to use._
+_When you want to know what hardware you'll need to train or fine-tune a model, consider the following factors: the number of parameters in the model and the training regime you plan to use._
 
 _e.g. A model with 3B parameters and fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Ampere or higher hardware. Mixed fp16 requires at least 54GB of GPU memory._
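The GPU-memory figures this series adds (48GB for fp32, 24GB for bf16, 54GB for mixed fp16, all at 3B parameters) can be sanity-checked with a bytes-per-parameter estimate. A minimal sketch, assuming an Adam-style optimizer and commonly cited rule-of-thumb storage breakdowns; the byte counts below are assumptions chosen to match the doc's numbers, not part of the patch:

```python
GB = 1e9  # decimal gigabytes, to match the round numbers in the doc text

# Assumed bytes per parameter for full training/fine-tuning with an
# Adam-style optimizer (weights + gradients + optimizer states);
# activations, buffers, and fragmentation come on top of this.
BYTES_PER_PARAM = {
    "fp32": 4 + 4 + 8,         # fp32 weights + fp32 grads + optimizer = 16
    "bf16": 2 + 2 + 4,         # 16-bit weights/grads + reduced states =  8
    "fp16 mixed": 6 + 4 + 8,   # fp32 master + fp16 copy, grads, optim = 18
}

def training_memory_gb(n_params: float, regime: str) -> float:
    """Lower-bound GPU memory in GB for the given training regime."""
    return n_params * BYTES_PER_PARAM[regime] / GB

for regime in BYTES_PER_PARAM:
    print(f"3B params, {regime}: ~{training_memory_gb(3e9, regime):.0f} GB")
```

Under these assumptions a 3B-parameter model works out to 48, 24, and 54 GB respectively, which is where the figures in the added paragraph come from.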