Update model contribution guide #2254

Conversation
Thanks! Left some comments.
CONTRIBUTING_MODELS.md (Outdated):

```diff
  - [ ] Open an issue or find an issue to contribute a backbone model.

- ### Step 2: PR #1 - Add XXBackbone
+ ### Step 2: PR #1 - Model Folder
```
Do we want this many PRs? Might be better to ask for a single PR with backbone, initial task, and colab showing usage and results. Less likely to have incomplete model contributions.
I'd say definitely not this, we don't want people opening up PRs just to create empty model folders. That is just more review for us (with nothing of value in the PR).
CONTRIBUTING_MODELS.md (Outdated):

```diff
- ### Step 4: PR #3 - Add XX Presets
+ ### Step 5: PR #3 - Add `XX` Tasks and Preprocessors (Optional)
```
We might want to consider saying at least one task is not optional.
CONTRIBUTING_MODELS.md (Outdated):

```diff
- #### Unit Tests
+ [Example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_backbone.py#L187-L189)
```
You are adding a lot of outdated code links that no longer work. Please check all these!
```diff
  and return the dictionary in the form expected by the model.
  - New Task Models (e.g., TokenClassifier, ImageSegmentation)
  - Parameter-Efficient Fine-Tuning (LoRA support)
  - Quantization (QLoRA support)
```
What do we expect to be added for LoRA and QLoRA support? This seems kind of ill-defined.
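For concreteness, one way the guide could pin this down: "LoRA support" typically means a layer can add a trainable low-rank delta `B @ A` on top of a frozen weight matrix, so only the small factors are trained (Keras 3 layers such as `Dense` expose this idea via `enable_lora`). The sketch below is a hypothetical, framework-free illustration of that arithmetic, not KerasHub code:

```python
# Hypothetical sketch (not KerasHub code): LoRA replaces a frozen weight W with
# W + (alpha / rank) * (B @ A), where B and A are small trainable factors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def lora_effective_weight(w, a, b, alpha, rank):
    """Frozen weight w plus the scaled low-rank update b @ a."""
    delta = matmul(b, a)  # (out, in) matrix of rank <= `rank`
    scale = alpha / rank
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]

# Rank-1 example on a 2x2 weight: only B (2x1) and A (1x2) would be trained.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [0.0]]
a = [[0.5, 0.5]]
print(lora_effective_weight(w, a, b, alpha=2.0, rank=1))  # [[2.0, 1.0], [0.0, 1.0]]
```

With a definition like this in hand, "QLoRA support" can then be described as the same low-rank update applied on top of a quantized frozen weight.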
CONTRIBUTING_MODELS.md (Outdated):

```diff
  - [ ] Add `xx/xx_presets.py` with links to weights uploaded to Kaggle KerasHub
    [Example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_presets.py)

+ - [ ] Stage the model presets on KerasHub’s [Kaggle org page](https://www.kaggle.com/organizations/kerashub) using this [invite link](https://kaggle.com/organizations/kerashub/invite/c4b8baa532b8436e8df8f1ed641b9cb5)
```
We might not want to make this invite link public. Won't anyone be able to join the org with this? What kind of permissions does this get you (model creation? model deletion?).
```diff
- ### Step 4: PR #3 - Add XX Presets
+ #### Checkpoint Conversion Script (tools/checkpoint_conversion/convert_your_model_checkpoints.py)
```
We should mention timm/huggingface converters, and show one of those as the primary example.
Basically, we should say that our preferred mode of checkpoint conversion is to start from a supported conversion source (timm, transformers) and write a built-in library converter. Then the checkpoint conversion tool is just a thin wrapper around this converter.
Alternately (and more advanced) would be to write a converter from another format directly in the tools/ script.
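To illustrate the split being proposed here, a minimal hypothetical sketch (all weight names and functions below are invented, not KerasHub's actual converter API): the built-in converter is a pure name/tensor mapping that can be unit tested, and the tools/ script only orchestrates it.

```python
# Hypothetical sketch of "built-in converter + thin tools/ wrapper".
# The name map and weight names are invented for illustration only.

# Built-in library converter: maps source-framework weight names (e.g., a
# transformers state_dict) to KerasHub-style names. Pure and easy to test.
NAME_MAP = {
    "embeddings.word_embeddings.weight": "token_embedding/embeddings",
    "encoder.layer.0.attention.self.query.weight": (
        "transformer_layer_0/self_attention/query/kernel"
    ),
}

def convert_weights(source_state_dict):
    """Return a dict of KerasHub-style weight names -> tensors."""
    converted = {}
    for src_name, tensor in source_state_dict.items():
        if src_name not in NAME_MAP:
            raise ValueError(f"Unmapped source weight: {src_name}")
        converted[NAME_MAP[src_name]] = tensor
    return converted

# Thin tools/checkpoint_conversion wrapper: in a real script this would
# download the source checkpoint, call the converter, and save the preset.
def main():
    source = {"embeddings.word_embeddings.weight": [[0.1, 0.2]]}
    keras_weights = convert_weights(source)
    print(sorted(keras_weights))

if __name__ == "__main__":
    main()
```

Keeping the mapping in the library (rather than only in tools/) is what lets the same converter back both the one-off conversion script and any built-in `from_preset`-style loading of external checkpoints.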
```diff
  ##### Implementation

+ - **Text**: `XXTokenizer`, subclassing from KerasHub tokenizers
```
Probably should link these code files (tokenizer, imageconverter, audioconverter).
/gemini review
Code Review
This pull request updates the model contribution guide, providing a clearer and more comprehensive structure for contributors. Suggestions include fixing a potentially inaccessible link, correcting a typo, and improving section structure for enhanced clarity.
```diff
  - [ ] A Colab notebook link in the PR description that matches the outputs of
    the implemented backbone model with the original source
    [Example](https://colab.sandbox.google.com/drive/1R99yFJCbxTEpcxFHa2RtlwQWahUIPCJC?usp=sharing)
```
```diff
  - Major changes: [`DebertaV3`](https://github.com/keras-team/keras-hub/tree/master/keras_hub/models/deberta_v3)
  - Minor tweaks: [Whisper attention layer](https://github.com/keras-team/keras-hub/pull/801/files#diff-8533ae3a7755c0dbe95ccbb71f85c677297f687bf3884fadefc64f1d0fdce51aR22)

  Do **not** include `from_presets()` in this PR.
```
There's a typo in the method name. The method is `from_preset()` (singular), but here it's written as `from_presets()` (plural). This is inconsistent with the KerasHub API and another mention in this guide (line 259). Correcting this will avoid confusion for contributors.
```diff
- Do **not** include `from_presets()` in this PR.
+ Do **not** include `from_preset()` in this PR.
```
```diff
  - Provide a script that converts weights from their original format (e.g.,
    PyTorch .bin, TensorFlow SavedModel) to the Keras H5 format expected by KerasHub.
  - This script should be reusable and clearly documented.
  - It's crucial for verifying weight conversion accuracy and for future updates.
  End-to-End Validation Colab
  - This is the most important validation step.
```
```diff
  End-to-End Validation Colab
  - This is the most important validation step.

  After wrapping up the preset configuration file, you need to
  add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
  and `DistilBertTokenizer`. Here is an
  [example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_backbone.py#L187-L189).
  #### End-to-End Colab

  The testing for presets is divided into two: "large" and "extra large".
  For "large" tests, we pick the smallest preset (in terms of number of parameters)
  and verify whether the output is correct. For "extra large tests", we loop over
  all the presets and just check whether the backbone and the tokenizer can
  be called without any error.
  - Load task model using `from_preset()`
  - Run task (e.g., classification, generation)
  - Compare output with original model
```
This section has a redundant and confusing structure with two similar headings (`End-to-End Validation Colab` and `End-to-End Colab`). I suggest merging them into a single, clear section for the end-to-end validation Colab to improve readability.
```diff
- End-to-End Validation Colab
- - This is the most important validation step.
- After wrapping up the preset configuration file, you need to
- add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
- and `DistilBertTokenizer`. Here is an
- [example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_backbone.py#L187-L189).
- #### End-to-End Colab
- The testing for presets is divided into two: "large" and "extra large".
- For "large" tests, we pick the smallest preset (in terms of number of parameters)
- and verify whether the output is correct. For "extra large tests", we loop over
- all the presets and just check whether the backbone and the tokenizer can
- be called without any error.
- - Load task model using `from_preset()`
- - Run task (e.g., classification, generation)
- - Compare output with original model
+ #### End-to-End Validation Colab
+ This is the most important validation step.
+ - Load task model using `from_preset()`
+ - Run task (e.g., classification, generation)
+ - Compare output with original model
```
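On the "compare output with original model" step: since ported weights rarely reproduce the reference bit-for-bit, the comparison is usually done within a numeric tolerance. A hypothetical, framework-free sketch of that check (function names invented for illustration):

```python
# Hypothetical sketch of the numeric comparison done in a validation Colab:
# the ported model's outputs should match the original implementation's
# outputs within a small absolute tolerance, not exactly.

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two flat output lists."""
    return max(abs(x - y) for x, y in zip(a, b))

def outputs_match(ported_out, reference_out, atol=1e-5):
    """True if both outputs have the same length and agree within atol."""
    # Explicit length check: zip() alone would silently truncate.
    if len(ported_out) != len(reference_out):
        return False
    return max_abs_diff(ported_out, reference_out) <= atol

# Example: logits from the ported backbone vs. the original implementation.
ported = [0.1234, -1.0, 3.14159]
reference = [0.1234, -0.999999, 3.14159]
print(outputs_match(ported, reference))  # True: only tiny float drift
```

In practice the tolerance depends on dtype and backend (float32 vs. mixed precision), so the Colab should state which tolerance was used.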
Since KerasHub has evolved, this PR updates the model contribution guide to match. This should make contributions more consistent and lower review time.
Markdown preview here: https://github.com/divyashreepathihalli/keras-nlp/blob/contributing_guide/CONTRIBUTING_MODELS.md