Update model contribution guide #2254

Open · wants to merge 5 commits into master

Conversation

@divyashreepathihalli (Collaborator) commented May 16, 2025

Since KerasHub has evolved, this PR updates the model contribution guide. This will make contributions more consistent and lower review time.
Markdown preview here
https://github.com/divyashreepathihalli/keras-nlp/blob/contributing_guide/CONTRIBUTING_MODELS.md

divyashreepathihalli marked this pull request as draft May 16, 2025 20:27
divyashreepathihalli changed the title "Update contributing guide" to "[WIP] Update contributing guide" May 16, 2025
divyashreepathihalli changed the title "[WIP] Update contributing guide" to "Update contributing guide" May 16, 2025
divyashreepathihalli marked this pull request as ready for review May 16, 2025 23:02
divyashreepathihalli changed the title "Update contributing guide" to "Update model contribution guide" May 16, 2025
@mattdangerw (Member) left a comment

Thanks! Left some comments.


- [ ] Open an issue or find an issue to contribute a backbone model.

### Step 2: PR #1 - Add XXBackbone
### Step 2: PR #1 - Model Folder
Member

Do we want this many PRs? Might be better to ask for a single PR with backbone, initial task, and colab showing usage and results. Less likely to have incomplete model contributions.

I'd say definitely not this, we don't want people opening up PRs just to create empty model folders. That is just more review for us (with nothing of value in the PR).


### Step 4: PR #3 - Add XX Presets
### Step 5: PR #3 - Add `XX` Tasks and Preprocessors (Optional)
Member

We might want to consider saying at least one task is not optional.


#### Unit Tests
[Example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_backbone.py#L187-L189)
Member

You are adding a lot of outdated code links that no longer work. Please check all these!

and return the dictionary in the form expected by the model.
- New Task Models (e.g., TokenClassifier, ImageSegmentation)
- Parameter-Efficient Fine-Tuning (LoRA support)
- Quantization (QLoRA support)
Member

What do we expect to be added for LoRA and QLoRA support? This seems kind of ill-defined.
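
For context, one plausible reading (an assumption, not something the guide defines) is that "LoRA support" means the backbone works with Keras 3's built-in LoRA hooks. A minimal sketch, assuming a hypothetical `xx_base_en` preset:

```python
import keras_hub

# Sketch only: "xx_base_en" is a hypothetical preset name.
backbone = keras_hub.models.Backbone.from_preset("xx_base_en")

# Keras 3 exposes LoRA via enable_lora(); this freezes the base weights
# and injects rank-4 low-rank adapters into the layers that support it.
backbone.enable_lora(rank=4)
```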

- [ ] Add `xx/xx_presets.py` with links to weights uploaded to Kaggle KerasHub
[Example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_presets.py)

- [ ] Stage the model presets on KerasHub’s [Kaggle org page](https://www.kaggle.com/organizations/kerashub) using this [invite link](https://kaggle.com/organizations/kerashub/invite/c4b8baa532b8436e8df8f1ed641b9cb5)
Member

We might not want to make this invite link public. Won't anyone be able to join the org with this? What kind of permissions does this get you (model creation? model deletion?).
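
As for the presets file referenced in the checklist above, a hedged sketch of what `xx_presets.py` typically contains, modeled on the linked DistilBERT presets file; the handle and metadata values are placeholders:

```python
# Hypothetical xx_presets.py; all values below are placeholders.
backbone_presets = {
    "xx_base_en": {
        "metadata": {
            "description": "Base XX model trained on English text.",
            "params": 110000000,
            "path": "xx",
        },
        "kaggle_handle": "kaggle://kerashub/xx/keras/xx_base_en/1",
    },
}
```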


### Step 4: PR #3 - Add XX Presets
#### Checkpoint Conversion Script (tools/checkpoint_conversion/convert_your_model_checkpoints.py)
Member

We should mention the timm/huggingface converters, and show one of those as the primary example.

Basically, we should say that our preferred mode of checkpoint conversion is to go from a supported conversion source (timm, transformers) and write a built-in library converter. Then the checkpoint conversion tool is just a thin wrapper around this converter.

Alternatively (and more advanced) would be to write a converter from another format directly in the tools/ script.
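
A minimal sketch of that thin-wrapper approach, assuming the model has a built-in transformers converter; the `hf://` handle and preset names are placeholders:

```python
# Hypothetical tools/checkpoint_conversion/convert_xx_checkpoints.py.
import keras_hub

# The built-in converter does the actual weight translation when loading
# from a supported Hugging Face checkpoint.
backbone = keras_hub.models.Backbone.from_preset("hf://org/xx-base")
tokenizer = keras_hub.tokenizers.Tokenizer.from_preset("hf://org/xx-base")

# Save the converted preset locally, then push it to Kaggle.
backbone.save_to_preset("./xx_base_en")
tokenizer.save_to_preset("./xx_base_en")
keras_hub.upload_preset("kaggle://kerashub/xx/keras/xx_base_en", "./xx_base_en")
```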


##### Implementation

- **Text**: `XXTokenizer`, subclassing from KerasHub tokenizers
Member

Probably should link these code files (tokenizer, imageconverter, audioconverter).
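
For illustration, a hedged sketch of the text tokenizer pattern, modeled on the DistilBERT tokenizer; `XXTokenizer` and its special tokens are placeholders:

```python
from keras_hub.src.api_export import keras_hub_export
from keras_hub.src.tokenizers.word_piece_tokenizer import WordPieceTokenizer


@keras_hub_export("keras_hub.models.XXTokenizer")
class XXTokenizer(WordPieceTokenizer):
    """Hypothetical XX tokenizer built on WordPiece."""

    backbone_cls = None  # point this at XXBackbone in a real contribution

    def __init__(self, vocabulary=None, lowercase=False, **kwargs):
        # Register special tokens before calling the parent constructor.
        self._add_special_token("[CLS]", "cls_token")
        self._add_special_token("[SEP]", "sep_token")
        self._add_special_token("[PAD]", "pad_token")
        super().__init__(vocabulary=vocabulary, lowercase=lowercase, **kwargs)
```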

@divyashreepathihalli (Collaborator, Author)

/gemini review

@gemini-code-assist bot left a comment

Code Review

This pull request updates the model contribution guide, providing a clearer and more comprehensive structure for contributors. Suggestions include fixing a potentially inaccessible link, correcting a typo, and improving section structure for enhanced clarity.


- [ ] A Colab notebook link in the PR description that matches the outputs of
the implemented backbone model with the original source
[Example](https://colab.sandbox.google.com/drive/1R99yFJCbxTEpcxFHa2RtlwQWahUIPCJC?usp=sharing)

high

The example Colab link points to colab.sandbox.google.com, which is typically used for internal Google sandboxes and may not be accessible to external contributors. To ensure everyone can view the example, please update this to a publicly accessible colab.research.google.com link.
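
For context, the numerics check such a validation Colab typically performs looks roughly like this sketch; the model names, paths, and tolerance are assumptions:

```python
import numpy as np
import torch
import keras_hub
from transformers import AutoModel, AutoTokenizer

# "org/xx-base" and "./xx_base_en" are placeholders.
hf_model = AutoModel.from_pretrained("org/xx-base")
hf_tokenizer = AutoTokenizer.from_pretrained("org/xx-base")
keras_backbone = keras_hub.models.Backbone.from_preset("./xx_base_en")

# Tokenize once with the reference tokenizer and feed identical ids to
# both models so only the backbone numerics are compared.
inputs = hf_tokenizer(["The quick brown fox."], return_tensors="pt")
with torch.no_grad():
    hf_output = hf_model(**inputs).last_hidden_state.numpy()

keras_output = keras_backbone({
    "token_ids": inputs["input_ids"].numpy(),
    "padding_mask": inputs["attention_mask"].numpy().astype(bool),
})

print("max abs diff:", np.max(np.abs(hf_output - np.asarray(keras_output))))
assert np.allclose(hf_output, keras_output, atol=1e-4)  # tolerance is a judgment call
```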

- Major changes: [`DebertaV3`](https://github.com/keras-team/keras-hub/tree/master/keras_hub/models/deberta_v3)
- Minor tweaks: [Whisper attention layer](https://github.com/keras-team/keras-hub/pull/801/files#diff-8533ae3a7755c0dbe95ccbb71f85c677297f687bf3884fadefc64f1d0fdce51aR22)

Do **not** include `from_presets()` in this PR.

medium

There's a typo in the method name. The method is `from_preset()` (singular), but here it's written as `from_presets()` (plural). This is inconsistent with the KerasHub API and another mention in this guide (line 259). Correcting this will avoid confusion for contributors.

Suggested change
Do **not** include `from_presets()` in this PR.
Do **not** include `from_preset()` in this PR.
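
For reference, a usage example with the singular name, using an existing DistilBERT preset:

```python
import keras_hub

# from_preset() is the constructor used across the library.
backbone = keras_hub.models.DistilBertBackbone.from_preset(
    "distil_bert_base_en_uncased"
)
tokenizer = keras_hub.models.DistilBertTokenizer.from_preset(
    "distil_bert_base_en_uncased"
)
```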

Comment on lines +250 to +255
- Provide a script that converts weights from their original format (e.g.,
PyTorch .bin, TensorFlow SavedModel) to the Keras H5 format expected by KerasHub.
- This script should be reusable and clearly documented.
- It's crucial for verifying weight conversion accuracy and for future updates.
End-to-End Validation Colab
- This is the most important validation step.

medium

The description for the Checkpoint Conversion Script is missing a period at the end of the sentence.

Comment on lines +254 to +261
End-to-End Validation Colab
- This is the most important validation step.

After wrapping up the preset configuration file, you need to
add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
and `DistilBertTokenizer`. Here is an
[example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_backbone.py#L187-L189).
#### End-to-End Colab

The testing for presets is divided into two: "large" and "extra large".
For "large" tests, we pick the smallest preset (in terms of number of parameters)
and verify whether the output is correct. For "extra large tests", we loop over
all the presets and just check whether the backbone and the tokenizer can
be called without any error.
- Load task model using `from_preset()`
- Run task (e.g., classification, generation)
- Compare output with original model

medium

This section has a redundant and confusing structure with two similar headings (End-to-End Validation Colab and End-to-End Colab). I suggest merging them into a single, clear section for the end-to-end validation Colab to improve readability.

Suggested change
End-to-End Validation Colab
- This is the most important validation step.
After wrapping up the preset configuration file, you need to
add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
and `DistilBertTokenizer`. Here is an
[example](https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/distil_bert/distil_bert_backbone.py#L187-L189).
#### End-to-End Colab
The testing for presets is divided into two: "large" and "extra large".
For "large" tests, we pick the smallest preset (in terms of number of parameters)
and verify whether the output is correct. For "extra large tests", we loop over
all the presets and just check whether the backbone and the tokenizer can
be called without any error.
- Load task model using `from_preset()`
- Run task (e.g., classification, generation)
- Compare output with original model
#### End-to-End Validation Colab
This is the most important validation step.
- Load task model using `from_preset()`
- Run task (e.g., classification, generation)
- Compare output with original model
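
To make the "large" / "extra large" split concrete, a hedged sketch of the preset tests, modeled on existing backbone tests; `XXBackbone` and the preset name are placeholders, and `run_preset_test` is the helper on KerasHub's `TestCase`:

```python
import numpy as np
import pytest

from keras_hub.src.tests.test_case import TestCase
# XXBackbone is hypothetical; import your real backbone class instead.


class XXBackboneTest(TestCase):
    def setUp(self):
        self.input_data = {
            "token_ids": np.ones((2, 5), dtype="int32"),
            "padding_mask": np.ones((2, 5), dtype="int32"),
        }

    @pytest.mark.large
    def test_smallest_preset(self):
        # "large": verify correct output on the smallest preset.
        self.run_preset_test(
            cls=XXBackbone,
            preset="xx_base_en",
            input_data=self.input_data,
        )

    @pytest.mark.extra_large
    def test_all_presets(self):
        # "extra large": just check every preset loads and runs.
        for preset in XXBackbone.presets:
            self.run_preset_test(
                cls=XXBackbone,
                preset=preset,
                input_data=self.input_data,
            )
```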
