
Add detailed ConvBERT model card with usage, architecture, and refere… #38470


Open
wants to merge 1 commit into main

Conversation


@Aesha19 Aesha19 commented May 29, 2025

What does this PR do?

This PR adds a detailed and standardized model card for ConvBERT to improve Hugging Face Transformers documentation.

Includes:

  • Model Overview and Architecture
  • Training objective and dataset details
  • Use cases and limitations
  • Code usage examples via pipeline, AutoModel, and CLI (see the usage sketch after this list)
  • Quantization and AttentionMaskVisualizer support (see the second sketch below)
  • Benchmarks and citation
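
The model card's usage examples follow the standard Transformers patterns. A minimal sketch of the pipeline and AutoModel paths, assuming the published YituTech/conv-bert-base checkpoint (a checkpoint shipped without a masked-LM head will warn about newly initialized weights when used with fill-mask):

```python
import torch
from transformers import pipeline, AutoTokenizer, AutoModel

# High-level pipeline API: fill-mask matches ConvBERT's masked language
# modeling pretraining objective
fill_mask = pipeline("fill-mask", model="YituTech/conv-bert-base")
print(fill_mask("The goal of life is [MASK]."))

# Lower-level AutoModel API: extract contextual embeddings directly
tokenizer = AutoTokenizer.from_pretrained("YituTech/conv-bert-base")
model = AutoModel.from_pretrained("YituTech/conv-bert-base")

inputs = tokenizer("ConvBERT mixes convolution and attention heads.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)
```

For the quantization and visualization items, a sketch along these lines, assuming bitsandbytes is installed and a CUDA device is available (8-bit loading is a generic Transformers feature rather than anything ConvBERT-specific), and that the installed transformers version ships AttentionMaskVisualizer:

```python
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# Generic 8-bit quantization via bitsandbytes; requires a GPU
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "YituTech/conv-bert-base",
    quantization_config=bnb_config,
)

# Attention mask visualization (the import path may differ across versions)
from transformers.utils.attention_visualizer import AttentionMaskVisualizer

visualizer = AttentionMaskVisualizer("YituTech/conv-bert-base")
visualizer("ConvBERT mixes convolution and [MASK] heads.")
```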

File added:

  • src/transformers/models/convbert/modelcard.md

This contribution helps improve model discoverability and provides users with accessible and actionable information about ConvBERT.

cc: @stevhliu (documentation reviewer)


Fixes: N/A


Aesha19 commented May 29, 2025

Hi! This is my first contribution. Just kindly checking in for a review.
cc: @stevhliu


@stevhliu stevhliu left a comment


Thanks, this is a nice start but I think you should revisit the template in the issue to make sure your model card follows the same format and language! As an example, take a look at the BERT docs :)

@@ -0,0 +1,125 @@
<!-- ConvBERT model card -->

You don't have to remove this

Suggested change
<!-- ConvBERT model card -->
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->

Comment on lines +5 to +9
<div style="float: right;">
<div class="flex flex-wrap space-x-1">
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
</div>
</div>

Missing the TensorFlow badge and this should go above # ConvBERT
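
A possible concrete fix, mirroring the badge markup shown above (the TensorFlow badge follows the same shields.io pattern used by other model cards; treat the exact color value as an assumption):

```html
<!-- Placed above the "# ConvBERT" heading, per the review comment -->
<div style="float: right;">
    <div class="flex flex-wrap space-x-1">
        <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
        <img alt="TensorFlow" src="https://img.shields.io/badge/TensorFlow-FF6F00?style=flat&logo=tensorflow&logoColor=white">
    </div>
</div>
```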

Comment on lines +11 to +13
---

## Model Overview

Suggested change
---
## Model Overview


## Model Overview

ConvBERT is a lightweight and efficient NLP transformer model introduced by YituTech. It improves on the classic BERT architecture by incorporating **span-based dynamic convolutions** into the self-attention mechanism. This hybrid approach enables ConvBERT to model both local and global dependencies more effectively while reducing the computational cost.

Suggested change
ConvBERT is a lightweight and efficient NLP transformer model introduced by YituTech. It improves on the classic BERT architecture by incorporating **span-based dynamic convolutions** into the self-attention mechanism. This hybrid approach enables ConvBERT to model both local and global dependencies more effectively while reducing the computational cost.
[ConvBERT](https://huggingface.co/papers/2008.02496) incorporates a mixed attention block that makes it more efficient than [BERT](./bert). Attention is costly because it models global word relationships. This is inefficient because some heads only learn local word relationships. ConvBERT replaces some of the attention heads with a convolution head to handle this. The result of this new mixed attention design is a more lightweight model with lower training costs without compromising performance.
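
The mixed attention design described in the suggested text surfaces directly in the model's configuration. A minimal sketch, using the library defaults for illustration (head_ratio controls how many attention heads are reduced in favor of convolution heads, and conv_kernel_size sets the span of the dynamic convolution):

```python
from transformers import ConvBertConfig, ConvBertModel

config = ConvBertConfig()
print(config.num_attention_heads)  # base number of attention heads (default 12)
print(config.head_ratio)           # head reduction ratio in the mixed attention block (default 2)
print(config.conv_kernel_size)     # span of the dynamic convolution kernel (default 9)

# Randomly initialized model from the config, for architecture inspection only
model = ConvBertModel(config)
print(sum(p.numel() for p in model.parameters()))
```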
