Support for Exporting Specific Sub-Modules (e.g., Encoder, Decoder) #2148

happyme531 opened this issue Jan 3, 2025 · 0 comments
Feature request

Currently, when converting transformer models (such as T5, but potentially others) to ONNX with the Optimum library, a single ONNX file encompassing the entire model architecture (both encoder and decoder) appears to be generated. This happens regardless of which task option is selected during conversion:

```
optimum-cli export onnx --model . . --task text-classification
optimum-cli export onnx --model . . --task feature-extraction
```

I propose a feature that provides users with more granular control over the ONNX export process. Specifically, this feature should allow users to selectively export specific sub-modules of a transformer model, such as:

  • Only the encoder
  • Only the decoder
  • Potentially other distinct components of the model

This enhancement would enable users to optimize ONNX models for specific use cases where only a portion of the full model is required.
Evidence of both the feasibility of and the need for this feature is the existence of separately exported encoder and decoder ONNX models for various transformer architectures on Hugging Face.

Motivation

I am encountering a limitation with the current ONNX export functionality in Optimum. When converting transformer models, the resulting ONNX file invariably includes the entire model, even when I only require a specific part, like the encoder.

This is frustrating because:

  • Increased Model Size: The generated ONNX model is larger than necessary, consuming more storage and potentially impacting loading times.
  • Performance Overhead: When deploying the ONNX model for tasks that only utilize a specific sub-module (e.g., using only the encoder for embedding generation), the presence of the unnecessary decoder can introduce performance overhead.
  • Lack of Flexibility: The current approach lacks the flexibility to tailor the exported ONNX model to specific application needs.

As observed on Hugging Face, users have successfully exported individual components (like encoders and decoders) of various transformer models to ONNX. This indicates that it's technically possible and a desirable workflow. The Optimum library should provide a more direct and user-friendly way to achieve this without requiring manual workarounds.

Your contribution

While my direct expertise in the internal workings of the Optimum library for ONNX export is limited, I am willing to contribute by:

  • Testing: Thoroughly testing any implementation of this feature on various transformer models.
  • Providing Feedback: Offering detailed feedback on the usability and effectiveness of the new feature.
  • Sharing Use Cases: Providing specific use cases and examples that highlight the benefits of this functionality.