Feature request
Currently, converting transformer models (T5, and potentially others) to ONNX with the Optimum library appears to produce a single ONNX file covering the entire model architecture (both encoder and decoder). This happens regardless of the task option selected during conversion.
I propose a feature that provides users with more granular control over the ONNX export process. Specifically, this feature should allow users to selectively export specific sub-modules of a transformer model, such as:
Only the encoder
Only the decoder
Potentially other distinct components of the model
This enhancement would enable users to optimize ONNX models for specific use cases where only a portion of the full model is required.
Both the feasibility of this and the need for it are shown by the separately exported encoder and decoder ONNX models that already exist for various transformer architectures on Hugging Face:
Motivation
I am encountering a limitation with the current ONNX export functionality in Optimum. When converting transformer models, the resulting ONNX file invariably includes the entire model, even when I only require a specific part, like the encoder.
This is frustrating because:
Increased Model Size: The generated ONNX model is larger than necessary, consuming more storage and potentially impacting loading times.
Performance Overhead: When deploying the ONNX model for tasks that only utilize a specific sub-module (e.g., using only the encoder for embedding generation), the presence of the unnecessary decoder can introduce performance overhead.
Lack of Flexibility: The current approach lacks the flexibility to tailor the exported ONNX model to specific application needs.
As observed on Hugging Face, users have successfully exported individual components (like encoders and decoders) of various transformer models to ONNX. This indicates that the workflow is both technically possible and desirable. The Optimum library should provide a more direct and user-friendly way to achieve this without requiring manual workarounds.
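To make the request concrete, here is a minimal sketch of what such a manual workaround might look like today, using plain PyTorch rather than Optimum. The helper names `resolve_submodule` and `export_submodule` are hypothetical and are not part of Optimum or Transformers; the only library call assumed is `torch.onnx.export`, and I have not verified that tracing succeeds for every architecture.

```python
from functools import reduce


def resolve_submodule(model, dotted_path):
    """Return the attribute reached by a dotted path such as "encoder"."""
    # Hypothetical helper: follows plain attribute names from the model root.
    return reduce(getattr, dotted_path.split("."), model)


def export_submodule(model, dotted_path, example_inputs, output_path, **export_kwargs):
    """Export only the selected sub-module to ONNX (a sketch, not Optimum's API)."""
    import torch  # imported lazily so resolve_submodule works without PyTorch

    submodule = resolve_submodule(model, dotted_path)
    submodule.eval()  # switch off dropout before tracing
    with torch.no_grad():
        # example_inputs is a tuple of tensors passed positionally to forward()
        torch.onnx.export(submodule, example_inputs, output_path, **export_kwargs)
    return output_path
```

For a T5 model loaded with Transformers, the intended call would be something like `export_submodule(model, "encoder", (input_ids, attention_mask), "encoder.onnx", input_names=["input_ids", "attention_mask"])`. Ideally Optimum would expose an equivalent option directly, so that users do not need to maintain this kind of glue code themselves.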
Your contribution
While my direct expertise in the internal workings of the Optimum library for ONNX export is limited, I am willing to contribute by:
Testing: Thoroughly testing any implementation of this feature on various transformer models.
Providing Feedback: Offering detailed feedback on the usability and effectiveness of the new feature.
Sharing Use Cases: Providing specific use cases and examples that highlight the benefits of this functionality.