-
Notifications
You must be signed in to change notification settings - Fork 54
Don't embed external initializers into the proto to avoid 2GB limit #817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: ovep-develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR modifies the model proto generation to avoid embedding external initializers directly into the proto, preventing potential issues when models exceed the 2GB protobuf limit. The changes introduce a mechanism to handle initializers with external data by updating their metadata to use a special memory address marker instead of including the actual data.
Key changes:
- Add logic to exclude initializer data from proto generation when
load_user_initializer_
is enabled - Implement metadata updating for external initializers using the
*/_ORT_MEM_ADDR_/*
marker - Add comprehensive logging for debugging initializer processing
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
fd29454
to
2b34773
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
// but then we have to update those initializers and set the external_data fields to mem_addr tag... | ||
// proto is limited to 2GB, but let's use 512MB as threshold to be conservative and still gain some memory reductions. | ||
constexpr size_t MAX_EMBEDDED_INITIALIZER_SIZE = 1024 * 1024 * 512; | ||
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights == true && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition session_context_.has_external_weights == true
is redundant. Simply use session_context_.has_external_weights
since it's already a boolean.
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights == true && | |
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights && |
Copilot uses AI. Check for mistakes.
subgraph.ToProto(*model_proto->mutable_graph(), true, true); | ||
subgraph.ToProto(*model_proto->mutable_graph(), /*include_initializers*/true, | ||
/*include_outer_scope_args*/true, /*execution_order*/0, /*include_initializer_data*/include_initializer_data_in_proto); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the trailing whitespace on this empty line.
Copilot uses AI. Check for mistakes.
|
||
print_model_proto_duration(); | ||
DumpOpenVINOEPModel(onnx_model_path_name, model_proto.get(), fused_node); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the trailing whitespace on this empty line.
Copilot uses AI. Check for mistakes.
… restores the metadata so OV can read them back Signed-off-by: bfilipek <[email protected]>
…re are external initializers in memory (more than one) Signed-off-by: bfilipek <[email protected]>
Signed-off-by: bfilipek <[email protected]>
…t initializers, comments, refactoring Signed-off-by: bfilipek <[email protected]>
7907cd7
to
ef6f23d
Compare
print_model_proto_duration(); | ||
DumpOpenVINOEPModel(onnx_model_path_name, model_proto.get(), fused_node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably dump the proto after we've finished transforming it in the case include_initializer_data_in_proto == false.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
openvinotoolkit/openvino#31632 was only added after OV in OV25.3, so we need to have a OV version check implemented. |
…r the logic is executed, check for OV version Signed-off-by: bfilipek <[email protected]>
Signed-off-by: bfilipek <[email protected]>
…ver 2GB to show the proto limit (when the new logic for ext initializers is enabled, then the test passes) Signed-off-by: bfilipek <[email protected]>
for (size_t i = 0; i < std::min<size_t>(10, floats_per_initializer); ++i) | ||
ASSERT_FLOAT_EQ(out_data[i], expected); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't hurt to check them all.
Signed-off-by: bfilipek <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights && | ||
external_initializers_offset_and_length.size() > 1 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition external_initializers_offset_and_length.size() > 1
seems arbitrary. Consider using a named constant or documenting why specifically more than 1 external initializer is required to trigger this optimization.
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights && | |
external_initializers_offset_and_length.size() > 1 && | |
// Optimization is only triggered if there is more than one external initializer, | |
// as the benefit of excluding data from the proto is only significant in that case. | |
constexpr size_t MIN_EXTERNAL_INITIALIZERS_FOR_OPTIMIZATION = 2; | |
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights && | |
external_initializers_offset_and_length.size() >= MIN_EXTERNAL_INITIALIZERS_FOR_OPTIMIZATION && |
Copilot uses AI. Check for mistakes.
|
||
LOGS(logger, VERBOSE) << "In-memory initializer EXT: " << src_init->name() << ", size: " << length; | ||
|
||
SetExternalDataFields(proto_init, (const void*)offset, length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Casting offset
(which is a size_t representing a memory offset) directly to (const void*)
is incorrect. The offset should be added to a base pointer to get the actual memory address, not used as a pointer itself.
SetExternalDataFields(proto_init, (const void*)offset, length); | |
SetExternalDataFields(proto_init, static_cast<const void*>(static_cast<const uint8_t*>(external_initializers_data) + offset), length); |
Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <[email protected]>
Description
When calling subgraph.ToProto don't save initializers, so that we don't bloat the model proto. This is especially important for models with ext initializers in memory that in total contain more than 2GB of data.
Once we have the list of those external initializers, we can update ther metadata inside the proto, so that they leverage the special marker:
*/_ORT_MEM_ADDR_/*
. (The support for this location was recently fixed inside OV with openvinotoolkit/openvino#31632)Motivation and Context
CVS-173057, CVS-172710
Todo/questions