Add size threshold to prevent constant folding from inflating model memory footprint#28204
Conversation
- Add `kOrtSessionOptionsConfigConstantFoldingNodeWeightSizeThreshold` to `onnxruntime_session_options_config_keys.h`
- Modify `ConstantFolding::ApplyImpl` to skip folding when the computed output exceeds the threshold
- Add `ConstantFoldingWithSizeThreshold` test to `graph_transform_test.cc`

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/68d58b05-bfab-48bc-9539-9d3ef5f4091e
Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>
```cpp
// If a size threshold was configured, check whether any output tensor exceeds it.
```
@copilot you need to move the test before the computation happens and use the estimated size before computation
Done in 38c9da7. The check now runs before the computation: for each output NodeArg, the elem type and all concrete dim values are read from the TypeAsProto, and estimated_size = num_elements * elem_size is compared against the threshold before OptimizerExecutionFrame is created or the kernel is run. If any dim is symbolic/unknown the pre-check is skipped for that output (safe fallback).
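The pre-computation check described in this reply can be sketched as a standalone helper. This is an illustration only, not the actual onnxruntime code: `DimValue` and `EstimateOutputSize` are hypothetical stand-ins for reading dims and element size out of the real `NodeArg`/`TypeAsProto` accessors.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a shape dim read from a NodeArg's TypeAsProto:
// a concrete value >= 0, or -1 when the dim is symbolic/unknown.
using DimValue = int64_t;

// Estimate the byte size of an output tensor before running the kernel.
// Returns std::nullopt when any dim is symbolic, mirroring the "safe
// fallback" above: the pre-check is skipped and the node proceeds normally.
std::optional<size_t> EstimateOutputSize(const std::vector<DimValue>& dims,
                                         size_t elem_size) {
  size_t num_elements = 1;
  for (DimValue d : dims) {
    if (d < 0) return std::nullopt;  // symbolic/unknown dimension
    num_elements *= static_cast<size_t>(d);
  }
  return num_elements * elem_size;  // estimated_size = num_elements * elem_size
}
```

For example, a float tensor with static shape {2, 3, 4} yields 2 × 3 × 4 × 4 = 96 bytes, computed without ever allocating the tensor; any symbolic dim disables the pre-check for that output.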
```cpp
for (const OrtValue& ort_value : fetches) {
  if (ort_value.IsTensor()) {
    const size_t tensor_size = ort_value.Get<Tensor>().SizeInBytes();
    if (tensor_size > output_size_threshold) {
```
@copilot if inputs are deleted, you need to take that into account and only consider the increase
Done in dea8bad. The threshold check now computes the net memory increase: total_output_size - freed_input_size, where freed_input_size sums the sizes of input initializers that are exclusively consumed by this node (consumer count == 1). If the net increase ≤ threshold the node is folded; if > threshold it is skipped. For the Unsqueeze model, inputs are exclusively consumed (same size as outputs), so net = 0 and any non-zero threshold still allows folding. The updated test reflects this: Case 2 (1-byte threshold) now asserts 0 Unsqueeze remaining. Cases 3a/3b add a Tile graph with a genuine 788-byte net increase and verify the threshold blocks (100 bytes) or allows (1000 bytes) folding.
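The net-increase rule from this reply can be expressed as a small decision helper. A minimal sketch, assuming the caller has already summed the output sizes and the sizes of input initializers with exactly one consumer; the function name is illustrative, not the actual onnxruntime implementation.

```cpp
#include <cassert>
#include <cstddef>

// Decide whether folding a node stays within the configured threshold.
// Only the *net* memory increase counts: initializers consumed exclusively
// by this node (consumer count == 1) are freed after folding, so their
// total size is subtracted. A threshold of 0 means the check is disabled.
bool FoldingWithinThreshold(size_t total_output_size,
                            size_t freed_input_size,
                            size_t threshold) {
  if (threshold == 0) return true;                         // check disabled
  if (freed_input_size >= total_output_size) return true;  // net increase <= 0
  return (total_output_size - freed_input_size) <= threshold;
}
```

With the numbers from this thread: the Unsqueeze case (freed inputs equal the outputs in size, net = 0) folds under any non-zero threshold, while the Tile case (788-byte net increase) is blocked at 100 bytes and allowed at 1000 bytes.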
Per reviewer feedback, estimate output size from NodeArg type/shape info before running the kernel. This avoids wasted computation when the folded output would exceed the configured threshold.

- Read `elem_type` and concrete shape dims from each output NodeArg
- If all dims are known static values, compute `estimated_size = num_elements * elem_size`
- If `estimated_size > output_size_threshold`, skip the node before `OptimizerExecutionFrame` is created
- When the shape is unknown (symbolic dims), proceed with computation (no pre-check possible)
- Remove the old post-computation threshold check

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/d9f1c887-e20e-4f28-b8f0-a7362922ce19
Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>
Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/505dd4fb-8403-4ebe-98cd-ec3a1ff16fd6 Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>
Description
Adds `kOrtSessionOptionsConfigConstantFoldingNodeWeightSizeThreshold` (`"session.constant_folding_node_weight_size_threshold"`), a session config option that caps the maximum net memory increase from constant-folding any single node. A node is skipped if `total_output_size - freed_input_size > threshold`, where `freed_input_size` accounts for input initializers that are exclusively consumed by the node being folded (and will be deleted after folding).

- `onnxruntime_session_options_config_keys.h`: New config key. Value is a non-negative decimal integer; `"0"` (default) disables the check, preserving existing behavior.
- `constant_folding.cc`: At the start of each `ApplyImpl` pass, reads the threshold once. For each candidate node, the estimated net size increase is computed from `NodeArg` type/shape info (element type × product of concrete dim values) before `OptimizerExecutionFrame` is created or any kernel is run. Input initializers with a single consumer are counted as freed. If the net increase exceeds the threshold, the node is skipped with an INFO-level log. When a dimension is symbolic or unknown, the pre-check is skipped and the node proceeds normally.
- `graph_transform_test.cc`: `ConstantFoldingWithSizeThreshold` verifies multiple scenarios: no threshold folds all nodes; a 1-byte threshold still folds Unsqueeze (net increase = 0, since inputs are exclusively consumed at equal size); a Tile graph with a 788-byte net increase is blocked by a 100-byte threshold and allowed by a 1000-byte threshold.

Example usage:
```cpp
SessionOptions so;
so.config_options.AddConfigEntry(
    kOrtSessionOptionsConfigConstantFoldingNodeWeightSizeThreshold,
    "1048576");  // 1 MB cap
```

Motivation and Context
Constant folding materializes computed tensors as graph initializers. Without a size limit, a single large folded node can dramatically increase the in-memory model size relative to the original. This threshold gives users control over the memory/optimization trade-off. The check accounts for inputs that will be freed after folding, so only the true net memory increase is compared against the threshold. The check is performed before computation to avoid wasting CPU and memory on tensors that will ultimately be discarded.