@mdvoretc-intel mdvoretc-intel commented Oct 21, 2025

Description

When rewiring the graph after eliminating QDQ pairs, the runtime now checks whether the type matches before and after the eliminated nodes and inserts a Cast node if there is a mismatch.

Motivation and Context

At present, QDQ elimination assumes the floating point type is the same before the QuantizeLinear node and after the following DequantizeLinear, producing errors if the types mismatch.

If feature goes to new ABI?

Yes

Jira Ticket :

CVS-175447

Kotomi-Du commented Oct 22, 2025

The Cast node is for the QuantizeLinear node, correct? Could you add the before and after graphs for this change? Will it introduce any performance penalty?

@mdvoretc-intel
Author

The cast is introduced when rewiring the graph to remove the QuantizeLinear and DequantizeLinear nodes, so that instead of a direct edge from, e.g., an f32 input to an f16 output, a Cast node is added between them. The before and after are:
[Image: cast_before]
[Image: cast_after]
The resulting Convert node in OV is then optimized away in the graph transformation stage, so it adds no specific performance impact.
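
The rewiring described above can be sketched on a toy graph. This is only an illustration under stated assumptions: `ElemType`, `Tensor`, `Node`, and `RewireAfterQdqRemoval` are hypothetical stand-ins invented for this example and are not the ONNX Runtime or OpenVINO EP API. The Cast target type is taken from the DQ output, as requested in review.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified graph structures (not the ONNX Runtime API).
enum class ElemType { Float32, Float16 };

struct Tensor {
  std::string name;
  ElemType type;
};

struct Node {
  std::string op_type;   // e.g. "Cast"
  std::string input;     // tensor consumed by the node
  std::string output;    // tensor produced by the node
  ElemType output_type;  // element type of the produced tensor
};

// Bridges the gap left by a removed QuantizeLinear/DequantizeLinear pair.
// q_input is the tensor that fed QuantizeLinear; dq_output is the tensor
// that DequantizeLinear used to produce. Returns the name of the tensor
// that downstream consumers should read after the pair is gone.
std::string RewireAfterQdqRemoval(const Tensor& q_input,
                                  const Tensor& dq_output,
                                  std::vector<Node>& graph) {
  assert(!q_input.name.empty() && !dq_output.name.empty());
  if (q_input.type == dq_output.type) {
    // Types agree: consumers can read the original input directly.
    return q_input.name;
  }
  // Types differ: insert a Cast that converts to the DQ output type and
  // keeps the DQ output name, so consumers need no further changes.
  graph.push_back(Node{"Cast", q_input.name, dq_output.name, dq_output.type});
  return dq_output.name;
}
```

With an f32 input and an f16 output, the sketch appends one Cast node; with matching types it leaves the graph untouched and returns the input name.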

@mklimenk mklimenk left a comment

Please use the output type of the DQ node directly and fix the C++ Lint warnings (C-style casts).

@mdvoretc-intel
Author

The issue that causes the use of a C-style cast for the input arg is that it is provided as const by OutputDefs but cannot be accepted as such by AddNode.

@mklimenk mklimenk left a comment

Let's roll back this unordered_map, which is mainly unused. As for the C-style cast, this warning could potentially be suppressed by using const_cast<T&>().
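
A minimal sketch of the const_cast suggestion, assuming a read-only accessor that hands out const pointers and a builder that only accepts mutable ones. `Arg`, `ProducerNode`, `GraphBuilder`, and `AsMutable` are hypothetical names for this example; the real ONNX Runtime types and signatures differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the argument and node types.
struct Arg {
  std::string name;
};

struct ProducerNode {
  std::vector<Arg> outputs;
  // Read-only view of the outputs, handed out as const pointers.
  std::vector<const Arg*> OutputDefs() const {
    std::vector<const Arg*> defs;
    for (const Arg& a : outputs) defs.push_back(&a);
    return defs;
  }
};

struct GraphBuilder {
  std::vector<Arg*> inputs;
  // Accepts only mutable pointers, mirroring the mismatch discussed above.
  void AddNode(const std::vector<Arg*>& node_inputs) {
    for (Arg* a : node_inputs) {
      assert(a != nullptr);
      inputs.push_back(a);
    }
  }
};

// Replaces a C-style cast with an explicit const_cast<T&>. Writing through
// the result is only well-defined if the underlying object was not
// originally declared const.
Arg& AsMutable(const Arg& arg) {
  return const_cast<Arg&>(arg);
}
```

A call site would then look like `builder.AddNode({&AsMutable(*node.OutputDefs()[0])});`, which makes the removal of constness explicit and easy to find in review.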

@mklimenk mklimenk left a comment

Looks good to me, thanks!

Kotomi-Du commented Oct 22, 2025

Could you open the PR to get review from reviewers with write access? Please also add the Jira ticket.

@mdvoretc-intel mdvoretc-intel marked this pull request as ready for review October 23, 2025 08:26
@MayureshV1 MayureshV1 changed the title from "[OVEP] Add a check for type mismatches in QDQ stripping" to "CVS-175447-[OVEP] Add a check for type mismatches in QDQ stripping" Oct 27, 2025
@Kotomi-Du

@mklimenk This feature only impacts GPU, no impact on NPU right?

@mklimenk

> @mklimenk This feature only impacts GPU, no impact on NPU right?

Yes, this code isn't called by the NPU path.

Copilot AI left a comment

Pull Request Overview

This PR adds type mismatch detection and automatic Cast node insertion during QDQ (Quantize-Dequantize) pair elimination in the OpenVINO provider. The change addresses issues where floating point types differed before QuantizeLinear and after DequantizeLinear nodes.

Key Changes:

  • Added type comparison logic before rewiring graph edges after QDQ elimination
  • Inserted automatic Cast node creation when type mismatches are detected between float and float16

@MayureshV1 MayureshV1 left a comment

Thanks for making the modifications. Looks good!

@MayureshV1 MayureshV1 merged commit 20de366 into intel:ovep-develop Oct 29, 2025
3 of 5 checks passed

4 participants