[RFC] Constant tensors caching pass #183
base: main
Conversation
%feature3 = linalg.matmul(%feature2, %foldedWeight2)
return %feature3
}
compiletime_fold(%weight1: tensor<*xbf16>) -> %foldedWeight0, %foldedWeight1: tensor<*xbf16>, tensor<*xbf16> {
`compiletime_fold` will be a problem because:
- The kernel binary will contain the whole tensor to be folded, which is too large. If we only want to use `runtime_fold` from it, the binary size is wasted, and this is not friendly for the kernel cache.
- For a GPU device, we may want to do the folding on the CPU. We shouldn't put three functions in the same module to achieve that.

If we want to support direct compile-time folding, I suggest following the direction of section 2.1 to implement it.
For the compile-time-available tensors, the integration can choose to:
- lower them to `arith.constant`, which is not suggested;
- put them into the arguments list of the module and mark them as `compiletime_const_args`;
- put them into the arguments list of the module and mark them as `runtime_const_args`.

The first two choices will be folded by `compiletime_fold`, and the third by `runtime_fold`. There will be no large literal tensors in the kernel for the last two choices.
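For illustration, a rough sketch of the last two choices in the pseudo-IR style used above (the argument-attribute spelling here is my assumption, not normative syntax):

// Choice 2: the weight is marked as a compile-time constant argument;
// compiletime_fold consumes it, so no literal tensor lands in the kernel.
compute(%weight1: tensor<*xbf16> {compiletime_const_args}, %feature0: tensor<*xbf16>) -> %feature3 { ... }

// Choice 3: the weight is marked as a runtime constant argument;
// runtime_fold folds it once at runtime and caches the folded result.
compute(%weight1: tensor<*xbf16> {runtime_const_args}, %feature0: tensor<*xbf16>) -> %feature3 { ... }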
> We shouldn't put three functions in the same module to achieve that.

I'm not sure whether we can generate a new module in the pass pipeline. If so, shall we put `compiletime_fold` in one module, and `runtime_fold` and `compute` in another module?
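If such a split is possible, a hedged sketch of the layout in the same pseudo-IR style (the module names here are hypothetical):

module @fold_on_host {  // used at compile time only; never shipped in the kernel binary
  compiletime_fold(%weight1: tensor<*xbf16>) -> %foldedWeight0, %foldedWeight1: tensor<*xbf16>, tensor<*xbf16> { ... }
}
module @kernel {  // shipped to the device; contains no large literal tensors
  runtime_fold(%weight2: tensor<*xbf16>) -> %foldedWeight2: tensor<*xbf16> { ... }
  compute(%feature0: tensor<*xbf16>, %foldedWeight0: tensor<*xbf16>, %foldedWeight1: tensor<*xbf16>, %foldedWeight2: tensor<*xbf16>) -> %feature3 { ... }
}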
> I'm not sure whether we can generate a new module in the pass pipeline. If so, shall we put `compiletime_fold` in one module, and `runtime_fold` and `compute` in another module?

I think so. But my current thinking is that we can support `compiletime_fold` in the future, when there's a demand for it.
OK, I will put all folding operations into `runtime_fold`.
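With all folding in `runtime_fold`, the earlier example would reduce to something like the following sketch (same pseudo-IR style; the exact weight names and counts are assumptions based on the visible fragment):

runtime_fold(%weight1: tensor<*xbf16>, %weight2: tensor<*xbf16>) -> %foldedWeight0, %foldedWeight1, %foldedWeight2: tensor<*xbf16>, tensor<*xbf16>, tensor<*xbf16> {
  // all constant folding happens here, once at runtime; results are cached
  ...
}
compute(%feature0: tensor<*xbf16>, %foldedWeight0: tensor<*xbf16>, %foldedWeight1: tensor<*xbf16>, %foldedWeight2: tensor<*xbf16>) -> %feature3 {
  ...
  %feature3 = linalg.matmul(%feature2, %foldedWeight2)
  return %feature3
}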
Waiting on the OV integration: how do they handle constant weights?
RFC LGTM
Do we have an idea of the performance impact of this pass on some example?
Ideally, the execution time of operations on the weights of matmuls, such as
Related to issue #146.