Conversation

@cmx-Y (Contributor) commented May 9, 2025

This PR reproduces the methods proposed in the Exo paper to accelerate matrix multiplication on Gemmini.

  1. This pass already implements the tile_outer_loops, fission_inner_blocks, and replace_gemmini_calls optimizations described in Exo. Support for lift_config and on-chip memory management will be added in the future.
  2. Currently, the pass only supports input dimensions that are multiples of 16; tail cases are not yet handled (see the guard sketch below).
  3. While developing this pass, we observed that the LegalizeForLLVM phase requires generating ConstantOp, so the mvin operation does not support dynamic shapes (e.g., memrefs with dynamic dimensions). Moreover, when a shape exceeds 16, the existing pipeline is likely to hit bugs. This pass, in contrast, handles dynamic shapes properly.
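As a rough illustration of the dimension constraint in item 2, a minimal sketch of the kind of guard such a pattern might apply before rewriting (the kTileSize constant and the surrounding pattern are assumptions for illustration, not the actual code in this PR):

// Hypothetical check inside the pattern's matchAndRewrite: reject
// static dimensions that are not multiples of the 16x16 Gemmini tile;
// dynamic dimensions are resolved at runtime via memref::DimOp instead.
constexpr int64_t kTileSize = 16; // assumed Gemmini systolic-array size
for (int64_t dim : ATy.getShape()) {
  if (!ShapedType::isDynamic(dim) && dim % kTileSize != 0)
    return failure(); // tail cases are not handled yet
}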

@linuxlonelyeagle (Member) left a comment:

I left a few brief comments.

linalg.fill
ins(%1 : i8)
outs(%mem1 : memref<32x32xi8>)
linalg.matmul
@linuxlonelyeagle (Member) commented:

Please add the content to be checked.

MatMulOptimize.cpp
MatMulVectorization.cpp
MatMulGemmini.cpp
MatMulParallelVectorization.cpp
@linuxlonelyeagle (Member) commented:

I think we can rearrange the C++ files here; use dictionary (lexicographic) order.
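For reference, dictionary order of the four files above would be:

MatMulGemmini.cpp
MatMulOptimize.cpp
MatMulParallelVectorization.cpp
MatMulVectorization.cpp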

ShapedType ATy = A.getType().cast<ShapedType>();
Type eleTy = ATy.getElementType();
ShapedType BTy = B.getType().cast<ShapedType>();
// ShapedType CTy = C.getType().cast<ShapedType>();
@linuxlonelyeagle (Member) commented:

Can we delete the commented-out code?

Value B = op->getOperand(1);
Value C = op->getOperand(2);
// Get shape of input and output
ShapedType ATy = A.getType().cast<ShapedType>();
@linuxlonelyeagle (Member) commented:

Use mlir::cast<ShapedType>(A.getType()).
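A minimal sketch of the suggested change, replacing the deprecated member cast with the free-function form:

// Before: member-function cast, deprecated in recent MLIR.
ShapedType ATy = A.getType().cast<ShapedType>();
// After: free-function form suggested by the reviewer.
ShapedType ATy = mlir::cast<ShapedType>(A.getType());
ShapedType BTy = mlir::cast<ShapedType>(B.getType());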

ShapedType BTy = B.getType().cast<ShapedType>();
// ShapedType CTy = C.getType().cast<ShapedType>();

auto ctx = op->getContext();
@linuxlonelyeagle (Member) commented:

It's best not to use auto.
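Spelled out, assuming the usual MLIR API (Operation::getContext() returns the owning context):

// Name the result type explicitly instead of using auto.
MLIRContext *ctx = op->getContext();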

auto ctx = op->getContext();
// Some constants.
const Value c0 =
rewriter.create<arith::ConstantOp>(loc, rewriter.getIndexAttr(0));
@linuxlonelyeagle (Member) commented:

Do not use const with Value.
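That is, the constant-building lines would drop the qualifier, e.g.:

// Value is a lightweight SSA handle; keep it non-const.
Value c0 =
    rewriter.create<arith::ConstantOp>(loc, rewriter.getIndexAttr(0));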

// This algorithm does not use the column A index.
// const Value aCol = rewriter.create<memref::DimOp>(loc, A, c1);
const Value bRow = rewriter.create<memref::DimOp>(loc, B, c0);
const Value bCol = rewriter.create<memref::DimOp>(loc, B, c1);
@linuxlonelyeagle (Member) commented:

The same applies here: drop the const qualifier on these Value declarations.

auto forI = rewriter.create<affine::AffineForOp> (
loc, 0, rewriter.getIndexAttr(ATy.getDimSize(0)).getInt(), 1, std::nullopt);
// auto forI = rewriter.create<affine::AffineForOp>(
// loc, ValueRange{c0}, rewriter.getDimIdentityMap(),
@linuxlonelyeagle (Member) commented:

Can we delete the commented-out code here?

SmallVector<affine::AffineForOp, 6> band;
band.push_back(forI);
band.push_back(forJ);
band.push_back(forK);
@linuxlonelyeagle (Member) commented:

SmallVector<affine::AffineForOp, 6> band = {forI, forJ, forK};
