Conversation

@Jiajun-Ji
Contributor

Fix AMX matmul B tile VNNI layout and rewrite conversion pass.
Enable hardware-accelerated matrix multiplication for DeepSeek-R1 using Intel AMX instructions with proper VNNI layout conversion.
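For readers unfamiliar with the VNNI layout mentioned above, here is a minimal sketch of the packing for BF16, where two consecutive K rows of B are interleaved into one tile row. It is plain Python on nested lists, not the conversion pass from this PR; the names `pack_b_vnni` and `matmul_with_vnni_b` are hypothetical, and the sketch does not model BF16 rounding or the AMX tile size limits.

```python
def pack_b_vnni(b, vnni=2):
    """Pack a K x N matrix into VNNI layout: (K / vnni) x (N * vnni).

    For BF16, vnni is 2: rows 2r and 2r+1 of B are interleaved so that
    packed[r][2*n + i] == b[2*r + i][n].
    """
    k, n = len(b), len(b[0])
    if k % vnni != 0:
        raise ValueError("K must be a multiple of the VNNI factor")
    packed = []
    for kb in range(0, k, vnni):
        row = []
        for col in range(n):
            # Group the vnni values that share a column next to each other.
            for i in range(vnni):
                row.append(b[kb + i][col])
        packed.append(row)
    return packed


def matmul_with_vnni_b(a, b_packed, vnni=2):
    """Reference matmul that consumes B in VNNI layout.

    Emulates the pairwise dot-product accumulation in ordinary Python
    floats (illustration only, no BF16 rounding).
    """
    m, n = len(a), len(b_packed[0]) // vnni
    c = [[0.0] * n for _ in range(m)]
    for row in range(m):
        for kb, packed_row in enumerate(b_packed):
            for col in range(n):
                for i in range(vnni):
                    c[row][col] += a[row][kb * vnni + i] * packed_row[col * vnni + i]
    return c


if __name__ == "__main__":
    b = [[0, 1], [2, 3], [4, 5], [6, 7]]   # K = 4, N = 2
    print(pack_b_vnni(b))                   # two rows of four interleaved values
    print(matmul_with_vnni_b([[1, 2, 3, 4]], pack_b_vnni(b)))
```

If B is fed to the tile multiply in plain row-major order instead, each pair is read from the wrong rows, which is the kind of mismatch a layout fix like this one addresses.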

Properly pack B matrix into VNNI format with interleaved rows for correct AMX BF16 tile operations.
This directory was added accidentally and should not be part of the project.
…nalysis in AMX matmul test for precision loss comparison