@EzraReiss commented on Sep 16, 2025

Description

These folders contain the work Juhyoung and Ezra completed over the summer. We designed an optimized multi-head attention (MHA) kernel using Allo's scheduling primitives.
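
Below is a minimal sketch of that pattern, assuming the public Allo Python API (`allo.customize`, `allo.grid`, and schedule primitives such as `pipeline`). The kernel name, sizes, and schedule here are illustrative, not the actual kernel in this PR:

```python
import allo
import numpy as np
from allo.ir.types import float32

L, D = 64, 64  # illustrative sequence length and head dimension

def attn_scores(Q: float32[L, D], K: float32[L, D]) -> float32[L, L]:
    # S = Q @ K^T, the score matrix at the core of multi-head attention
    S: float32[L, L] = 0.0
    for i, j, k in allo.grid(L, L, D):
        S[i, j] += Q[i, k] * K[j, k]
    return S

# The schedule is applied separately from the algorithm definition.
s = allo.customize(attn_scores)
s.pipeline("j")  # pipeline loop j; HLS backends unroll loops nested under it
mod = s.build()  # CPU simulation by default; HLS targets exist for FPGA flows

# Quick numerical check against NumPy.
q = np.random.rand(L, D).astype(np.float32)
k = np.random.rand(L, D).astype(np.float32)
np.testing.assert_allclose(mod(q, k), q @ k.T, rtol=1e-4)
```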

Problems

Add the LLM work we've done to the repository.

Proposed Solutions

Created an optimized attention kernel for FPGA architectures.

Examples

These are Allo examples that can be reused in the future or integrated with kernels for other transformer architectures.
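
As a sketch of that kind of reuse (same assumptions as above, with `attn_scores` as defined in the earlier sketch), the algorithm can be re-customized with a different schedule without touching the kernel body:

```python
# Reuse the same kernel with a different, tiled schedule (illustrative).
s2 = allo.customize(attn_scores)  # attn_scores from the sketch above
s2.split("i", 8)                  # tile i into i.outer / i.inner
s2.reorder("i.outer", "j", "i.inner")
s2.unroll("i.inner")              # unroll the tile for spatial parallelism
mod2 = s2.build()
```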
