This is a small enhancement release of Tilus.
## Highlights
- Add more examples: flash attention with KV cache and flash linear attention decode (see the reference sketch after this list)
- Fix a bug where multiple Tilus processes concurrently access the dispatch table in the cache (a file-locking sketch follows this list)
- Add targets `sm_100`, `sm_103`, `sm_110`, `sm_120`, and `sm_121` (capability-lookup snippet below)
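For orientation, the new KV-cache attention example implements one decoding step: the query for the new token attends over all cached keys and values. Below is a minimal plain-PyTorch sketch of that computation, not the Tilus kernel itself; the function name and tensor shapes are assumptions for illustration (the fused flash linear attention decode example follows a different, linear-attention recurrence not shown here).

```python
import math
import torch

def decode_attention(q, k_cache, v_cache):
    """One decoding step of attention over a KV cache (reference sketch).

    q:       [batch, heads, 1, head_dim]    query for the new token
    k_cache: [batch, heads, seq, head_dim]  cached keys
    v_cache: [batch, heads, seq, head_dim]  cached values
    """
    # scaled dot-product scores against every cached position
    scores = q @ k_cache.transpose(-1, -2) / math.sqrt(q.size(-1))
    probs = torch.softmax(scores, dim=-1)
    # weighted sum of cached values -> [batch, heads, 1, head_dim]
    return probs @ v_cache
```

The Tilus example fuses these steps into a single kernel rather than materializing the score matrix in global memory.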
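The dispatch-table fix (#28) serializes writers to the on-disk cache so concurrent Tilus processes do not corrupt it. The underlying pattern is file locking; here is a minimal sketch using the third-party `filelock` package, with a hypothetical file name and helper function rather than the actual Tilus internals:

```python
import json
from pathlib import Path

from filelock import FileLock  # pip install filelock

def update_dispatch_table(cache_dir: Path, key: str, entry: dict) -> None:
    """Update a shared on-disk table safely across processes (illustrative)."""
    table_path = cache_dir / "dispatch_table.json"  # hypothetical file name
    lock = FileLock(str(table_path) + ".lock")
    with lock:  # only one process reads/modifies/writes at a time
        table = json.loads(table_path.read_text()) if table_path.exists() else {}
        table[key] = entry
        table_path.write_text(json.dumps(table))
```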
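The new target names follow the usual `sm_<major><minor>` compute-capability scheme. A small PyTorch-based snippet shows how a device maps to one of these names (an assumption for illustration; Tilus may detect the target on its own):

```python
import torch

# compute capability of the current GPU, e.g. (12, 0)
major, minor = torch.cuda.get_device_capability()
target = f"sm_{major}{minor}"  # e.g. "sm_120" for compute capability 12.0
print(target)
```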
## What's Changed
- [Docs] Update README.md by @yaoyaoding in #11
- [CI] Use RTX 4090 for docs building by @yaoyaoding in #12
- [Docs] Update README.md by @yaoyaoding in #13
- [Package] Rename to under @NVIDIA organization by @nekomeowww in #15
- [Docs] Update installation guide by @yaoyaoding in #17
- [CI] Fix concurrency issue by @yaoyaoding in #18
- [Docs] Correct gflops to tflops in examples by @YichengDWu in #19
- [Example] Add the attention example with kv-cache by @yaoyaoding in #21
- [Example] Add example for decoding kernel of flash linear attention by @yaoyaoding in #25
- [Example] Add a kernel in the flash linear attention by @yaoyaoding in #26
- [Example] Add the fused kernel for decoding of flash linear attention by @yaoyaoding in #27
- [Tuning] Add lock to cache dir when dumping the tuning result by @yaoyaoding in #28
- [Target] Add target properties by @yaoyaoding in #29
- [Bump] Bump version of hidet from 0.6.0 to 0.6.1 by @yaoyaoding in #30
## New Contributors
- @nekomeowww made their first contribution in #15
- @YichengDWu made their first contribution in #19
**Full Changelog**: v0.1...v0.1.1