diff --git a/README.md b/README.md index 37d2707e..c11368ab 100644 --- a/README.md +++ b/README.md @@ -1,23 +1,47 @@ -# Tilus: A Domain-Specific Language for High-Performance GPU Programming +# Tilus: A Tile-Level GPU Kernel Programming Language -**Tilus** is a domain-specific language (DSL) for GPU programming, designed with: +**Tilus** is a powerful domain-specific language (DSL) for GPU programming that offers: -* Thread-block-level granularity and tensors as the core data type -* Explicit control over shared memory and tensor layouts (unlike Triton) -* Support for low-precision types with arbitrary bit-widths +* **Thread-block-level granularity** with **tensors** as the primary data type. +* **Explicit control** over shared memory and register tensors (unlike Triton). +* **Low-precision types** with arbitrary bit-widths (1 to 8 bits). -Additional features include automatic tuning, caching, and a Pythonic interface for ease of use. +It also includes automatic tuning, caching, and a Pythonic interface for ease of use. -Tilus is proununced as tie-lus, /ˈtaɪləs/. +Tilus is pronounced as tie-lus, /ˈtaɪləs/. -Please cite the following paper if you use Tilus in your work: +## Getting Started + +### Installation +Install Tilus using `pip`: +``` +pip install tilus +``` + +### Usage + +To get started, refer to the [tutorials]() to learn how to program kernels with Tilus. + +You can also check more [examples](https://github.com/NVIDIA/tilus/tree/main/examples) of using Tilus. + +You can learn more on different topics in the [programming guide](). + +## Research +This project is based on the following research paper: ```bibtex @article{ding2025tilus, title={Tilus: A Virtual Machine for Arbitrary Low-Precision GPGPU Computation in LLM Serving}, - author={Ding, Yaoyao and Hou, Bohan and Zhang, Xiao and Lin, Allan and Chen, Tianqi and Hao, Cody Yu and Wang, Yida and Pekhimenko, Gennady}, + author={Ding, Yaoyao and Hou, Bohan and Zhang, Xiao and Lin, Allan and + Chen, Tianqi and Hao, Cody Yu and Wang, Yida and Pekhimenko, Gennady}, journal={arXiv preprint arXiv:2504.12984}, year={2025} } ``` +## Acknowledgement +We would like to acknowledge the following projects for their influence on Tilus's design and development: +- **Hidet**: We take Hidet IR as our low-level target and reuse its runtime system. +- **TVM**: Hidet's initial IR was adopted from TVM, and we also learned a lot from TVM on how to build a compiler. +- **Triton**: The core idea of defining kernels at a thread-block level and working with tiles was inspired by Triton. +- **Hexcute**: We adopted the idea of using automatic layout inference to simplify programming from Hexcute.