Merged
42 changes: 33 additions & 9 deletions README.md
@@ -1,23 +1,47 @@
-# Tilus: A Domain-Specific Language for High-Performance GPU Programming
+# Tilus: A Tile-Level GPU Kernel Programming Language

-**Tilus** is a domain-specific language (DSL) for GPU programming, designed with:
+**Tilus** is a powerful domain-specific language (DSL) for GPU programming that offers:

-* Thread-block-level granularity and tensors as the core data type
-* Explicit control over shared memory and tensor layouts (unlike Triton)
-* Support for low-precision types with arbitrary bit-widths
+* **Thread-block-level granularity** with **tensors** as the primary data type.
+* **Explicit control** over shared memory and register tensors (unlike Triton).
+* **Low-precision types** with arbitrary bit-widths (1 to 8 bits).

-Additional features include automatic tuning, caching, and a Pythonic interface for ease of use.
+It also includes automatic tuning, caching, and a Pythonic interface for ease of use.

-Tilus is proununced as tie-lus, /ˈtaɪləs/.
+Tilus is pronounced as tie-lus, /ˈtaɪləs/.

-Please cite the following paper if you use Tilus in your work:
+## Getting Started
+
+### Installation
+Install Tilus using `pip`:
+```
+pip install tilus
+```
+
+### Usage
+
+To get started, refer to the [tutorials]() to learn how to program kernels with Tilus.
+
+You can also check more [examples](https://github.com/NVIDIA/tilus/tree/main/examples) of using Tilus.
+
+You can learn more on different topics in the [programming guide]().
+
+## Research
+This project is based on the following research paper:

 ```bibtex
 @article{ding2025tilus,
 title={Tilus: A Virtual Machine for Arbitrary Low-Precision GPGPU Computation in LLM Serving},
-author={Ding, Yaoyao and Hou, Bohan and Zhang, Xiao and Lin, Allan and Chen, Tianqi and Hao, Cody Yu and Wang, Yida and Pekhimenko, Gennady},
+author={Ding, Yaoyao and Hou, Bohan and Zhang, Xiao and Lin, Allan and
+Chen, Tianqi and Hao, Cody Yu and Wang, Yida and Pekhimenko, Gennady},
 journal={arXiv preprint arXiv:2504.12984},
 year={2025}
 }
 ```

+## Acknowledgement
+We would like to acknowledge the following projects for their influence on Tilus's design and development:
+- **Hidet**: We take Hidet IR as our low-level target and reuse its runtime system.
+- **TVM**: Hidet's initial IR was adopted from TVM, and we also learned a lot from TVM on how to build a compiler.
+- **Triton**: The core idea of defining kernels at a thread-block level and working with tiles was inspired by Triton.
+- **Hexcute**: We adopted the idea of using automatic layout inference to simplify programming from Hexcute.