Welcome to the Ghost Threading Compiler project!
This repository contains the source code for an LLVM-based compiler that enables Ghost Threading — a software-only prefetching mechanism that utilizes idle Simultaneous Multithreading (SMT) contexts to launch lightweight helper threads. The technique is described in the paper:
Ghost Threading: Helper-Thread Prefetching for Real Systems
Yuxin Guo, Alexandra W. Chadwick, Márton Erdős, Utpal Bora, Akshay Bhosale, Giacomo Gabrielli and Timothy M. Jones
International Symposium on Microarchitecture (MICRO)
October 2025
Please cite this paper if you produce any work that uses this repository.
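To make the idea of helper-thread prefetching concrete, here is a minimal hand-written sketch in C. It is not the code this compiler generates, and it does not implement the synchronization scheme from the paper. The names PrefetchJob, helper, sumWithHelper, and the Ahead run-ahead distance are all hypothetical. The sketch also does not pin the helper to the sibling SMT context, which a real setup would need to do.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

// Hypothetical bookkeeping shared by the main loop and the helper thread.
typedef struct {
  int *const *Data;       // pointers to the values the main loop will load
  size_t Length;          // number of entries in Data
  _Atomic size_t MainPos; // progress published by the main loop
  size_t Ahead;           // how far the helper may run ahead (assumed knob)
} PrefetchJob;

// Helper thread: issues prefetches for entries the main loop has not reached.
static void *helper(void *Arg) {
  PrefetchJob *Job = Arg;
  size_t Next = 0;
  while (Next < Job->Length) {
    size_t Main = atomic_load_explicit(&Job->MainPos, memory_order_relaxed);
    if (Next < Main)
      Next = Main; // the helper fell behind, so skip stale iterations
    size_t Limit = Main + Job->Ahead;
    if (Limit > Job->Length)
      Limit = Job->Length;
    for (; Next < Limit; Next++)
      __builtin_prefetch(Job->Data[Next]); // hint only, never faults
  }
  return NULL;
}

// Main loop: sums the pointed-to values while the helper warms the cache.
long sumWithHelper(int *const *Data, size_t Length, size_t Ahead) {
  PrefetchJob Job = {.Data = Data, .Length = Length, .Ahead = Ahead};
  atomic_init(&Job.MainPos, 0);
  pthread_t Thread;
  int Started = pthread_create(&Thread, NULL, helper, &Job) == 0;
  long Sum = 0;
  for (size_t i = 0; i < Length; i++) {
    Sum += *Data[i];
    // Publish progress so the helper stays a bounded distance ahead.
    atomic_store_explicit(&Job.MainPos, i + 1, memory_order_relaxed);
  }
  if (Started)
    pthread_join(Thread, NULL);
  return Sum;
}
```

The result is the same with or without the helper, because the helper only issues prefetch hints. Publishing progress on every iteration is the simplest choice for a sketch and is likely too costly in practice.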
Consult the Getting Started with LLVM page for information on building and running LLVM.
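You can install the required dependencies using one of the following sets of commands, depending on your package manager.
With apt-get (Debian/Ubuntu):
sudo apt-get update
sudo apt-get install python3 llvm clang lld ninja-build cmake
With Homebrew (for example on macOS):
brew install llvm lld python ninja cmake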
- Clone the repository:
git clone https://github.com/CompArchCam/GhostThreadingCompiler.git
- Build the compiler
cd GhostThreadingCompiler
mkdir build && cd build
export LLVMDIR="/llvm/install/path"
cmake -G Ninja \
-DCMAKE_INSTALL_PREFIX="${LLVMDIR}" \
-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_OPTIMIZED_TABLEGEN=On \
-DLLVM_ENABLE_PROJECTS="clang;lld;openmp" \
-DLLVM_TARGETS_TO_BUILD="X86" \
-DLLVM_PARALLEL_COMPILE_JOBS=6 \
-DLLVM_PARALLEL_LINK_JOBS=4 \
-DLLVM_USE_LINKER=lld \
../llvm
ninja install
export PATH="${LLVMDIR}/bin:$PATH"
Once installed, you can verify the installation by running the following commands:
opt --help-hidden | grep ghostthreading
clang --version
The first command should list the ghostthreading option(s) known to opt, and the second should print the version of the compiler that you have installed.
The compiler is pragma-driven: annotate the loop with the ghost_threading pragma and mark an expensive memory load inside it with a prefetch intrinsic, as shown below:
// test.c
#include <stdio.h>
extern int foo(int);
extern int **Data;
extern unsigned Length;
int main() {
  long int Sum = 0;
  #pragma ghost_threading sync_frequency(14) skip_iter(8) serial_max_threshold(100) serial_min_threshold(10)
  for (unsigned i = 0; i < Length; i++) {
    __builtin_prefetch(Data[i+64]);
    Sum += foo(*Data[i]);
  }
  printf("Sum %ld\n", Sum);
}
The hyper-parameters are described in detail in the paper and must be tuned for each workload and target machine to achieve optimal performance.
The Ghost Threading pass is enabled by default, but it can be enabled or disabled explicitly with the command-line flag -ghostthreading=[true|false]. Because this is an LLVM option, pass it to clang through -mllvm:
clang -O3 -w -Wall -std=c11 \
-mtune=native -march=native \
-mllvm -ghostthreading=true \
test.c -o test.out
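Note that test.c only declares foo, Data, and Length as extern, so it needs definitions from another translation unit before it can link. The following is a minimal stand-in, assuming a hypothetical file name data.c, an illustrative element count, and a trivial foo; none of these come from the paper or the benchmark repository. It pads the pointer array by 64 entries because the loop in test.c reads Data[i+64] on every iteration, including the last ones.

```c
// data.c (hypothetical companion to test.c, for a functional smoke test only)

#define NUM_ELEMENTS 1024u     // illustrative size, not taken from the paper
#define PREFETCH_DISTANCE 64u  // matches the Data[i+64] access in test.c

int **Data;
unsigned Length;

// Trivial stand-in for the real per-element work.
int foo(int Value) { return Value + 1; }

// Runs before main() (a GCC/Clang extension), so test.c needs no changes.
__attribute__((constructor)) static void initData(void) {
  static int Values[NUM_ELEMENTS];
  // Extra entries keep the read of Data[i+64] in bounds near the end.
  static int *Pointers[NUM_ELEMENTS + PREFETCH_DISTANCE];
  for (unsigned i = 0; i < NUM_ELEMENTS; i++) {
    Values[i] = (int)i;
    Pointers[i] = &Values[i];
  }
  // Padding entries point at a valid element so the prefetch address is harmless.
  for (unsigned i = NUM_ELEMENTS; i < NUM_ELEMENTS + PREFETCH_DISTANCE; i++)
    Pointers[i] = &Values[NUM_ELEMENTS - 1];
  Data = Pointers;
  Length = NUM_ELEMENTS;
}
```

With this file in place, adding data.c next to test.c on the clang command line should let the example link. The access pattern here is sequential and cache-friendly, so it only checks that the toolchain works; it is not expected to show any speedup from Ghost Threading.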
The benchmarks used to evaluate this automated technique can be fetched from the repository ghost-threading-bmk. These are workloads from gap and htpf, as described in the aforementioned MICRO paper.
The scripts to compile and execute the baseline, automatic Ghost Threading, and manual Ghost Threading techniques can be found in the directory workdir. The script config.sh sets the relevant flags for each of the techniques.
This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), grant EP/W00576X/1, and Arm. Additional data related to this publication is available in the repository at url.
We welcome contributions from the community! If you want to improve the project or add new features, follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-name).
- Implement your feature or bug fix.
- Run the benchmarks and ensure all tests pass.
- Commit your changes and push them to your fork.
- Create a pull request describing your changes.