With terabyte-scale memory capacity and memory-intensive workloads, memory translation has become a major performance bottleneck. Many novel hardware schemes have been developed to speed up memory translation, but few have been evaluated with commodity OSes. A main reason is that memory management in major OSes, like Linux, does not have the extensibility to empower emerging hardware schemes.
We develop EMT, a pragmatic framework atop Linux to empower different hardware schemes of memory translation such as radix tree and hash table. EMT provides an architecture-neutral interface that 1) supports diverse memory translation architectures, 2) enables hardware-specific optimizations, 3) accommodates modern hardware and OS complexity, and 4) has negligible overhead over hardwired implementations. We port Linux’s memory management onto EMT and show that EMT enables extensibility without sacrificing performance. We use EMT to implement OS support for ECPT and FPT, two recent experimental translation schemes for fast translation; EMT enables us to understand the OS perspective of these architectures and further optimize their designs.
- emt-linux, Linux kernel implementation including support for x86-radix, ECPT, and FPT
- qemu-emt, QEMU emulation tool to test and evaluate architectures
- dynamorio, memory simulator for performance evaluation
- VM-Bench, benchmark repo for all benchmarks
- collect_data.sh, one-click script to collect simulation data.
- analyze_data.sh, one-click script to generate plots. It uses ecpt_unified.py, ipc_with_inst.py, and kern-inst-breakdown-with-khuge-unified.py.
- Simulation experiment: We reserved machines on CloudLab. You can join the project AE25 and find the experiment OSDI2025-EMT-AE.
The following instructions were tested on Ubuntu 22.04.
git clone https://github.com/xlab-uiuc/EMT-OSDI-AE.git
cd EMT-OSDI-AE
./setup/install_dependency.sh
PS: ./setup/install_dependency.sh installs the packages needed for compilation and the Docker environment. Please log in again or run
newgrp docker
to make the docker group active.
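As a quick sanity check before building, the following sketch tests whether the docker group is already active in the current shell. The helper name in_group is illustrative and is not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): succeed if the current
# process's groups include the group named in $1.
in_group() {
    id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

if in_group docker; then
    echo "docker group is active in this shell"
else
    echo "docker group is not active yet; log in again or run: newgrp docker"
fi
```

If the second message appears after running the dependency script, the group change has not yet reached this shell.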
# Clone QEMU repo and build QEMU with x86-radix softmmu
./setup/setup_qemu_radix.sh
The compiled binary is at qemu-radix/build/qemu-system-x86_64.
# Clone Linux repo and build EMT-Linux with x86-radix MMU driver
./setup/setup_linux_radix.sh
You can find the Linux folder at emt-linux-radix.
To run Linux with QEMU, you need a filesystem image.
We prepared an image that contains precompiled benchmark suites (from VM-Bench).
We have already uploaded the image to /proj/ae25-PG0/EMT-images/image_record_loading.ext4.xz.
Run the following commands to copy and decompress it. (Please make sure you are still in the EMT-OSDI-AE root folder.)
# copy and decompress the image.
cp /proj/ae25-PG0/EMT-images/image_record_loading.ext4.xz .
unxz image_record_loading.ext4.xz
The following script sets up the dynamorio and VM-Bench repos.
They contain the simulator and the instruction trace analyzer.
./setup/setup_simulator.sh
Run a graphbig benchmark with EMT-Linux and generate the analysis files. The output includes instruction and memory traces, which can be up to 150GB. The script compresses the traces at the end to save space. Please select a disk drive with at least 200GB of free space.
Note: If you are on a CloudLab machine, the home directory will likely not have enough space to finish the workload.
We provide a guide to mount an extra disk under the directory /data/EMT.
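Before starting a run, it can help to confirm that the output directory really has the required free space. The sketch below uses df for that; the helper names free_gib and has_space are illustrative, and the 200 threshold is read as GiB, which only approximates the 200GB figure above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of EMT-OSDI-AE).

# Print the free space, in whole GiB, of the filesystem holding directory $1.
free_gib() {
    df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed if directory $1 exists and has at least $2 GiB free.
has_space() {
    avail=$(free_gib "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# /data/EMT is the output directory used in this guide.
has_space /data/EMT 200 || echo "warning: /data/EMT is missing or has less than 200 GiB free"
```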
Note that the script assumes the image is located at ../image_record_loading.ext4.
Please change IMAGE_PATH_PREFIX in ./run_bench.sh accordingly if you have renamed the image.
cd emt-linux-radix
# dry run to print the command to execute.
# Double check architecture, thp config, image path, output directory
./run_bench.sh --arch radix --thp never --out /data/EMT --dry
# real run
./run_bench.sh --arch radix --thp never --out /data/EMT
The script will run and generate analysis files in /data/EMT/radix/running.
The file with the suffix kern_inst.folded.high_level.csv contains the kernel instruction distribution.
The file ending with bin.dyna_asplos_smalltlb_config_realpwc.log is the simulator result from DynamoRIO; the final simulation result is in the file ending with dyna_asplos_smalltlb_config_realpwc.log.ipc.csv.
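Since the result files are identified by suffix, a small find wrapper can locate them. This is a minimal sketch; find_by_suffix is an illustrative name, and the directory is the radix output path from the run above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): list files under directory $1
# whose names end with suffix $2.
find_by_suffix() {
    find "$1" -type f -name "*$2" 2>/dev/null | sort
}

# Kernel instruction distribution and final IPC results for the radix run.
find_by_suffix /data/EMT/radix/running kern_inst.folded.high_level.csv
find_by_suffix /data/EMT/radix/running dyna_asplos_smalltlb_config_realpwc.log.ipc.csv
```

The same call with /data/EMT/ecpt/running should work for the ECPT run, assuming it produces files with the same suffixes.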
Note that if you have just finished running the commands above, please return to the root folder (EMT-OSDI-AE).
# Clone QEMU repo and build with ECPT softmmu support
./setup/setup_qemu_ecpt.sh
The compiled binary is at qemu-ecpt/build/qemu-system-x86_64.
The name qemu-system-x86_64 might be confusing here, but we have changed QEMU's x86_64 implementation to support ECPT, so the build still targets x86_64 but uses ECPT as the address translation method.
The following script sets up the Linux directory at emt-linux-ecpt.
# Clone Linux repo and build EMT-Linux with ECPT MMU driver
./setup/setup_linux_ecpt.sh
Again, we need a filesystem image to run. We can reuse the image from the previous section.
cd emt-linux-ecpt
# dry run to print the command to execute.
# Double check architecture, thp config, image path, output directory
./run_bench.sh --arch ecpt --thp never --out /data/EMT --dry
# real run
./run_bench.sh --arch ecpt --thp never --out /data/EMT
You can find similar files in /data/EMT/ecpt/running.
# Clone QEMU repo and build with FPT softmmu support
./setup/setup_qemu_fpt.sh
Output folder: qemu-fpt.
# Clone Linux repo and build EMT-Linux with FPT (L4L3 L2L1) MMU driver
./setup/setup_linux_fpt_L4L3L2L1.sh
Output folder: emt-linux-fpt-L4L3L2L1.
cd emt-linux-fpt-L4L3L2L1
# dry run to print the command to execute.
# Double check architecture, thp config, image path, output directory
./run_bench.sh --arch fpt --flavor L4L3_L2L1 --thp never --out /data/EMT --dry
# real run
./run_bench.sh --arch fpt --flavor L4L3_L2L1 --thp never --out /data/EMT
By default, FPT runs with L4L3 and L2L1 flattened. If you wish to try L3L2 folding instead, build the corresponding kernel:
# Clone Linux repo and build EMT-Linux with FPT (L3L2) MMU driver
./setup/setup_linux_fpt_L3L2.sh
Output folder: emt-linux-fpt-L3L2.
Then run the benchmark with
# real run
cd emt-linux-fpt-L3L2
./run_bench.sh --arch fpt --flavor L3L2 --thp never --out /data/EMT
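The per-architecture invocations above differ only in the build directory, the --arch value, and the optional --flavor. As a sketch, the helper below prints the dry-run command for each configuration so they can be reviewed side by side; dry_run_cmd is an illustrative name, and it only prints commands, it does not execute them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): print the dry-run command for
# build directory $1, architecture $2, and optional FPT flavor $3.
dry_run_cmd() {
    dir=$1
    arch=$2
    flavor=${3:-}
    cmd="cd $dir && ./run_bench.sh --arch $arch"
    if [ -n "$flavor" ]; then
        cmd="$cmd --flavor $flavor"
    fi
    echo "$cmd --thp never --out /data/EMT --dry"
}

# The four configurations built in this guide.
dry_run_cmd emt-linux-radix radix
dry_run_cmd emt-linux-ecpt ecpt
dry_run_cmd emt-linux-fpt-L4L3L2L1 fpt L4L3_L2L1
dry_run_cmd emt-linux-fpt-L3L2 fpt L3L2
```

Each printed line is meant to be run from the EMT-OSDI-AE root folder.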
We aim to validate the following claims:
- EMT-ECPT will lead to more (> 1.2x) kernel overhead in both the 4KB and THP setups.
- EMT-ECPT will have a positive (> 1.0x) page walk latency speedup.
- EMT-ECPT will have a positive (> 1.0x) IPC speedup.
- EMT-ECPT will have a positive but limited (> 1.0x but < 1.1x) total cycle reduction.
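The claims above are threshold checks on ratios, so they can be tested mechanically once the ratios have been read from the generated statistics. The sketch below shows one way to do that with awk; in_range is an illustrative helper, and the numbers passed to it are made-up placeholders, not measured results.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): succeed if $1 is strictly
# greater than $2 and, when $3 is given, strictly less than $3.
in_range() {
    awk -v v="$1" -v lo="$2" -v hi="${3:-}" \
        'BEGIN { ok = (v + 0 > lo + 0) && (hi == "" || v + 0 < hi + 0); exit !ok }'
}

# Placeholder ratios for illustration only; substitute values from your own run.
in_range 1.35 1.2     && echo "kernel overhead ratio is above 1.2x"
in_range 1.05 1.0 1.1 && echo "total cycle reduction is between 1.0x and 1.1x"
```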
Because simulating all the benchmarks takes a long time, we provide a script to run six representative benchmarks: graphbig_bfs, graphbig_dfs, graphbig_dc, graphbig_sssp, gups, and redis.
It collects data for two architectures (radix/ECPT) at two THP configurations (4KB/THP) and two application stages (running/loading).
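To make the size of the experiment concrete, the sketch below enumerates the combinations described above. It assumes the 4KB and THP configurations correspond to the never and always settings, and it does not claim that collect_data.sh launches one separate run per line; list_configs is an illustrative name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): print one line per
# combination of benchmark, architecture, THP setting, and application stage.
list_configs() {
    for bench in graphbig_bfs graphbig_dfs graphbig_dc graphbig_sssp gups redis; do
        for arch in radix ecpt; do
            for thp in never always; do
                for stage in running loading; do
                    echo "$bench $arch $thp $stage"
                done
            done
        done
    done
}

# 6 benchmarks * 2 architectures * 2 THP settings * 2 stages = 48 data points.
list_configs | wc -l
```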
Note (MUST READ):
- The following experiment will take about 3 - 4 days to finish, so please run it ahead of time.
- Please run it inside tmux to prevent the script from being killed.
- Please make sure /data is mounted with at least 500GB of space.
One-click run:
# Usage: ./collect_data.sh [--dry] [--out <destination>]
./collect_data.sh
You can use the --dry flag to see the commands without executing them; you can also use --out to specify a custom directory to store the data.
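Because the collection runs for days, it is worth confirming that the shell is actually inside tmux before launching it. tmux exports the TMUX environment variable inside its sessions, which the sketch below relies on; in_tmux and the session name emt are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): succeed if this shell is
# running inside a tmux session (tmux sets $TMUX for its child processes).
in_tmux() {
    [ -n "${TMUX:-}" ]
}

if in_tmux; then
    echo "inside tmux; safe to start ./collect_data.sh"
else
    echo "not inside tmux; start a session first, e.g.: tmux new -s emt"
fi
```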
By default, benchmarks are executed sequentially. If your system has sufficient CPU, memory, and storage resources, you may modify the script to introduce parallelism for faster data collection.
We provide a script to analyze the collected benchmark data.
# Usage: ./analyze_data.sh [--dry] [--thp never|always|all] [--input <source directory>] [--ipc_stats <pos>] [--inst_stats <pos>] [--graph <pos>]
./analyze_data.sh
Optional flags:
- --dry: Print the commands that would be executed without actually running them.
- --thp: Specify which THP configuration(s) to analyze. Use all to process both.
- --input: Set the root directory containing raw benchmark results (default: /data/EMT).
- --ipc_stats: Customize the output directory for IPC statistics (default: ./ipc_stats).
- --inst_stats: Customize the output directory for instruction statistics (default: ./inst_stats).
- --graph: Set the destination folder for all generated plots (default: ./graph).
This script generates statistics for the benchmarks in CSV format and visualizes them.
You can find the plots under the ./graph directory.
- ./graph/kern_inst_unified_never.pdf corresponds to Figure 16 a).
- ./graph/kern_inst_unified_always.pdf corresponds to Figure 16 a).
- ./graph/ecpt_pgwalk_never.svg corresponds to Figure 20 a).
- ./graph/ecpt_ipc_never.svg corresponds to Figure 20 b).
- ./graph/ecpt_e2e_never.svg corresponds to Figure 20 c).
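After the analysis finishes, the sketch below reports which of the plots listed above are missing from the output folder. The file names come from this guide; missing_plots is an illustrative helper, and it defaults to ./graph.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMT-OSDI-AE): print each expected plot that
# is absent from directory $1 (default: ./graph). Prints nothing if all exist.
missing_plots() {
    dir=${1:-./graph}
    for f in kern_inst_unified_never.pdf kern_inst_unified_always.pdf \
             ecpt_pgwalk_never.svg ecpt_ipc_never.svg ecpt_e2e_never.svg; do
        [ -e "$dir/$f" ] || echo "$dir/$f"
    done
}

missing_plots ./graph
```

If a custom --graph folder was passed to analyze_data.sh, pass the same folder to the helper.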