All code and data used in the HyperMR paper are open-sourced here.
- Required system configuration: Ubuntu 18.04.5 LTS, Linux Kernel 4.15.0-167-generic.
- Required libraries and tools: g++ (>= 9), libboost-all-dev (> 1.69)
- Based on open-source tools: KaHyPar
All data can be found in the test/datasets directory.
The SuiteSparse Matrix Collection (formerly the University of Florida Sparse Matrix Collection) is a widely used set of sparse matrix benchmarks drawn from a broad range of applications. Because the sparse matrix files are too large to host directly, the GitHub repository contains only their compressed archives; the matrix files required for the experiments can be generated with the following shell command.
bash decompression.sh
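After decompression, each dataset is a Matrix Market (.mtx) file in coordinate format. As a quick sanity check, a minimal stdlib-only reader can be sketched as follows; it assumes the common `matrix coordinate real general` layout (for symmetric or complex variants, SciPy's `scipy.io.mmread` is the standard tool).

```python
def read_mtx_coo(path):
    """Parse a Matrix Market coordinate file into (n_rows, n_cols, entries).

    entries is a list of (i, j, value) triples with 0-based indices.
    Sketch only: assumes the common 'coordinate real general' layout.
    """
    with open(path) as f:
        header = f.readline()
        if "coordinate" not in header:
            raise ValueError("not a coordinate-format Matrix Market file")
        # Skip comment lines beginning with '%'
        line = f.readline()
        while line.startswith("%"):
            line = f.readline()
        n_rows, n_cols, nnz = map(int, line.split())
        entries = []
        for _ in range(nnz):
            i, j, *val = f.readline().split()
            v = float(val[0]) if val else 1.0  # 'pattern' files omit values
            entries.append((int(i) - 1, int(j) - 1, v))  # 1-based -> 0-based
    return n_rows, n_cols, entries
```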
The SDSS astronomical images used in the experiments were obtained with SQL queries in SkyServer. We use the SQL statement "SELECT TOP 20 specobjid as name, ra, dec FROM SpecObj" to query 20 Spectro objects in SkyServer (SQL Search), and then use the Finding Chart tool to export all of these Spectro objects as JPEG images. In the Finding Chart tool, we set both the image height and width to 4096, the scale to 0.5, and enable the "Invert Image" option. To view all the Spectro objects in bulk, you can submit the same SQL statement through the Image List tool to get thumbnails of the Spectro objects. To save storage space, we upload only the raw astronomical image data; sdss.py then transforms the SDSS images into sparse matrices (.mtx files). Please use the following command to generate the datasets.
python sdss.py
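The JPEG decoding itself is handled inside sdss.py; the core idea, turning a mostly-dark grayscale image into a sparse matrix by storing only the pixels above a threshold, can be sketched in stdlib-only Python. The function name and threshold below are illustrative assumptions, not taken from sdss.py.

```python
def image_to_mtx(pixels, threshold=0):
    """Convert a 2-D grayscale array (list of lists) to Matrix Market text.

    Only pixels strictly above `threshold` become stored entries, which is
    what makes astronomical images (mostly dark sky) natural sparse matrices.
    Illustrative sketch only, not the actual sdss.py logic.
    """
    n_rows, n_cols = len(pixels), len(pixels[0])
    entries = [(i, j, v)
               for i, row in enumerate(pixels)
               for j, v in enumerate(row) if v > threshold]
    lines = ["%%MatrixMarket matrix coordinate real general",
             f"{n_rows} {n_cols} {len(entries)}"]
    # Matrix Market indices are 1-based
    lines += [f"{i + 1} {j + 1} {v}" for i, j, v in entries]
    return "\n".join(lines) + "\n"
```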
- Matrix-vector multiplication (MVM)
- Synthetic workloads
- Real-world workload (Gaussian Smoothing)
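Gaussian smoothing fits the MVM setting because each output pixel is a Gaussian-weighted average of its neighbours, i.e. a banded sparse matrix applied to the flattened image. A 1-D stdlib-only illustration (the kernel radius and sigma here are arbitrary choices, not the experiment's parameters):

```python
import math

def gaussian_matrix(n, sigma=1.0, radius=2):
    """Build the n x n banded matrix of a 1-D Gaussian smoothing filter.

    Row i holds normalized Gaussian weights for positions within `radius`
    of i, so multiplying by a signal vector performs the smoothing.
    """
    mat = []
    for i in range(n):
        weights = {j: math.exp(-((j - i) ** 2) / (2 * sigma ** 2))
                   for j in range(max(0, i - radius), min(n, i + radius + 1))}
        total = sum(weights.values())
        mat.append([weights.get(j, 0.0) / total for j in range(n)])
    return mat

def mvm(mat, vec):
    """Plain matrix-vector multiplication."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]
```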
Clone this repository, and then update the submodules of KaHyPar:
cd kahypar/external_tools
git clone [email protected]:google/googletest.git
git clone [email protected]:larsgottesbueren/WHFC.git
git clone [email protected]:pybind/pybind11.git
Compile KaHyPar:
cd ..
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=RELEASE && make -j
sudo make install
Update LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/
Build datasets:
cd HyperMR/test/datasets
bash decompression.sh
Compile HyperMR:
cd HyperMR
mkdir build && cd build
cmake .. && make -j
Run tests:
./HyperMR
Then enter the ID of the experiment you want to run. For example, entering 1 performs the MVM evaluation. Refer to the configuration files in test/config for the settings of each experiment.
HyperMR:
Experiment 1: MVM.
Experiment 2: Synthetic Workloads.
Experiment 3: Gaussian Smoothing.
Experiment 4: Baselines.
Please enter the experiment ID:
Note:
- Experiment results are saved in test/output and test/log. By default, the results are not saved (config: output-file: No), because the test/output directory already holds the reordered row and column orders produced by HyperMR for all experiment datasets.
- HyperMR iteratively searches for higher-quality solutions, so longer runtimes generally lead to better results. Consequently, a time limit must be imposed to obtain high-quality solutions within a defined period. To reproduce the experimental results, please use the time-limit configuration in test/config/time_limit_config.txt; the default time limit is -1.
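The time-limit mechanism can be pictured as a generic anytime loop: keep the best solution found so far and stop once the wall-clock budget is exhausted. The sketch below is a stand-in, not HyperMR's internals; in particular, treating a negative limit as "no cutoff" is an assumption mirrored from the -1 default.

```python
import time

def anytime_minimize(propose, cost, initial, time_limit, max_iters=1000):
    """Generic anytime loop: repeatedly propose a new solution, keep the
    cheapest one seen so far, and stop when the wall-clock budget runs out.

    A negative time_limit disables the cutoff and the loop falls back to
    max_iters (an illustrative convention, not HyperMR's implementation).
    """
    deadline = None if time_limit < 0 else time.monotonic() + time_limit
    best = initial
    best_cost = cost(initial)
    for _ in range(max_iters):
        if deadline is not None and time.monotonic() >= deadline:
            break
        candidate = propose(best)
        c = cost(candidate)
        if c < best_cost:
            best, best_cost = candidate, c
    return best, best_cost
```

Given more time (or iterations), the loop can only keep or improve its best solution, which is why longer runtimes yield better orderings.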