This file describes the code used for Sections 3 and 5 of the paper.
In Section 3, given the dataset and GNN performance metadata (in CSV format), we provide example code (msglasso_example.R) to run the sparse regression analysis, which is mainly taken from the R documentation provided by the authors of the original paper. We also provide the two CSV files used in the analysis (Data Properties.csv and Model Performance.csv), which contain metadata obtained from the GLI library.
The R package MSGLasso is required.
The file msglasso_example.R can be run directly in RStudio.
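Before running the regression, it can help to check what the two metadata files contain. The following is a minimal Python sketch for that purpose only; it is not part of the repository. It assumes each CSV file has a header row and sits in the current directory, and the helper name summarize_csv is hypothetical. It makes no assumption about the column names themselves.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file.

    Assumption: the first row of the file is a header row.
    """
    # "utf-8-sig" also handles files that start with a byte-order mark.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_rows = sum(1 for row in reader if row)  # ignore blank lines
    return header, n_rows


if __name__ == "__main__":
    # File names as given in this README.
    for name in ("Data Properties.csv", "Model Performance.csv"):
        if Path(name).exists():
            columns, n_rows = summarize_csv(name)
            print(f"{name}: {n_rows} rows, {len(columns)} columns")
            print("  columns:", columns)
        else:
            print(f"{name}: not found in the current directory")
```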
In Section 5, we use the GraphWorld toolbox to generate synthetic datasets with varying dataset properties and use the GLI library to obtain the GNNs' model performance. The pipeline can be reproduced via the Simulation_Study.ipynb notebook. The parameters used by the GraphWorld random dataset generator are included in the Appendix of the paper (and can also be seen in the notebook above). For the training and model details, please see Appendix B of the paper.
For better reproducibility, we also include the synthetic datasets used to produce the results in Tables 5-8 in the Appendix. In particular, for the Gini-Degree experiment (Table 5), we use the datasets in all_data_gini.zip; for the Average Degree experiment (Table 6), we use the datasets in all_data_deg.zip; for the Edge Homogeneity experiment (Table 7), we use the datasets in all_data_homo.zip; for the In-Feature Similarity / Feature Angular SNR experiment (Table 8), we use the datasets in all_data_var.zip.
The README files of the two libraries above contain sufficient information to install the dependent packages and set up the repositories. Our experiments do not require additional packages.
The notebook can be run in Google Colab. Note that we do not provide a data loader for the pre-generated datasets. The conclusions and discussion in the paper remain consistent if we use different random datasets generated with the same set of GraphWorld parameters.
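Since no data loader is provided, a first step toward using the pre-generated datasets is simply to unpack the archives and see what they contain. The sketch below does only that, using the Python standard library. The helper name extract_archive and the output directory "extracted" are hypothetical, and the internal layout of the archives is not assumed; parsing the extracted files is left to the reader.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, out_dir):
    """Extract a zip archive into out_dir and return the sorted member file names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(out_dir)
        # Keep regular files only; directory entries are skipped.
        members = [info.filename for info in archive.infolist() if not info.is_dir()]
    return sorted(members)


if __name__ == "__main__":
    # Archive names as given in this README.
    archives = (
        "all_data_gini.zip",
        "all_data_deg.zip",
        "all_data_homo.zip",
        "all_data_var.zip",
    )
    for name in archives:
        if Path(name).exists():
            # "extracted" is a hypothetical output location.
            files = extract_archive(name, Path("extracted") / Path(name).stem)
            print(f"{name}: {len(files)} files, first entries: {files[:3]}")
        else:
            print(f"{name}: not found in the current directory")
```

Each archive is unpacked into its own subdirectory so that files from different experiments do not overwrite one another.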
If you find this repo helpful for your research, please consider citing our paper below.
@article{li2024metadata,
title={A metadata-driven approach to understand graph neural networks},
author={Li, Ting Wei and Mei, Qiaozhu and Ma, Jiaqi},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}