A Metadata-Driven Approach to Understand Graph Neural Networks

This file describes the code used for Sections 3 and 5 of the paper.

Section 3: Multivariate Sparse Group Lasso

In Section 3, given the dataset and GNN performance metadata (in CSV format), we provide example code (msglasso_example.R) to run the sparse regression analysis. The code is mainly adapted from the R documentation provided by the authors of the original Multivariate Sparse Group Lasso paper. We also provide the two CSV files used in the analysis (Data Properties.csv and Model Performance.csv), which contain metadata obtained from the GLI library.
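
Before running the R script, it can help to confirm that the two CSV files load cleanly. The following is a minimal Python sketch, not part of this repository, that only reports the header and the number of data rows of each table. It assumes that the first row of each file is a header row; the helper name summarize_csv is hypothetical, and nothing is assumed about the column names.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (header, row_count) for a CSV file.

    Assumption: the first row is a header row. Nothing is assumed
    about the column names or their meaning.
    """
    # "utf-8-sig" reads plain UTF-8 and also strips a leading BOM if present.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # Count only non-empty rows after the header.
        row_count = sum(1 for row in reader if row)
    return header, row_count


if __name__ == "__main__":
    # File names as shipped in this repository (note the spaces).
    for name in ("Data Properties.csv", "Model Performance.csv"):
        if Path(name).exists():
            header, rows = summarize_csv(name)
            print(f"{name}: {len(header)} columns, {rows} data rows")
        else:
            print(f"{name}: not found in the current directory")
```

Run from the repository root, this prints one line per file. The regression analysis itself is still performed by msglasso_example.R.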

Requirements

The R package MSGLasso is required.

Run the Code

The file msglasso_example.R can be run directly in RStudio.

Section 5: Controlled Experiments

In Section 5, we use the GraphWorld toolbox to generate synthetic datasets with varying dataset properties and the GLI library to obtain GNN model performance. The pipeline can be reproduced with the Simulation_Study.ipynb notebook. The parameters used by the GraphWorld random dataset generator are listed in the Appendix of the paper (and can also be seen in the notebook). For training and model details, please see Appendix B of the paper.

For better reproducibility, we also include the synthetic datasets used to produce the results in Tables 5-8 in the Appendix. In particular, for the Gini-Degree experiment (Table 5), we use the datasets in all_data_gini.zip; for the Average Degree experiment (Table 6), we use the datasets in all_data_deg.zip; for the Edge Homogeneity experiment (Table 7), we use the datasets in all_data_homo.zip; and for the In-Feature Similarity / Feature Angular SNR experiment (Table 8), we use the datasets in all_data_var.zip.
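
No dataloader is provided for these archives, so a reasonable first step is to see what each one contains. The sketch below maps each experiment to its archive (names taken from the paragraph above) and lists the archive members with Python's standard zipfile module. The internal file layout of the archives is not documented here, so the code makes no assumption about it; ARCHIVES and list_members are hypothetical names used only for illustration.

```python
import zipfile
from pathlib import Path

# Experiment -> archive, as described in this README.
ARCHIVES = {
    "Gini-Degree (Table 5)": "all_data_gini.zip",
    "Average Degree (Table 6)": "all_data_deg.zip",
    "Edge Homogeneity (Table 7)": "all_data_homo.zip",
    "In-Feature Similarity / Feature Angular SNR (Table 8)": "all_data_var.zip",
}


def list_members(archive_path):
    """Return the sorted member names of a zip archive without extracting it."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(archive.namelist())


if __name__ == "__main__":
    for experiment, archive_name in ARCHIVES.items():
        if not Path(archive_name).exists():
            print(f"{experiment}: {archive_name} not found")
            continue
        members = list_members(archive_name)
        print(f"{experiment}: {len(members)} entries in {archive_name}")
        for member in members[:5]:  # show only the first few entries
            print("   ", member)
```

Listing the members first makes it easier to decide how to parse the files, since that depends on the format in which the datasets were saved.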

Requirements

The README files of the two libraries above (GraphWorld and GLI) contain sufficient information to install their dependencies and set up the repositories. Our experiments do not require any additional packages.

Run the Code

The notebook can be run in Google Colab. Note that we do not provide a dataloader for the pre-generated datasets. The conclusions and discussion in the paper remain consistent when different random datasets are generated with the same set of GraphWorld parameters.

Citation

If you find this repo helpful for your research, please consider citing our paper below.

@article{li2024metadata,
  title={A metadata-driven approach to understand graph neural networks},
  author={Li, Ting Wei and Mei, Qiaozhu and Ma, Jiaqi},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}
