Exact and Sampling Methods for Mining Higher-Order Motifs in Large Hypergraphs

This code implements the different algorithms for higher-order motif analysis proposed in the paper Exact and Sampling Methods for Mining Higher-Order Motifs in Large Hypergraphs.

Abstract

Network motifs are patterns of interactions occurring among a small set of nodes in a graph. They highlight fundamental aspects of the interplay between the topology and the dynamics of complex networks and have a wide range of real-world applications. Motif analysis has been extended to a variety of network models that allow for a richer description of the interactions of a system, including weighted, temporal, multilayer, and, more recently, higher-order networks. Generalizing network motifs to capture patterns of group interactions is not only interesting from the fundamental perspective of understanding complex systems, but also proposes unprecedented computational challenges. In this work, we focus on the problem of counting occurrences of sub-hypergraph patterns in very large higher-order networks. We show that, by directly exploiting higher-order structures, we speed up the counting process compared to applying traditional data mining techniques for network motifs. Moreover, by including hyperedge sampling techniques, computational complexity is further reduced at the cost of small errors in the estimation of motif frequency. We evaluate our algorithms on several real-world datasets describing face-to-face interactions, co-authorship and human communication. We show that our approximated algorithm not only allows to speed up the performance, but also to extract larger higher-order motifs beyond the computational limits of an exact approach.

Code organization

rand-motifs.py contains the implementation of the sampling algorithm proposed in the paper
motifs.py contains the implementation of the baseline algorithm proposed in the paper
motif2.py contains the implementation of the efficient algorithm proposed in the paper
utils.py contains some useful functions
loaders.py contains the loader for the datasets (see section below)
hypergraph.py contains the implementation of a data structure for hypergraphs in Python and the configuration model for hypergraphs proposed by Phil Chodrow

NB: as of now there are other files that are useful for paper visualizations and analyses, they will be removed at publication time.

Datasets

Please download the datasets here and extract the archive inside the main directory.

How to use custom datasets

If you wish to perform higher-order motif analysis on your own datasets, you should implement a custom loader function. This function should return a set of tuples. Each tuple represents an hyperedge, and will contain the ids of the nodes involved in a group interactions.

How to perform higher-order motif analysis

If you wish to experiment with the code, you can run analysis setting up the parameter N in the code, which specifies the order of the motifs to use for the analysis. At the moment, the only feasible orders for exact algorithms are N=3 and N=4. The parameter RUN_CONFIG_MODEL specifies the number of samples from the configuration model. Keep in mind that RUN_CONFIG_MODEL heavily affects the performance. A value between 10 and 20 already gives reliable results. For the sampling algorithm, the parameter S controls the number of hyperedges to sample from the hypergraphs. Values of S between 1000 and 10000 give good results for co-authorship data, values of S between 100 and 1000 give good results for social data. The sampling algorithm allows for the analysis of motifs of size N=3, N=4 and N=5.

Acknowledgments

We thank Phil Chodrow for the code in the file hypergraph.py

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.DS_Store		.DS_Store
README.md		README.md
hypergraph.py		hypergraph.py
loaders.py		loaders.py
metrics.py		metrics.py
motifs.py		motifs.py
motifs2.py		motifs2.py
rand-motifs.py		rand-motifs.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Exact and Sampling Methods for Mining Higher-Order Motifs in Large Hypergraphs

Abstract

Code organization

Datasets

How to use custom datasets

How to perform higher-order motif analysis

Acknowledgments

About

Releases

Packages

Languages

FraLotito/sampling-motifs

Folders and files

Latest commit

History

Repository files navigation

Exact and Sampling Methods for Mining Higher-Order Motifs in Large Hypergraphs

Abstract

Code organization

Datasets

How to use custom datasets

How to perform higher-order motif analysis

Acknowledgments

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages