Kaidi Cao, Phitchaya Mangpo Phothilimthana, Sami Abu-El-Haija, Dustin Zelle, Yanqi Zhou, Charith Mendis, Jure Leskovec, Bryan Perozzi
This is the PyTorch implementation of GST+EFD from the paper Learning Large Graph Property Prediction via Graph Segment Training.
The codebase is built on GraphGPS. Install the environment by following its instructions.
- MalNet. The split information for MalNet-Large is provided in the splits folder.
- TpuGraphs.
We provide several training examples with this repo:
python main.py --cfg configs/malnetlarge-GST.yaml
For the TpuGraphs dataset, download the data following the instructions here. By default, put the train/valid/test splits under the folder ./datasets/TPUGraphs/raw/npz/layout/xla/random. To run on other collections, modify source and search in tpu_graphs.py.
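The default layout above can be checked before training with a short script. This is a minimal sketch, not code from the repository: the helper name layout_split_dirs is hypothetical, and it assumes that source and search correspond to the last two components of the default path (xla and random) and that the splits live in subfolders named train, valid, and test.

```python
from pathlib import Path


# Hypothetical helper, not part of the repository.
# Assumption: `source` and `search` map to the last two path components,
# and each split is stored in its own subfolder.
def layout_split_dirs(root="./datasets/TPUGraphs", source="xla", search="random"):
    """Return the expected train/valid/test directories for one collection."""
    base = Path(root) / "raw" / "npz" / "layout" / source / search
    return {split: base / split for split in ("train", "valid", "test")}


dirs = layout_split_dirs()
for split, path in dirs.items():
    # Report whether each expected split directory exists on disk.
    status = "found" if path.is_dir() else "missing"
    print(f"{split}: {path} ({status})")
```

If a split is reported as missing, check that the downloaded files were placed under the folder named above, or that source and search in tpu_graphs.py match the collection you downloaded.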
You can train by invoking:
python main_tpugraphs.py --cfg configs/tpugraphs.yaml
Please change device from cuda to cpu in the yaml file if you want to try CPU-only training.
To evaluate on the TpuGraphs dataset, run:
python test_tpugraphs.py --cfg configs/tpugraphs.yaml
If memory is not sufficient, change batch_size to 1 during evaluation. Set cfg.train.ckpt_best to True to save the best validation model during training for further evaluation.
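For readers unfamiliar with the idea, keeping the best validation model generally means storing a checkpoint only when the validation score improves. The sketch below illustrates that logic in plain Python. It is an assumption about the general technique, not the repository's implementation: the class BestCheckpointTracker and the dictionary standing in for model weights are hypothetical.

```python
import copy


# Hypothetical illustration of "save the best validation model".
# Not the repository's code; a dict stands in for real model weights.
class BestCheckpointTracker:
    def __init__(self, higher_is_better=True):
        self.higher_is_better = higher_is_better
        self.best_score = None
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, val_score, model_state):
        """Keep a copy of model_state if val_score beats the best so far."""
        if self.best_score is None:
            improved = True
        elif self.higher_is_better:
            improved = val_score > self.best_score
        else:
            improved = val_score < self.best_score
        if improved:
            self.best_score = val_score
            self.best_epoch = epoch
            # Deep copy so later training steps do not mutate the snapshot.
            self.best_state = copy.deepcopy(model_state)
        return improved


tracker = BestCheckpointTracker(higher_is_better=True)
for epoch, score in enumerate([0.61, 0.70, 0.68]):
    tracker.update(epoch, score, {"weights": [epoch]})
print(tracker.best_epoch, tracker.best_score)
```

Whether a higher or lower score counts as better depends on the validation metric, so the flag would need to match the metric configured for the run.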
To create your own custom model, supply a configuration (e.g., by copying configs/tpugraphs.yaml) and set the attribute type (inside model) to a string that you register in network/custom_tpu_gnn.py.
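The link between the type string in the configuration and a registered model class can be pictured as a name-to-class lookup. The following is a self-contained stand-in for that pattern, written in plain Python so it runs without any dependencies. Everything in it is hypothetical (NETWORK_REGISTRY, register_network, MyTpuGNN, create_model, and the config keys dim_in and dim_out); the actual registration mechanism is the one used in network/custom_tpu_gnn.py and may differ in detail.

```python
# Stand-in registry for illustration only; not the repository's API.
NETWORK_REGISTRY = {}


def register_network(name):
    """Decorator that stores a model class under a string key."""
    def decorator(cls):
        if name in NETWORK_REGISTRY:
            raise KeyError(f"network '{name}' is already registered")
        NETWORK_REGISTRY[name] = cls
        return cls
    return decorator


@register_network("my_tpu_gnn")  # hypothetical name, referenced from the config
class MyTpuGNN:
    def __init__(self, dim_in, dim_out):
        self.dim_in = dim_in
        self.dim_out = dim_out


def create_model(cfg):
    """Look up the class named by cfg['model']['type'] and instantiate it."""
    model_type = cfg["model"]["type"]
    return NETWORK_REGISTRY[model_type](cfg["dim_in"], cfg["dim_out"])


# A dict stands in for the parsed yaml configuration.
cfg = {"model": {"type": "my_tpu_gnn"}, "dim_in": 128, "dim_out": 1}
model = create_model(cfg)
print(type(model).__name__)
```

The practical consequence is that the string under model.type must match the registered name exactly; a mismatch surfaces as a lookup error when the model is built.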
If you find our paper and repo useful, please cite:
@article{cao2023learning,
title={Learning Large Graph Property Prediction via Graph Segment Training},
author={Cao, Kaidi and Phothilimthana, Phitchaya Mangpo and Abu-El-Haija, Sami and Zelle, Dustin and Zhou, Yanqi and Mendis, Charith and Leskovec, Jure and Perozzi, Bryan},
journal={arXiv preprint arXiv:2305.12322},
year={2023}
}