Note that these results are for the models as implemented in kgcnn, and that training is not always done with optimal hyperparameters or splits when compared with the literature.
These tables are generated automatically from Keras history logs.
Model weights and training statistics plots are not uploaded to GitHub due to their file size.
Max. or Min. denotes the best test error observed for any epoch during training.
To show the overall best test error, run `python3 summary.py --min_max True`.
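
The difference between the final-epoch columns and the Max./Min. columns can be illustrated with a minimal sketch. It assumes each fold's log is a Keras-style history dictionary mapping a metric name to a per-epoch list. The metric key `val_mean_absolute_error`, the helper `summarize_folds`, and the use of the population standard deviation are assumptions made for illustration; this is not the actual logic of `summary.py`.

```python
from statistics import mean, pstdev


def summarize_folds(histories, key, best=min):
    """Return (final_mean, final_std, best_mean, best_std) over folds.

    histories: one dict per fold, mapping a metric name to a per-epoch list
               (the layout of a Keras History.history dict).
    key:       metric name to summarize (hypothetical in this sketch).
    best:      min for errors (Min. columns), max for scores (Max. columns).
    """
    final_values = [h[key][-1] for h in histories]   # value after the last epoch
    best_values = [best(h[key]) for h in histories]  # best value at any epoch
    return (mean(final_values), pstdev(final_values),
            mean(best_values), pstdev(best_values))


# Toy logs for three folds (made-up numbers, not results from the tables).
folds = [
    {"val_mean_absolute_error": [0.9, 0.5, 0.6]},
    {"val_mean_absolute_error": [0.8, 0.4, 0.4]},
    {"val_mean_absolute_error": [1.0, 0.7, 0.8]},
]
final_mean, final_std, best_mean, best_std = summarize_folds(
    folds, "val_mean_absolute_error", best=min)
print(f"MAE: {final_mean:.4f} ± {final_std:.4f} | "
      f"Min. MAE: {best_mean:.4f} ± {best_std:.4f}")
```

For an error metric the best-epoch mean can never exceed the final-epoch mean, which is why every Min. column is at most its final-epoch counterpart.
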
If not noted otherwise, we use a (fixed) random k-fold split for validation errors.
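
A fixed random k-fold split means the shuffle is seeded, so every model is evaluated on the same folds. The following is a minimal standard-library sketch of that idea; the helper `fixed_kfold_indices` and the seed value are hypothetical, and the actual training scripts may build their splits differently.

```python
import random


def fixed_kfold_indices(n_samples, n_splits=5, seed=42):
    """Split range(n_samples) into n_splits disjoint test folds.

    The shuffle is seeded, so repeated calls return identical folds.
    The seed value is an arbitrary, hypothetical choice.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing keeps fold sizes within one sample of each other.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        splits.append((sorted(train_idx), sorted(test_idx)))
    return splits


# 1128 is the number of ESOL compounds listed in this document.
splits = fixed_kfold_indices(n_samples=1128, n_splits=5)
for fold, (train_idx, test_idx) in enumerate(splits):
    print(fold, len(train_idx), len(test_idx))
```

Each sample appears in exactly one test fold, so the reported mean ± standard deviation is taken over five disjoint test sets.
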
ClinTox (MoleculeNet) consists of 1478 compounds as SMILES, covering drugs approved by the FDA and drugs that failed clinical trials for toxicity reasons. We use random 5-fold cross-validation. The first label, 'approved', is chosen as the target.
model | kgcnn | epochs | Accuracy | AUC(ROC) | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 50 | 0.9480 ± 0.0138 | 0.8297 ± 0.0568 | 0.9594 ± 0.0071 | 0.8928 ± 0.0301 |
GAT | 4.0.0 | 50 | 0.9480 ± 0.0070 | 0.8512 ± 0.0468 | 0.9561 ± 0.0077 | 0.8740 ± 0.0436 |
GATv2 | 4.0.0 | 50 | 0.9372 ± 0.0155 | 0.8587 ± 0.0754 | 0.9581 ± 0.0102 | 0.8915 ± 0.0539 |
GCN | 4.0.0 | 50 | 0.9432 ± 0.0155 | 0.8555 ± 0.0593 | 0.9574 ± 0.0082 | 0.8876 ± 0.0378 |
GIN | 4.0.0 | 50 | 0.9412 ± 0.0034 | 0.8066 ± 0.0636 | 0.9567 ± 0.0102 | 0.8634 ± 0.0482 |
GraphSAGE | 4.0.0 | 100 | 0.9412 ± 0.0073 | 0.8013 ± 0.0422 | 0.9547 ± 0.0076 | 0.8933 ± 0.0411 |
Schnet | 4.0.0 | 50 | 0.9277 ± 0.0102 | 0.6562 ± 0.0760 | 0.9392 ± 0.0125 | 0.7721 ± 0.0510 |
Cora Dataset of 19793 publications with 8710 sparse node attributes and 70 node classes. Here we use random 5-fold cross-validation on nodes.
model | kgcnn | epochs | Categorical accuracy | Max. Categorical accuracy |
---|---|---|---|---|
DMPNN | 4.0.0 | 300 | 0.2476 ± 0.1706 | 0.2554 ± 0.1643 |
GAT | 4.0.0 | 250 | 0.6157 ± 0.0071 | 0.6331 ± 0.0089 |
GATv2 | 4.0.0 | 1000 | 0.6211 ± 0.0048 | 0.6383 ± 0.0079 |
GCN | 4.0.0 | 300 | 0.6232 ± 0.0054 | 0.6307 ± 0.0061 |
GIN | 4.0.0 | 800 | 0.6263 ± 0.0080 | 0.6323 ± 0.0087 |
GraphSAGE | 4.0.0 | 600 | 0.6151 ± 0.0053 | 0.6431 ± 0.0027 |
Cora Dataset after Lu et al. (2003) of 2708 publications with 1433 sparse attributes and 7 node classes. Here we use random 5-fold cross-validation on nodes.
model | kgcnn | epochs | Categorical accuracy | Max. Categorical accuracy |
---|---|---|---|---|
DMPNN | 4.0.0 | 300 | 0.8357 ± 0.0156 | 0.8545 ± 0.0181 |
GAT | 4.0.0 | 250 | 0.8397 ± 0.0122 | 0.8512 ± 0.0147 |
GATv2 | 4.0.0 | 250 | 0.8331 ± 0.0104 | 0.8427 ± 0.0120 |
GCN | 4.0.0 | 300 | 0.8072 ± 0.0109 | 0.8497 ± 0.0149 |
GIN | 4.0.0 | 500 | 0.8279 ± 0.0170 | 0.8335 ± 0.0176 |
GraphSAGE | 4.0.0 | 500 | 0.8497 ± 0.0100 | 0.8741 ± 0.0115 |
ESOL consists of 1128 compounds as SMILES and their corresponding water solubility in log10(mol/L). We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [log mol/L] | RMSE [log mol/L] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
AttentiveFP | 4.0.0 | 200 | 0.4351 ± 0.0110 | 0.6080 ± 0.0207 | 0.4023 ± 0.0185 | 0.5633 ± 0.0328 |
CMPNN | 4.0.0 | 600 | 0.5276 ± 0.0154 | 0.7505 ± 0.0189 | 0.4681 ± 0.0107 | 0.6351 ± 0.0182 |
DGIN | 4.0.0 | 300 | 0.4434 ± 0.0252 | 0.6225 ± 0.0420 | 0.4247 ± 0.0180 | 0.5980 ± 0.0277 |
DMPNN | 4.0.0 | 300 | 0.4401 ± 0.0165 | 0.6203 ± 0.0292 | 0.4261 ± 0.0118 | 0.5968 ± 0.0211 |
EGNN | 4.0.0 | 800 | 0.4507 ± 0.0152 | 0.6563 ± 0.0370 | 0.4209 ± 0.0129 | 0.5977 ± 0.0444 |
GAT | 4.0.0 | 500 | 0.4818 ± 0.0240 | 0.6919 ± 0.0694 | 0.4550 ± 0.0230 | 0.6491 ± 0.0591 |
GATv2 | 4.0.0 | 500 | 0.4598 ± 0.0234 | 0.6650 ± 0.0409 | 0.4372 ± 0.0217 | 0.6217 ± 0.0450 |
GCN | 4.0.0 | 800 | 0.4613 ± 0.0205 | 0.6534 ± 0.0513 | 0.4405 ± 0.0277 | 0.6197 ± 0.0602 |
GIN | 4.0.0 | 300 | 0.5369 ± 0.0334 | 0.7954 ± 0.0861 | 0.4967 ± 0.0159 | 0.7332 ± 0.0647 |
GNNFilm | 4.0.0 | 800 | 0.4854 ± 0.0368 | 0.6724 ± 0.0436 | 0.4669 ± 0.0317 | 0.6488 ± 0.0370 |
GraphSAGE | 4.0.0 | 500 | 0.4874 ± 0.0228 | 0.6982 ± 0.0608 | 0.4774 ± 0.0239 | 0.6789 ± 0.0521 |
HamNet | 4.0.0 | 400 | 0.5479 ± 0.0143 | 0.7417 ± 0.0298 | 0.5109 ± 0.0112 | 0.7008 ± 0.0241 |
HDNNP2nd | 4.0.0 | 500 | 0.7857 ± 0.0986 | 1.0467 ± 0.1367 | 0.7620 ± 0.1024 | 1.0097 ± 0.1326 |
INorp | 4.0.0 | 500 | 0.5055 ± 0.0436 | 0.7297 ± 0.0786 | 0.4791 ± 0.0348 | 0.6687 ± 0.0520 |
MAT | 4.0.0 | 400 | 0.5064 ± 0.0299 | 0.7194 ± 0.0630 | 0.5035 ± 0.0288 | 0.7125 ± 0.0570 |
MEGAN | 4.0.0 | 400 | 0.4281 ± 0.0201 | 0.6062 ± 0.0252 | 0.4161 ± 0.0139 | 0.5798 ± 0.0201 |
Megnet | 4.0.0 | 800 | 0.5679 ± 0.0310 | 0.8196 ± 0.0480 | 0.5059 ± 0.0258 | 0.7003 ± 0.0454 |
MoGAT | 4.0.0 | 200 | 0.4797 ± 0.0114 | 0.6533 ± 0.0114 | 0.4613 ± 0.0135 | 0.6247 ± 0.0161 |
MXMNet | 4.0.0 | 900 | 0.6486 ± 0.0633 | 1.0123 ± 0.2059 | 0.6008 ± 0.0575 | 0.8923 ± 0.1685 |
NMPN | 4.0.0 | 800 | 0.5046 ± 0.0266 | 0.7193 ± 0.0607 | 0.4823 ± 0.0226 | 0.6729 ± 0.0521 |
PAiNN | 4.0.0 | 250 | 0.4857 ± 0.0598 | 0.6650 ± 0.0674 | 0.4206 ± 0.0157 | 0.5925 ± 0.0476 |
RGCN | 4.0.0 | 800 | 0.4703 ± 0.0251 | 0.6529 ± 0.0318 | 0.4387 ± 0.0178 | 0.6048 ± 0.0240 |
rGIN | 4.0.0 | 300 | 0.5196 ± 0.0351 | 0.7142 ± 0.0263 | 0.4956 ± 0.0292 | 0.6887 ± 0.0231 |
Schnet | 4.0.0 | 800 | 0.4777 ± 0.0294 | 0.6977 ± 0.0538 | 0.4503 ± 0.0243 | 0.6416 ± 0.0434 |
FreeSolv (MoleculeNet) consists of 642 compounds as SMILES and their corresponding hydration free energy for small neutral molecules in water. We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [kcal/mol] | RMSE [kcal/mol] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CMPNN | 4.0.0 | 600 | 0.5202 ± 0.0504 | 0.9339 ± 0.1286 | 0.5016 ± 0.0551 | 0.8886 ± 0.1249 |
DGIN | 4.0.0 | 300 | 0.5489 ± 0.0374 | 0.9448 ± 0.0787 | 0.5132 ± 0.0452 | 0.8704 ± 0.1177 |
DimeNetPP | 4.0.0 | 872 | 0.6167 ± 0.0719 | 1.0302 ± 0.1717 | 0.5907 ± 0.0663 | 0.9580 ± 0.1503 |
DMPNN | 4.0.0 | 300 | 0.5487 ± 0.0754 | 0.9206 ± 0.1889 | 0.4947 ± 0.0665 | 0.8362 ± 0.1812 |
EGNN | 4.0.0 | 800 | 0.5386 ± 0.0548 | 1.0363 ± 0.1237 | 0.5268 ± 0.0607 | 0.9849 ± 0.1590 |
GAT | 4.0.0 | 500 | 0.6051 ± 0.0861 | 1.0326 ± 0.1819 | 0.5790 ± 0.0880 | 0.9717 ± 0.2008 |
GATv2 | 4.0.0 | 500 | 0.6151 ± 0.0247 | 1.0535 ± 0.0817 | 0.5971 ± 0.0177 | 1.0037 ± 0.0753 |
GCN | 4.0.0 | 800 | 0.6400 ± 0.0834 | 1.0876 ± 0.1393 | 0.5780 ± 0.0836 | 0.9438 ± 0.1597 |
GIN | 4.0.0 | 300 | 0.8100 ± 0.1016 | 1.2695 ± 0.1192 | 0.6720 ± 0.0516 | 1.0699 ± 0.0662 |
GNNFilm | 4.0.0 | 800 | 0.6562 ± 0.0552 | 1.1597 ± 0.1245 | 0.6358 ± 0.0606 | 1.1168 ± 0.1371 |
GraphSAGE | 4.0.0 | 500 | 0.5894 ± 0.0675 | 1.0009 ± 0.1491 | 0.5700 ± 0.0615 | 0.9508 ± 0.1333 |
HamNet | 4.0.0 | 400 | 0.6619 ± 0.0428 | 1.1410 ± 0.1120 | 0.6005 ± 0.0466 | 1.0120 ± 0.0800 |
HDNNP2nd | 4.0.0 | 500 | 1.0201 ± 0.1559 | 1.6351 ± 0.3419 | 0.9933 ± 0.1523 | 1.5395 ± 0.2969 |
INorp | 4.0.0 | 500 | 0.6612 ± 0.0188 | 1.1155 ± 0.1061 | 0.6391 ± 0.0154 | 1.0556 ± 0.1064 |
MAT | 4.0.0 | 400 | 0.8115 ± 0.0649 | 1.3099 ± 0.1235 | 0.7915 ± 0.0687 | 1.2256 ± 0.1712 |
MEGAN | 4.0.0 | 400 | 0.6303 ± 0.0550 | 1.0429 ± 0.1031 | 0.6141 ± 0.0540 | 1.0192 ± 0.1074 |
Megnet | 4.0.0 | 800 | 0.8878 ± 0.0528 | 1.4134 ± 0.1200 | 0.8090 ± 0.0405 | 1.2735 ± 0.1157 |
MoGAT | 4.0.0 | 200 | 0.7097 ± 0.0374 | 1.0911 ± 0.1334 | 0.6596 ± 0.0450 | 1.0424 ± 0.1313 |
MXMNet | 4.0.0 | 900 | 1.1386 ± 0.1979 | 3.0487 ± 2.1757 | 1.0970 ± 0.1909 | 2.8598 ± 2.0855 |
RGCN | 4.0.0 | 800 | 0.5128 ± 0.0810 | 0.9228 ± 0.1887 | 0.4956 ± 0.0864 | 0.8678 ± 0.2111 |
rGIN | 4.0.0 | 300 | 0.8503 ± 0.0613 | 1.3285 ± 0.0976 | 0.8042 ± 0.0777 | 1.2469 ± 0.1013 |
Schnet | 4.0.0 | 800 | 0.6070 ± 0.0285 | 1.0603 ± 0.0549 | 0.5688 ± 0.0314 | 0.9526 ± 0.0840 |
The database consists of 129 molecules, each containing 5,000 conformational geometries, energies and forces with a resolution of 1 femtosecond in the molecular dynamics trajectories. The molecules were randomly drawn from the largest set of isomers in the QM9 dataset.
model | kgcnn | epochs | Energy (test_within) | Force (test_within) | Min. Energy (test_within) | Min. Force (test_within) |
---|---|---|---|---|---|---|
Schnet.EnergyForceModel | 4.0.0 | 1000 | 0.0061 ± nan | 0.0134 ± nan | 0.0057 ± nan | 0.0134 ± nan |
Lipophilicity (MoleculeNet) consists of 4200 compounds as SMILES. Graph labels for regression are the octanol/water distribution coefficient (logD at pH 7.4). We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [log mol/L] | RMSE [log mol/L] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 300 | 0.3814 ± 0.0064 | 0.5462 ± 0.0095 | 0.3774 ± 0.0072 | 0.5421 ± 0.0093 |
GAT | 4.0.0 | 500 | 0.5168 ± 0.0088 | 0.7220 ± 0.0098 | 0.4906 ± 0.0092 | 0.6819 ± 0.0079 |
GATv2 | 4.0.0 | 500 | 0.4342 ± 0.0104 | 0.6056 ± 0.0114 | 0.4163 ± 0.0089 | 0.5785 ± 0.0163 |
GCN | 4.0.0 | 800 | 0.4960 ± 0.0107 | 0.6833 ± 0.0155 | 0.4729 ± 0.0126 | 0.6496 ± 0.0116 |
GIN | 4.0.0 | 300 | 0.4745 ± 0.0101 | 0.6658 ± 0.0159 | 0.4703 ± 0.0089 | 0.6555 ± 0.0163 |
GraphSAGE | 4.0.0 | 500 | 0.4333 ± 0.0217 | 0.6218 ± 0.0318 | 0.4296 ± 0.0175 | 0.6108 ± 0.0258 |
Schnet | 4.0.0 | 800 | 0.5657 ± 0.0202 | 0.7485 ± 0.0245 | 0.5280 ± 0.0136 | 0.7024 ± 0.0210 |
Energies and forces for molecular dynamics trajectories of eight organic molecules. All geometries are in Å, energy labels in kcal/mol and force labels in kcal/mol/Å. We use the preset train-test split: training on 1000 geometries, testing on 500/1000 geometries. Errors are MAE for forces. Results are for the CCSD and CCSD(T) data in MD17.
model | kgcnn | epochs | Aspirin | Toluene | Malonaldehyde | Benzene | Ethanol |
---|---|---|---|---|---|---|---|
PAiNN.EnergyForceModel | 4.0.0 | 1000 | 0.8551 ± nan | 0.2815 ± nan | 0.7749 ± nan | 0.0427 ± nan | 0.5805 ± nan |
Schnet.EnergyForceModel | 4.0.0 | 1000 | 1.2173 ± nan | 0.7395 ± nan | 0.8444 ± nan | 0.3353 ± nan | 0.4832 ± nan |
Energies and forces for molecular dynamics trajectories. All geometries are in Å, energy labels in kcal/mol and force labels in kcal/mol/Å. We use the preset train-test split: training on 1000 geometries, testing on 500/1000 geometries. Errors are MAE for forces.
model | kgcnn | epochs | Aspirin | Toluene | Malonaldehyde | Benzene | Ethanol |
---|---|---|---|---|---|---|---|
Schnet.EnergyForceModel | 4.0.0 | 1000 | 1.0389 ± 0.0071 | 0.5482 ± 0.0105 | 0.6727 ± 0.0132 | 0.2525 ± 0.0091 | 0.4471 ± 0.0199 |
Materials Project dataset from Matbench with 4764 crystal structures and their corresponding refractive index (unitless). We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [no unit] | RMSE [no unit] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 0.3306 ± 0.0602 | 1.9736 ± 0.7324 | 0.3012 ± 0.0561 | 1.7712 ± 0.6468 |
DimeNetPP.make_crystal_model | 4.0.0 | 780 | 0.3415 ± 0.0542 | 1.9637 ± 0.6323 | 0.3031 ± 0.0526 | 1.7761 ± 0.6535 |
Megnet.make_crystal_model | 4.0.0 | 1000 | 0.3362 ± 0.0550 | 2.0156 ± 0.5872 | 0.3007 ± 0.0563 | 1.7416 ± 0.6413 |
NMPN.make_crystal_model | 4.0.0 | 700 | 0.3289 ± 0.0489 | 1.8770 ± 0.6522 | 0.3037 ± 0.0485 | 1.7718 ± 0.6470 |
PAiNN.make_crystal_model | 4.0.0 | 800 | 0.3539 ± 0.0433 | 1.8661 ± 0.5984 | 0.3063 ± 0.0481 | 1.7823 ± 0.6299 |
Schnet.make_crystal_model | 4.0.0 | 800 | 0.3180 ± 0.0359 | 1.8509 ± 0.5854 | 0.2914 ± 0.0475 | 1.7244 ± 0.6188 |
Materials Project dataset from Matbench with 132752 crystal structures and their corresponding formation energy in eV/atom. We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [eV/atom] | RMSE [eV/atom] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 0.0298 ± 0.0002 | 0.0747 ± 0.0029 | 0.0298 ± 0.0002 | 0.0738 ± 0.0029 |
Schnet.make_crystal_model | 4.0.0 | 800 | 0.0211 ± 0.0003 | 0.0510 ± 0.0024 | 0.0211 ± 0.0003 | 0.0505 ± 0.0023 |
Materials Project dataset from Matbench with 106113 crystal structures and their band gap as calculated by PBE DFT from the Materials Project, in eV. We use a random 5-fold cross-validation.
model | kgcnn | epochs | MAE [eV] | RMSE [eV] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 0.2039 ± 0.0050 | 0.4882 ± 0.0213 | 0.2039 ± 0.0050 | 0.4783 ± 0.0203 |
Schnet.make_crystal_model | 4.0.0 | 800 | 1.2226 ± 1.0573 | 58.3713 ± 114.2957 | 0.2983 ± 0.0257 | 0.6192 ± 0.0409 |
Materials Project dataset from Matbench with 106113 crystal structures and their corresponding metallicity as determined with pymatgen: 1 if the compound is a metal, 0 if it is not. We use random 5-fold cross-validation.
model | kgcnn | epochs | Accuracy | AUC | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 100 | 0.8910 ± 0.0027 | 0.9406 ± 0.0024 | 0.8954 ± 0.0028 | nan ± nan |
Megnet.make_crystal_model | 4.0.0 | 100 | 0.8966 ± 0.0033 | 0.9506 ± 0.0026 | 0.8995 ± 0.0027 | nan ± nan |
Schnet.make_crystal_model | 4.0.0 | 80 | 0.8953 ± 0.0058 | 0.9506 ± 0.0053 | 0.9005 ± 0.0027 | nan ± nan |
Materials Project dataset from Matbench with 636 crystal structures and their corresponding exfoliation energy (meV/atom). We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [meV/atom] | RMSE [meV/atom] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 57.6974 ± 18.0803 | 140.6167 ± 44.8418 | 46.6901 ± 13.5301 | 121.0725 ± 44.0067 |
DimeNetPP.make_crystal_model | 4.0.0 | 780 | 50.2880 ± 11.4199 | 126.0600 ± 38.3769 | 46.1936 ± 11.8615 | 118.6555 ± 38.6340 |
Megnet.make_crystal_model | 4.0.0 | 1000 | 51.1735 ± 9.1746 | 123.4178 ± 32.9582 | 45.2357 ± 10.1934 | 113.8528 ± 37.2491 |
NMPN.make_crystal_model | 4.0.0 | 700 | 59.3986 ± 10.9272 | 139.5943 ± 32.1129 | 48.0720 ± 12.1130 | 120.6016 ± 39.6981 |
PAiNN.make_crystal_model | 4.0.0 | 800 | 49.3889 ± 11.5376 | 121.7087 ± 30.0472 | 46.6649 ± 11.5589 | 117.9086 ± 32.8603 |
Schnet.make_crystal_model | 4.0.0 | 800 | 45.2412 ± 11.6395 | 115.6890 ± 39.0929 | 41.4056 ± 10.7214 | 112.5666 ± 38.0183 |
Materials Project dataset from Matbench with 10987 crystal structures and the base-10 logarithm of their DFT Voigt-Reuss-Hill average shear moduli in GPa. We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [log(GPa)] | RMSE [log(GPa)] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 0.0874 ± 0.0022 | 0.1354 ± 0.0056 | 0.0870 ± 0.0018 | 0.1316 ± 0.0041 |
DimeNetPP.make_crystal_model | 4.0.0 | 780 | 0.0839 ± 0.0027 | 0.1290 ± 0.0065 | 0.0809 ± 0.0024 | 0.1232 ± 0.0049 |
Megnet.make_crystal_model | 4.0.0 | 1000 | 0.0885 ± 0.0017 | 0.1360 ± 0.0054 | 0.0883 ± 0.0016 | 0.1342 ± 0.0049 |
NMPN.make_crystal_model | 4.0.0 | 700 | 0.0874 ± 0.0027 | 0.1324 ± 0.0045 | 0.0867 ± 0.0025 | 0.1310 ± 0.0040 |
PAiNN.make_crystal_model | 4.0.0 | 800 | 0.0870 ± 0.0033 | 0.1332 ± 0.0103 | 0.0845 ± 0.0017 | 0.1254 ± 0.0046 |
Schnet.make_crystal_model | 4.0.0 | 800 | 0.0836 ± 0.0021 | 0.1296 ± 0.0044 | 0.0828 ± 0.0020 | 0.1277 ± 0.0043 |
Materials Project dataset from Matbench with 10987 crystal structures and the base-10 logarithm of their DFT Voigt-Reuss-Hill average bulk moduli in GPa. We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [log(GPa)] | RMSE [log(GPa)] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 0.0672 ± 0.0012 | 0.1265 ± 0.0042 | 0.0646 ± 0.0007 | 0.1199 ± 0.0036 |
DimeNetPP.make_crystal_model | 4.0.0 | 780 | 0.0604 ± 0.0023 | 0.1141 ± 0.0055 | 0.0588 ± 0.0019 | 0.1095 ± 0.0057 |
Megnet.make_crystal_model | 4.0.0 | 1000 | 0.0686 ± 0.0016 | 0.1285 ± 0.0061 | 0.0675 ± 0.0013 | 0.1264 ± 0.0052 |
NMPN.make_crystal_model | 4.0.0 | 700 | 0.0688 ± 0.0009 | 0.1262 ± 0.0031 | 0.0647 ± 0.0015 | 0.1189 ± 0.0042 |
PAiNN.make_crystal_model | 4.0.0 | 800 | 0.0649 ± 0.0007 | 0.1170 ± 0.0048 | 0.0565 ± 0.0009 | 0.1080 ± 0.0045 |
Schnet.make_crystal_model | 4.0.0 | 800 | 0.0635 ± 0.0016 | 0.1186 ± 0.0044 | 0.0629 ± 0.0013 | 0.1154 ± 0.0046 |
Materials Project dataset from Matbench with 18928 crystal structures and their corresponding heat of formation of the entire 5-atom perovskite cell in eV. We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [eV] | RMSE [eV] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 0.0425 ± 0.0011 | 0.0712 ± 0.0037 | 0.0422 ± 0.0015 | 0.0684 ± 0.0030 |
DimeNetPP.make_crystal_model | 4.0.0 | 780 | 0.0447 ± 0.0016 | 0.0730 ± 0.0050 | 0.0415 ± 0.0015 | 0.0690 ± 0.0045 |
Megnet.make_crystal_model | 4.0.0 | 1000 | 0.0388 ± 0.0017 | 0.0675 ± 0.0041 | 0.0388 ± 0.0017 | 0.0675 ± 0.0041 |
NMPN.make_crystal_model | 4.0.0 | 700 | 0.0381 ± 0.0009 | 0.0652 ± 0.0029 | 0.0380 ± 0.0009 | 0.0649 ± 0.0029 |
PAiNN.make_crystal_model | 4.0.0 | 800 | 0.0474 ± 0.0003 | 0.0762 ± 0.0017 | 0.0472 ± 0.0004 | 0.0759 ± 0.0017 |
Schnet.make_crystal_model | 4.0.0 | 800 | 0.0381 ± 0.0005 | 0.0645 ± 0.0024 | 0.0380 ± 0.0005 | 0.0644 ± 0.0022 |
Materials Project dataset from Matbench with 1265 crystal structures and their corresponding vibration properties in 1/cm. We use random 5-fold cross-validation.
model | kgcnn | epochs | MAE [1/cm] | RMSE [1/cm] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
CGCNN.make_crystal_model | 4.0.0 | 1000 | 42.6447 ± 4.5721 | 92.1627 ± 21.4345 | 41.3049 ± 3.8502 | 86.9412 ± 16.6723 |
DimeNetPP.make_crystal_model | 4.0.0 | 780 | 39.8893 ± 3.1280 | 77.5776 ± 16.0908 | 36.1806 ± 2.1331 | 67.9898 ± 7.9298 |
Megnet.make_crystal_model | 4.0.0 | 1000 | 30.6620 ± 2.9013 | 60.8733 ± 17.1448 | 28.9268 ± 3.0908 | 54.5838 ± 13.5562 |
NMPN.make_crystal_model | 4.0.0 | 700 | 45.9344 ± 5.7908 | 95.4136 ± 35.5401 | 43.0340 ± 4.1057 | 79.5178 ± 28.0048 |
PAiNN.make_crystal_model | 4.0.0 | 800 | 47.5408 ± 4.2815 | 86.6761 ± 11.9220 | 45.9714 ± 3.3346 | 79.7746 ± 8.6082 |
Schnet.make_crystal_model | 4.0.0 | 800 | 43.0692 ± 3.6227 | 88.5151 ± 20.0244 | 41.8227 ± 3.4578 | 76.7519 ± 16.4611 |
MUTAG dataset from TUDataset for classification with 188 graphs. We use random 5-fold cross-validation.
model | kgcnn | epochs | Accuracy | AUC(ROC) | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 300 | 0.8407 ± 0.0463 | 0.8567 ± 0.0511 | 0.9098 ± 0.0390 | 0.9564 ± 0.0243 |
GAT | 4.0.0 | 500 | 0.8141 ± 0.1077 | 0.8671 ± 0.0923 | 0.8407 ± 0.0926 | 0.9402 ± 0.0542 |
GATv2 | 4.0.0 | 500 | 0.8193 ± 0.0945 | 0.8379 ± 0.1074 | 0.8248 ± 0.0976 | 0.9360 ± 0.0512 |
GCN | 4.0.0 | 800 | 0.7716 ± 0.0531 | 0.7956 ± 0.0909 | 0.8673 ± 0.0573 | 0.9324 ± 0.0544 |
GIN | 4.0.0 | 300 | 0.8091 ± 0.0781 | 0.8693 ± 0.0855 | 0.9100 ± 0.0587 | 0.9539 ± 0.0564 |
GraphSAGE | 4.0.0 | 500 | 0.8357 ± 0.0798 | 0.8533 ± 0.0824 | 0.8886 ± 0.0710 | 0.8957 ± 0.0814 |
Mutagenicity dataset from TUDataset for classification with 4337 graphs. The dataset was cleaned for unconnected atoms. We use random 5-fold cross-validation.
model | kgcnn | epochs | Accuracy | AUC(ROC) | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 300 | 0.8266 ± 0.0059 | 0.8708 ± 0.0076 | 0.8423 ± 0.0073 | 0.8968 ± 0.0109 |
GAT | 4.0.0 | 500 | 0.7989 ± 0.0114 | 0.8290 ± 0.0112 | 0.8119 ± 0.0049 | 0.8700 ± 0.0077 |
GATv2 | 4.0.0 | 200 | 0.7674 ± 0.0048 | 0.8423 ± 0.0064 | 0.7743 ± 0.0079 | 0.8426 ± 0.0062 |
GCN | 4.0.0 | 800 | 0.7955 ± 0.0154 | 0.8191 ± 0.0137 | 0.8130 ± 0.0090 | 0.8670 ± 0.0068 |
GIN | 4.0.0 | 300 | 0.8118 ± 0.0091 | 0.8492 ± 0.0077 | 0.8248 ± 0.0089 | 0.8798 ± 0.0026 |
GraphSAGE | 4.0.0 | 500 | 0.8195 ± 0.0126 | 0.8515 ± 0.0083 | 0.8294 ± 0.0123 | 0.8851 ± 0.0061 |
TUDataset of proteins that are classified as enzymes or non-enzymes. Nodes represent the amino acids of the protein. We use random 5-fold cross-validation.
model | kgcnn | epochs | Accuracy | AUC(ROC) | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 300 | 0.7287 ± 0.0253 | 0.7970 ± 0.0343 | 0.7790 ± 0.0190 | 0.8298 ± 0.0329 |
GAT | 4.0.0 | 500 | 0.7314 ± 0.0357 | 0.7899 ± 0.0468 | 0.7763 ± 0.0380 | 0.8269 ± 0.0367 |
GATv2 | 4.0.0 | 500 | 0.6720 ± 0.0595 | 0.6850 ± 0.0938 | 0.7898 ± 0.0272 | 0.8273 ± 0.0304 |
GCN | 4.0.0 | 800 | 0.7017 ± 0.0303 | 0.7211 ± 0.0254 | 0.7790 ± 0.0301 | 0.8342 ± 0.0358 |
GIN | 4.0.0 | 150 | 0.7224 ± 0.0343 | 0.7905 ± 0.0528 | 0.7700 ± 0.0299 | 0.8096 ± 0.0409 |
GraphSAGE | 4.0.0 | 500 | 0.7009 ± 0.0398 | 0.7263 ± 0.0453 | 0.7691 ± 0.0369 | 0.7991 ± 0.0353 |
QM7 dataset is a subset of GDB-13, totalling 7165 molecules of up to 23 atoms (including up to 7 heavy atoms C, N, O, and S). We use dataset-specific 5-fold cross-validation. The atomization energies are given in kcal/mol and range from -800 to -2000 kcal/mol.
model | kgcnn | epochs | MAE [kcal/mol] | RMSE [kcal/mol] | Min. MAE | Min. RMSE |
---|---|---|---|---|---|---|
DimeNetPP | 4.0.0 | 872 | 3.4639 ± 0.2003 | 7.5327 ± 1.8190 | 3.4575 ± 0.1917 | 7.4462 ± 1.7268 |
EGNN | 4.0.0 | 800 | 1.7300 ± 0.1336 | 5.1268 ± 2.5134 | 1.7022 ± 0.1210 | 5.0965 ± 2.4826 |
Megnet | 4.0.0 | 800 | 1.5180 ± 0.0802 | 3.0321 ± 0.1936 | 1.5148 ± 0.0805 | 2.9391 ± 0.1885 |
MXMNet | 4.0.0 | 900 | 1.2431 ± 0.0820 | 2.6694 ± 0.2584 | 1.1588 ± 0.0840 | 2.6014 ± 0.2272 |
NMPN | 4.0.0 | 500 | 7.2907 ± 0.9061 | 38.1446 ± 12.1445 | 7.2489 ± 0.8699 | 35.4767 ± 10.2318 |
PAiNN | 4.0.0 | 872 | 1.5765 ± 0.0742 | 5.2705 ± 2.2848 | 1.5428 ± 0.0675 | 5.1099 ± 2.0842 |
Schnet | 4.0.0 | 800 | 3.4313 ± 0.4757 | 10.8978 ± 7.3863 | 3.3606 ± 0.4927 | 9.8169 ± 6.3053 |
QM9 dataset of 134k stable small organic molecules made up of C, H, O, N, F. Labels include geometric, energetic, electronic, and thermodynamic properties. We use random 5-fold cross-validation, but not all splits are evaluated, to reduce computational cost. Test errors are MAE; energies are given in eV.
model | kgcnn | epochs | HOMO [eV] | LUMO [eV] | U0 [eV] | H [eV] | G [eV] |
---|---|---|---|---|---|---|---|
Megnet | 4.0.0 | 800 | nan ± nan | 0.0407 ± 0.0009 | nan ± nan | nan ± nan | 0.0169 ± 0.0006 |
PAiNN | 4.0.0 | 872 | 0.0483 ± 0.0275 | 0.0268 ± 0.0002 | 0.0099 ± 0.0003 | 0.0101 ± 0.0003 | 0.0110 ± 0.0002 |
Schnet | 4.0.0 | 800 | 0.0402 ± 0.0007 | 0.0340 ± 0.0001 | 0.0142 ± 0.0002 | 0.0146 ± 0.0002 | 0.0143 ± 0.0002 |
SIDER (MoleculeNet) consists of 1427 compounds as SMILES and data for adverse drug reactions (ADR), grouped into 27 system organ classes. We use random 5-fold cross-validation.
model | kgcnn | epochs | Accuracy | AUC(ROC) | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 50 | 0.7519 ± 0.0055 | 0.6280 ± 0.0173 | 0.7629 ± 0.0041 | 0.6336 ± 0.0167 |
GAT | 4.0.0 | 50 | 0.7595 ± 0.0034 | 0.6224 ± 0.0106 | 0.7616 ± 0.0015 | 0.6231 ± 0.0101 |
GATv2 | 4.0.0 | 50 | 0.7548 ± 0.0052 | 0.6152 ± 0.0154 | 0.7602 ± 0.0036 | 0.6201 ± 0.0169 |
GIN | 4.0.0 | 50 | 0.7472 ± 0.0055 | 0.5995 ± 0.0058 | 0.7565 ± 0.0032 | 0.6106 ± 0.0085 |
GraphSAGE | 4.0.0 | 30 | 0.7547 ± 0.0043 | 0.6038 ± 0.0108 | 0.7597 ± 0.0021 | 0.6109 ± 0.0107 |
Schnet | 4.0.0 | 50 | 0.7583 ± 0.0076 | 0.6119 ± 0.0159 | 0.7623 ± 0.0072 | 0.6191 ± 0.0105 |
Tox21 (MoleculeNet) consists of 7831 compounds as SMILES and 12 different targets relevant to drug toxicity. We use random 5-fold cross-validation.
model | kgcnn | epochs | Accuracy | AUC(ROC) | BACC | Max. BACC | Max. Accuracy | Max. AUC |
---|---|---|---|---|---|---|---|---|
DMPNN | 4.0.0 | 50 | 0.9272 ± 0.0024 | 0.8321 ± 0.0103 | 0.6995 ± 0.0130 | 0.7123 ± 0.0142 | 0.9292 ± 0.0016 | 0.8417 ± 0.0075 |
GAT | 4.0.0 | 50 | 0.9243 ± 0.0022 | 0.8279 ± 0.0092 | 0.6504 ± 0.0074 | 0.6528 ± 0.0071 | 0.9246 ± 0.0021 | 0.8293 ± 0.0093 |
GATv2 | 4.0.0 | 50 | 0.9222 ± 0.0019 | 0.8251 ± 0.0069 | 0.6760 ± 0.0140 | 0.6782 ± 0.0156 | 0.9246 ± 0.0025 | 0.8314 ± 0.0116 |
GIN | 4.0.0 | 50 | 0.9220 ± 0.0024 | 0.7986 ± 0.0180 | 0.6741 ± 0.0143 | 0.6882 ± 0.0151 | 0.9259 ± 0.0022 | 0.8248 ± 0.0130 |
GraphSAGE | 4.0.0 | 100 | 0.9180 ± 0.0027 | 0.7976 ± 0.0087 | 0.6755 ± 0.0047 | 0.7083 ± 0.0114 | 0.9252 ± 0.0015 | 0.8225 ± 0.0142 |
Schnet | 4.0.0 | 50 | 0.9128 ± 0.0030 | 0.7719 ± 0.0139 | 0.6639 ± 0.0162 | 0.6820 ± 0.0115 | 0.9215 ± 0.0027 | 0.7980 ± 0.0079 |
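
The gap between the Accuracy and BACC columns in the Tox21 table is what one would expect for imbalanced labels. The sketch below assumes BACC denotes balanced accuracy in its usual binary form, the mean of the true-positive rate and the true-negative rate. The helper `balanced_accuracy` is hypothetical, and how the values above are averaged over the 12 targets or how missing labels are handled is not covered here.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Toy example (made-up labels): 90 negatives, 10 positives.
# The classifier finds only 2 of the 10 positives and no false positives.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 90 + [1] * 2 + [0] * 8

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bacc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, balanced accuracy={bacc:.2f}")
```

Plain accuracy rewards predicting the majority class, while balanced accuracy weights both classes equally, so it drops when the minority class is poorly recalled.
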