This repository includes the datasets used for the paper "KG-Roar: Interactive Datalog-based Reasoning on Virtual Knowledge Graphs" submitted to VLDB 2023, the 49th International Conference on Very Large Data Bases.
These synthetic datasets were used to experimentally evaluate the elapsed time of the company control reasoning task as the graph size grows. They are provided in both .parquet and .csv file formats. CSV files have been compressed with gzip.
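Since the CSV files are gzip-compressed, they can be read directly without unpacking them first. The following is a minimal sketch using only the Python standard library; the function name `peek_gzipped_csv` and the UTF-8 encoding are assumptions made for illustration, and no assumption is made about the column layout or the presence of a header row.

```python
import csv
import gzip
from itertools import islice


def peek_gzipped_csv(path, limit=5):
    """Return the first `limit` rows of a gzip-compressed CSV file.

    Rows are returned as lists of strings; the column layout is not
    interpreted, because the schema is not assumed here.
    """
    # Mode "rt" opens the gzip stream as text, so csv.reader receives
    # decoded lines. UTF-8 is an assumption about the file encoding.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(islice(csv.reader(handle), limit))
```

Calling `peek_gzipped_csv(path)` on one of the compressed files should return its first five rows, which is a quick way to inspect the columns before loading a whole graph.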
format (csv or parquet)
┣━ synthetic_graphs:
┃   ┗━ size_<n>M_nodes.parquet
┗━ synthetic_targets:
    ┗━ sampling_from_<n>M_generation_<g>_size_<m>.parquet
synthetic_graphs are artificially generated graphs that show structural similarities with the real Italian company network; the naming convention specifies the number n of nodes (in millions) in the graph.
synthetic_targets are sets of target companies for which we derived controls in the experimental evaluation; the naming convention specifies:
- the number n, which refers to the size (in millions of nodes) of the graph from which the companies have been sampled;
- the generation g, which identifies a specific sampling;
- the size m of the set of sampled companies.
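The naming conventions above can be decoded programmatically, for example to pair each target set with the graph it was sampled from. The sketch below is illustrative only: the helper names (`parse_graph_name`, `parse_target_name`, `TargetInfo`) are hypothetical, and it assumes that n, g and m appear in file names as plain non-negative integers.

```python
import re
from typing import NamedTuple, Optional

# Patterns follow the documented naming convention. They match on the
# file name prefix, so any extension after the final dot is accepted.
# Assumption: n, g and m are written as plain integers.
GRAPH_RE = re.compile(r"^size_(?P<n>\d+)M_nodes\.")
TARGET_RE = re.compile(
    r"^sampling_from_(?P<n>\d+)M_generation_(?P<g>\d+)_size_(?P<m>\d+)\."
)


class TargetInfo(NamedTuple):
    graph_millions: int  # n: size of the source graph, in millions of nodes
    generation: int      # g: identifier of the specific sampling
    sample_size: int     # m: number of sampled companies


def parse_graph_name(filename: str) -> Optional[int]:
    """Return n (millions of nodes) for a graph file name, or None."""
    match = GRAPH_RE.match(filename)
    return int(match.group("n")) if match else None


def parse_target_name(filename: str) -> Optional[TargetInfo]:
    """Return (n, g, m) for a target file name, or None if it does not match."""
    match = TARGET_RE.match(filename)
    if match is None:
        return None
    return TargetInfo(
        int(match.group("n")), int(match.group("g")), int(match.group("m"))
    )
```

For a hypothetical file name such as `sampling_from_10M_generation_1_size_100.parquet`, `parse_target_name` returns `TargetInfo(graph_millions=10, generation=1, sample_size=100)`, whose first field can be compared with the result of `parse_graph_name` to find the matching graph file.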
- Luigi Bellomarini, Bank of Italy
- Marco Benedetti, Bank of Italy
- Andrea Gentili, Bank of Italy
- Davide Magnanimi, Bank of Italy & Politecnico di Milano
- Emanuel Sallinger, TU Wien & University of Oxford
This work is licensed under a Creative Commons Attribution 4.0 International License.