The following full stack has been implemented and is available in this repository:
The full report for this project is available here (Document in French).
The project was done by the Data Semantics Lab of the HES-SO Valais/Wallis - Institute of Informatics - Sierre
Find below explanations about:
- Original data and data transformation
- Data validation
- Data stored in two triple stores and client queries on the SPARQL endpoints
The proxy to secure the data access is described and available in this repository
Fictitious datasets were created to simulate UPI and EWR information.
The chosen data models are the SEMIC core vocabularies (a minimal modelling sketch follows the list below):
- Core Person Vocabulary for the UPI dataset of people's basic information
- Core Criterion and Core Evidence Vocabulary for the EWR dataset of people's principal residences
- The UPI dataset creation is based on the tarql tool. Tarql can be run on Windows or Linux with `run.bat` and `run.sh` respectively (found here)
- The EWR dataset creation is based on tarql. Tarql can be run on Windows or Linux with `run.bat` and `run.sh` respectively (found here)
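To give an idea of the target data model, here is a minimal sketch (not code from this repository) that builds one fictitious person with rdflib, assuming the Core Person Vocabulary namespace http://www.w3.org/ns/person# and FOAF name properties; the exact classes, properties, and URI patterns produced by the tarql mappings may differ.

```python
# Hedged sketch: the namespace and properties are assumptions,
# not the repository's actual tarql output.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF, XSD

PERSON = Namespace("http://www.w3.org/ns/person#")  # Core Person Vocabulary

g = Graph()
g.bind("person", PERSON)
g.bind("foaf", FOAF)

# One fictitious person, comparable to one row of the UPI CSV after transformation
p = URIRef("http://example.org/person/1")
g.add((p, RDF.type, PERSON.Person))
g.add((p, FOAF.givenName, Literal("Anna")))
g.add((p, FOAF.familyName, Literal("Muster")))
g.add((p, URIRef("http://schema.org/birthDate"), Literal("1980-01-01", datatype=XSD.date)))

print(g.serialize(format="turtle"))
```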
The `XML2RDF` folder contains examples of tools to demonstrate how XML can be transformed to RDF:
- The first example is based on sparql-generate
- The second example is based on rocketRML
See the tool's documentation for its installation, which might require node.js and npm.
Both examples contain a `run.sh` to transform the file `persons-eCH0044.xml`. It is the same XML file in both examples; it contains 5 fictional characters as a complement to the UPI dataset generated above.
More information will be given about those tools soon.
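Independently of these two tools, the general idea of an XML to RDF transformation can be sketched in a few lines of Python: parse an XML person element and emit the corresponding triples with rdflib. The element names below are illustrative assumptions and do not reproduce the eCH-0044 schema or the mappings used by the examples.

```python
# Illustrative sketch only: element names and output vocabulary are assumptions;
# the repository uses sparql-generate / RocketRML instead.
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF

PERSON = Namespace("http://www.w3.org/ns/person#")

xml_snippet = """
<persons>
  <person id="1">
    <givenName>Anna</givenName>
    <familyName>Muster</familyName>
  </person>
</persons>
"""

g = Graph()
root = ET.fromstring(xml_snippet)
for elem in root.findall("person"):
    subject = URIRef(f"http://example.org/person/{elem.get('id')}")
    g.add((subject, RDF.type, PERSON.Person))
    g.add((subject, FOAF.givenName, Literal(elem.findtext("givenName"))))
    g.add((subject, FOAF.familyName, Literal(elem.findtext("familyName"))))

print(g.serialize(format="turtle"))
```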
RDF data can be validated with the W3C standard Shapes Constraint Language (SHACL).
The generated UPI and EWR datasets, presented above, are based on the SEMIC ontologies and can be validated with their provided SHACL files:
- UPI dataset based on the Core Person Vocabulary, find the SHACL file here.
- EWR dataset based on the Core Criterion and Core Evidence Vocabulary, find the SHACL file here.
To run the SHACL validation we use the Apache Jena implementation (Apache Jena Commands, version 5.1.0).
The shacl folder contains the necessary tools and files to perform the SHACL validation:
- The Jena tool unzipped in the `apache-jena-5.1.0` sub-folder
- The two generated datasets `UPI_Personnes_fiction.ttl` and `EWR_ResidencesPrincipales.ttl`
- The SHACL files `core-person-ap-SHACL.ttl` and `cccev-ap-SHACL.ttl`
  Note: the `cccev-ap-SHACL.ttl` was adapted to `cccev-ap-SHACL_corrected.ttl` to avoid mixing http and https URLs for the time ontology
- `runUPI.sh` and `runEWR.sh` to execute the SHACL validation
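As an alternative to the Jena command line wrapped by `runUPI.sh` and `runEWR.sh`, the same check can be sketched in Python with the pyshacl library (an extra dependency, not part of this repository); the file names are those listed above.

```python
# Sketch of the validation with pyshacl instead of the Jena CLI
# (pyshacl is an assumption here, not shipped with this repository).
from pyshacl import validate

conforms, _, results_text = validate(
    data_graph="UPI_Personnes_fiction.ttl",
    shacl_graph="core-person-ap-SHACL.ttl",
    data_graph_format="turtle",
    shacl_graph_format="turtle",
    inference="rdfs",
)
print("Conforms:", conforms)
print(results_text)
```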
The Python POC sends SPARQL queries to two local SPARQL endpoints (EWR and UPI) that can easily be launched locally.
The Python code relies on rdflib (Code and Documentation)
The POC was run on Ubuntu:
- Launch the EWR SPARQL endpoint with `startEWR.sh`. The endpoint is published on http://localhost:8000/
- Launch the UPI SPARQL endpoint with `startUPI.sh`. The endpoint is published on http://localhost:8001/
- Run the client code with 3 parameters:
  python3 serafe_sparql_query.py --queryNumber 5 --ewr_endpoint http://localhost:8000/ --upi_endpoint=http://localhost:8001/
Use the parameter queryNumber to choose which query to run:
1 Federated SPARQL
2 Two queries
3 Multiple queries
4 Wikidata dereferencing
5 All

To display the information about the expected parameters:
python3 serafe_sparql_query.py
Remark: The client code sends SPARQL queries to the SPARQL endpoints passed as parameters; this POC would thus work with any SPARQL endpoint (and any triple store) that hosts the data.
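As an illustration of that remark, here is a sketch (not the repository's serafe_sparql_query.py) that queries the two endpoints with plain rdflib and dereferences a Wikidata URI; the query texts are deliberately generic and would have to be adapted to the UPI/EWR vocabularies described above.

```python
# Illustrative sketch only: endpoint URLs as above, queries and vocabularies assumed.
from rdflib import Graph
from rdflib.plugins.stores.sparqlstore import SPARQLStore

upi = Graph(store=SPARQLStore(query_endpoint="http://localhost:8001/"))
ewr = Graph(store=SPARQLStore(query_endpoint="http://localhost:8000/"))

# One query per endpoint (the "Two queries" idea); results can be joined in Python
for row in upi.query("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"):
    print("UPI:", row.s, row.p, row.o)
for row in ewr.query("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"):
    print("EWR:", row.s, row.p, row.o)

# Wikidata dereferencing (the idea behind queryNumber 4): fetch RDF for one entity
wd = Graph()
wd.parse("http://www.wikidata.org/entity/Q72")  # Q72 = Zurich, illustrative only
print(len(wd), "triples fetched from Wikidata")
```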