SisonkeBiotik-Africa/Relational-NER

Relational-NER

Python code for enhancing the output of multilingual named entity recognition based on Wikidata relations

Description

This project proposes a novel approach that enriches the named entity recognition (NER) output for a given class of entities by using semantic relations in open knowledge graphs, particularly Wikidata. In our research, we chose drugs as the entity class: the output of an initial drug NER step is enriched using our approach.
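The idea of relation-based enrichment can be illustrated with a minimal sketch. It is not the code in RelationBasedNER.py: the `RELATED` table is a hand-written, hypothetical stand-in for statements that would come from Wikidata, and the function name `enrich` and the tuple layout are assumptions made for illustration only.

```python
import re

# Hypothetical stand-in for Wikidata statements (not real query results):
# drug label -> list of (relation label, related item label).
RELATED = {
    "aspirin": [
        ("medical condition treated", "headache"),
        ("medical condition treated", "fever"),
    ],
}


def enrich(title, drug_mentions, related=RELATED):
    """Annotate items in `title` that are related to already recognized drugs.

    Returns sorted tuples (start, end, related label, relation, drug mention).
    """
    annotations = []
    for drug in drug_mentions:
        for relation, label in related.get(drug.lower(), []):
            # Whole-word, case-insensitive search for the related label.
            pattern = r"\b" + re.escape(label) + r"\b"
            for match in re.finditer(pattern, title, re.IGNORECASE):
                annotations.append(
                    (match.start(), match.end(), label, relation, drug)
                )
    return sorted(annotations)


if __name__ == "__main__":
    title = "Aspirin for acute headache in adults"
    for annotation in enrich(title, ["Aspirin"]):
        print(annotation)
```

In this sketch, a drug found by the first NER pass acts as a key into the relation table, so only items connected to that drug are searched for in the title.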

Files

The repository provides source files implemented in Python 3.9 and four datasets for assessing the proposed approach. The files described here are:

  • Source:
      • Drug-NER.py: Algorithm for the named entity recognition of drug items in the titles of scholarly publications.
      • RelationBasedNER.py: Algorithm for the enrichment of the output of drug NER with the annotation of drug-related Wikidata items as revealed by the Wikidata Knowledge Graph.
  • Output:
      • dataset.csv: Dataset of the titles of biomedical scholarly publications about drugs as extracted using the ItemSubjector tool.
      • drugNER.txt: Output of the named entity recognition of drugs.
      • drugs.tsv: List of drug names used for drug NER and extracted using SPARQL from the Wikidata Knowledge Graph.
      • relatedNER.txt: Output of our relation-based named entity recognition algorithm.
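As a rough illustration of how a list of drug names could drive NER over publication titles, here is a minimal dictionary-matching sketch. It does not reproduce Drug-NER.py. The two-column layout (`item`, `label`) assumed for the TSV, the placeholder identifiers `Q_A` and `Q_B`, and the function names are all assumptions; the real drugs.tsv may be organized differently.

```python
import csv
import io
import re

# Assumed layout: item ID <TAB> label. The real drugs.tsv may differ.
# Q_A and Q_B are placeholder identifiers, not real Wikidata IDs.
SAMPLE_TSV = "item\tlabel\nQ_A\taspirin\nQ_B\tibuprofen\n"


def load_drug_names(handle):
    """Read a TSV handle into a dict: lowercased label -> item ID."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["label"].lower(): row["item"] for row in reader}


def find_drugs(title, names):
    """Return (start, end, matched text, item ID) for each drug name in title."""
    if not names:
        return []
    # Longest names first so a multi-word name wins over its own prefix.
    ordered = sorted(names, key=len, reverse=True)
    regex = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), names[m.group(0).lower()])
        for m in regex.finditer(title)
    ]


if __name__ == "__main__":
    names = load_drug_names(io.StringIO(SAMPLE_TSV))
    print(find_drugs("Ibuprofen versus aspirin", names))
```

To try it on a file instead of the inline sample, pass an open file handle (for example `open("drugs.tsv", encoding="utf-8", newline="")`) to `load_drug_names`, after adjusting the column names to match the file.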

Dependencies

  • Langdetect 1.0.9
  • Wikibaseintegrator 0.12.0
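One way to check that the installed packages match the versions listed above is sketched below, using only the standard library. The distribution names `langdetect` and `wikibaseintegrator` are assumed to be the PyPI names of these dependencies, and `check_dependencies` is a hypothetical helper, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the dependency list; distribution names are assumed.
PINNED = {"langdetect": "1.0.9", "wikibaseintegrator": "0.12.0"}


def check_dependencies(pinned=PINNED):
    """Map each package to (installed version or None, matches the pinned version)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_dependencies().items():
        status = "OK" if ok else "mismatch or missing"
        print(f"{name}: installed={installed} ({status})")
```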

Team

  • Houcemeddine Turki, Data Engineering and Semantics Research Unit, University of Sfax, Tunisia
  • Dennis Priskorn, Mid Sweden University, Sweden
  • Mohamed Ali Hadj Taieb, Data Engineering and Semantics Research Unit, University of Sfax, Tunisia
  • Mohamed Ben Aouicha, Data Engineering and Semantics Research Unit, University of Sfax, Tunisia
  • Alejandro Piad-Morffis, School of Math and Computer Science, University of Havana, Cuba
