Constructing and Cleaning Identity Graphs in the LOD Cloud

Joe Raad; Wouter Beek; Frank van Harmelen; Jan Wielemaker; Nathalie Pernelle; Fatiha Saïs

doi:10.1162/dint_a_00057

Journal article, International Journal of Big Data Intelligence, Year: 2020


Joe Raad

Role: Author
PersonId: 12563
IdHAL: joe-raad
ORCID: 0000-0002-7891-7738

Wouter Beek

Role: Author

Frank van Harmelen

Role: Author

Jan Wielemaker

Role: Author

Nathalie Pernelle

Role: Author
PersonId: 738940
IdHAL: pernelle

Fatiha Saïs

Role: Author
PersonId: 2805
IdHAL: fatihasais
ORCID: 0000-0002-6995-2785
IdRef: 124298036

Données et Connaissances Massives et Hétérogènes (LRI)

Laboratoire de Recherche en Informatique

Abstract

In the absence of a central naming authority on the Semantic Web, different data sets commonly refer to the same thing by different names. Whenever multiple names denote the same thing, owl:sameAs statements are needed to link the data and foster reuse. Studies dating back as far as 2009 have observed that the owl:sameAs property is sometimes used incorrectly. In our previous work, we presented an identity graph containing over 500 million explicit and 35 billion implied owl:sameAs statements, together with a scalable approach for automatically calculating an error degree for each identity statement. In this paper, we generate subgraphs of the overall identity graph that correspond to certain error degrees. We show that even though the Semantic Web contains many erroneous owl:sameAs statements, it is still possible to use Semantic Web data while minimising the adverse effects of misusing owl:sameAs.
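To make the idea of error-degree subgraphs concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each explicit owl:sameAs link already carries an error degree between 0 and 1 (the values and the `ex:` names below are invented for illustration), keeps only links at or below a chosen threshold, and groups the names into identity sets with a union-find structure. The function name `identity_sets` and the parameter `max_error` are hypothetical.

```python
from collections import defaultdict


def identity_sets(links, max_error=1.0):
    """Group names into identity sets, using only links whose
    (assumed, precomputed) error degree is <= max_error."""
    parent = {}

    def find(x):
        # Register unseen names as their own root, then walk to the root
        # while shortening the path (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, err in links:
        root_a, root_b = find(a), find(b)  # registers both names
        if err <= max_error and root_a != root_b:
            parent[root_a] = root_b  # merge the two identity sets

    groups = defaultdict(set)
    for name in list(parent):
        groups[find(name)].add(name)
    # Sorted output keeps the result deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical explicit owl:sameAs links: (subject, object, error degree).
links = [
    ("ex:a", "ex:b", 0.1),
    ("ex:b", "ex:c", 0.2),
    ("ex:c", "ex:d", 0.9),  # treated as a likely erroneous link
]

print(identity_sets(links))
# [['ex:a', 'ex:b', 'ex:c', 'ex:d']]
print(identity_sets(links, max_error=0.5))
# [['ex:a', 'ex:b', 'ex:c'], ['ex:d']]
```

With the threshold lowered to 0.5, the suspicious link is dropped and `ex:d` is no longer implied to be identical to the other three names. This illustrates how a stricter subgraph can reduce the spread of an erroneous statement through the closure.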

Domains

Computer Science [cs]; Artificial Intelligence [cs.AI]; Web; Databases [cs.DB]

Contributor: Fatiha Saïs

https://hal.science/hal-03211687

Submitted on: Wednesday, 28 April 2021, 23:17:57

Last modified on: Monday, 5 February 2024, 14:18:05

Dates and versions

hal-03211687, version 1 (28-04-2021)

Identifiers

HAL Id: hal-03211687, version 1
DOI: 10.1162/dint_a_00057

Cite

Joe Raad, Wouter Beek, Frank van Harmelen, Jan Wielemaker, Nathalie Pernelle, et al. Constructing and Cleaning Identity Graphs in the LOD Cloud. International Journal of Big Data Intelligence, 2020, 2 (3), pp.323-352. ⟨10.1162/dint_a_00057⟩. ⟨hal-03211687⟩


Collections

CNRS UMR8623 CENTRALESUPELEC LRI-LAHDAK UNIV-PARIS-SACLAY LISN GS-ENGINEERING GS-COMPUTER-SCIENCE GS-LIFE-SCIENCES-HEALTH LISN-LAHDAK
