Measuring structural similarity between RDF graphs

Pierre Maillot; Carlos Bobed

doi:10.1145/3167132.3167342

Conference paper, Year: 2018



Pierre Maillot

Role: Author
PersonId: 750025
IdHAL: pierre-maillot
ORCID: 0000-0002-9814-439X
IdRef: 190578270

Université Paris Descartes - Paris 5

Université Paris Descartes - UFR de Sciences et Techniques des Activités Physiques et Sportives de Paris (STAPS)

Techniques et enjeux du corps

Carlos Bobed

Role: Author

Abstract

In recent years, there has been a huge effort to publish large amounts of data as RDF, thanks, among others, to the Linked Data initiative. In this context, using shared ontologies has been crucial to gain interoperability and to integrate and exploit third-party datasets. However, using the same ontology does not suffice to successfully query or integrate external data within your own dataset: the actual usage of the vocabulary (e.g., which concepts have instances, which properties are actually populated and how) is crucial for these tasks. Being able to compare different RDF graphs at the actual usage level would indeed help in such situations. Unfortunately, the complexity of graph comparison is an obstacle to the scalability of many approaches. In this article, we present a structural similarity measure designed to compare the low-level data of two different RDF graphs according to the patterns they share. To obtain such patterns, we leverage a data mining method (KRIMP) that extracts the most descriptive patterns appearing in a transactional database. We adapt this method to the particularities of RDF data, proposing two different conversions of an RDF graph. Once we have the descriptive patterns, we evaluate how much two graphs can compress each other to give a numerical measure that depends on the common data structures they share. We have carried out several experiments to show its ability to capture the structural differences of actual vocabulary usage.
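The idea of graphs "compressing each other" can be illustrated with a minimal Python sketch. It is not the authors' measure and it does not implement KRIMP: it assumes a hypothetical conversion (one transaction per subject, holding its properties and classes) and uses only single-item codes, the kind of code table KRIMP starts from before adding larger patterns. The names `to_transactions`, `singleton_codes`, `encoded_size`, and `compression_ratio` are illustrative assumptions, as is the add-one smoothing.

```python
import math
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"  # assumed prefixed form of the typing property


def to_transactions(triples):
    """Assumed conversion: one item set per subject (its properties and classes)."""
    grouped = defaultdict(set)
    for s, p, o in triples:
        item = ("type", o) if p == RDF_TYPE else ("prop", p)
        grouped[s].add(item)
    return list(grouped.values())


def singleton_codes(transactions, vocabulary):
    """Code length in bits for each item, from smoothed usage frequencies."""
    counts = Counter(item for t in transactions for item in t)
    total = sum(counts.values()) + len(vocabulary)  # add-one smoothing
    return {item: -math.log2((counts[item] + 1) / total) for item in vocabulary}


def encoded_size(transactions, codes):
    """Total bits needed to write every item of every transaction."""
    return sum(codes[item] for t in transactions for item in t)


def compression_ratio(reference, target):
    """Bits to encode `target` with codes learned on `reference`,
    divided by bits to encode it with its own codes."""
    ref_tx, tgt_tx = to_transactions(reference), to_transactions(target)
    vocabulary = {item for t in ref_tx + tgt_tx for item in t}
    own = encoded_size(tgt_tx, singleton_codes(tgt_tx, vocabulary))
    cross = encoded_size(tgt_tx, singleton_codes(ref_tx, vocabulary))
    if own == 0:  # degenerate case: at most one distinct item overall
        return 1.0
    return cross / own


people = [("a1", RDF_TYPE, "Person"), ("a1", "name", "x"),
          ("a2", RDF_TYPE, "Person"), ("a2", "name", "y")]
places = [("c1", RDF_TYPE, "Place"), ("c1", "label", "z"),
          ("c2", RDF_TYPE, "Place"), ("c2", "label", "w")]

same = compression_ratio(people, people)
different = compression_ratio(people, places)
```

In this toy setting, a graph encoded with its own codes gives a ratio of 1, and a graph whose vocabulary usage differs from the reference gives a larger ratio. A pattern-based code table, as mined by KRIMP, would additionally capture co-occurring items rather than single items only.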

Keywords

Similarity; Semantic Web; Linked Data; Data Mining

Domains

Databases [cs.DB]; Information Retrieval [cs.IR]; Data Structures and Algorithms [cs.DS]

Main file

p1960-maillot.pdf (1.18 MB)

Origin: Publisher files allowed on an open archive


https://hal.science/hal-01940449

Submitted on: Thursday, January 24, 2019, 10:49:19

Last modified on: Wednesday, November 10, 2021, 14:06:01

Dates and versions

hal-01940449, version 1 (24-01-2019)

Identifiers

HAL Id: hal-01940449, version 1
DOI: 10.1145/3167132.3167342

Cite

Pierre Maillot, Carlos Bobed. Measuring structural similarity between RDF graphs. 33rd Annual ACM Symposium on Applied Computing, Apr 2018, Pau, France. pp.1960-1967, ⟨10.1145/3167132.3167342⟩. ⟨hal-01940449⟩


Collections

USPC

60 views

527 downloads
