Language-independent link key-based data interlinking - HAL open archive
Report (Contract/Project Report) Year: 2015

Language-independent link key-based data interlinking

Abstract

Links are important for the publication of RDF data on the web. Yet establishing links between data sets is not an easy task. We develop an approach for that purpose which extracts weak link keys. Link keys extend the notion of a key to the case of different data sets. They are made of a set of pairs of properties belonging to two different classes. A weak link key holds between two classes if any two resources that share values for all of these property pairs are the same resource. An algorithm is proposed to generate a small set of candidate link keys. Depending on whether some links, valid or invalid, are known, we define supervised and unsupervised measures for selecting the appropriate link keys. The supervised measures approximate precision and recall, while the unsupervised measures are the ratio of pairs of entities a link key covers (coverage) and the ratio of entities from the same data set it identifies (discrimination). We have evaluated these techniques on two data sets, showing the accuracy and robustness of both approaches.
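
The abstract does not give formulas for the unsupervised measures, so the following is only a minimal sketch under stated assumptions. It assumes that a candidate link key generates a link between two resources when they share at least one value for every property pair, that coverage is the fraction of resources of both data sets that appear in some generated link, and that discrimination is the smaller number of distinct linked resources on either side divided by the number of links. The toy data, the property names, and the function names `links`, `coverage` and `discrimination` are all hypothetical.

```python
from itertools import product

# Toy data sets (hypothetical): resource id -> property -> set of values.
DS1 = {
    "a1": {"name": {"Ada Lovelace"}, "born": {"1815"}},
    "a2": {"name": {"Alan Turing"}, "born": {"1912"}},
    "a3": {"name": {"Grace Hopper"}, "born": {"1906"}},
}
DS2 = {
    "b1": {"label": {"Ada Lovelace"}, "year": {"1815"}},
    "b2": {"label": {"Alan Turing"}, "year": {"1912"}},
    "b3": {"label": {"Alan Turing"}, "year": {"1954"}},
}


def links(key, ds1, ds2):
    """Pairs (a, b) that share at least one value for every (p, q) in key."""
    return {
        (a, b)
        for (a, props_a), (b, props_b) in product(ds1.items(), ds2.items())
        if all(props_a.get(p, set()) & props_b.get(q, set()) for p, q in key)
    }


def coverage(generated, ds1, ds2):
    """Assumed definition: fraction of all resources that occur in a link."""
    sources = {a for a, _ in generated}
    targets = {b for _, b in generated}
    return (len(sources) + len(targets)) / (len(ds1) + len(ds2))


def discrimination(generated):
    """Assumed definition: 1.0 when the generated links are one-to-one."""
    if not generated:
        return 0.0
    sources = {a for a, _ in generated}
    targets = {b for _, b in generated}
    return min(len(sources), len(targets)) / len(generated)


if __name__ == "__main__":
    candidates = {
        "name only": [("name", "label")],
        "name and birth year": [("name", "label"), ("born", "year")],
    }
    for description, key in candidates.items():
        generated = links(key, DS1, DS2)
        print(description, sorted(generated),
              round(coverage(generated, DS1, DS2), 2),
              round(discrimination(generated), 2))
```

In this toy example the name-only candidate links `a2` to both `b2` and `b3`, so it covers more resources but is less discriminating. Adding the birth-year pair removes the ambiguous link at the cost of lower coverage, which is the kind of trade-off the selection measures are meant to expose.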
Main file
lindicle-41.pdf (440.56 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01180925, version 1 (28-07-2015)

Identifiers

  • HAL Id: hal-01180925, version 1

Cite

Jérôme David, Jérôme Euzenat, Manuel Atencia. Language-independent link key-based data interlinking. [Contract] Lindicle. 2015, pp.21. ⟨hal-01180925⟩
189 views
151 downloads
