Data interlinking through robust linkkey extraction

Manuel Atencia 1 Jérôme David 1 Jérôme Euzenat 1
1 EXMO - Computer mediated exchange of structured knowledge
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract: Links are important for the publication of RDF data on the web. Yet, establishing links between data sets is not an easy task. We develop an approach for that purpose which extracts weak linkkeys. Linkkeys extend the notion of a key to the case of different data sets. They are made of a set of pairs of properties belonging to two different classes. A weak linkkey holds between two classes if any resources having common values for all of these properties are the same resources. An algorithm is proposed to generate a small set of candidate linkkeys. Depending on whether some valid or invalid links are known, we define supervised and unsupervised measures for selecting the appropriate linkkeys. The supervised measures approximate precision and recall, while the unsupervised measures are the ratio of pairs of entities a linkkey covers (coverage) and the ratio of entities from the same data set it identifies (discrimination). We have evaluated these techniques on two data sets, showing the accuracy and robustness of both approaches.
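To make the weak linkkey condition and the two unsupervised measures more concrete, here is a minimal Python sketch. It is not the authors' algorithm. The data layout (resource to property to set of values), the function names, and the exact formulas for coverage and discrimination are assumptions, chosen as one plausible reading of the abstract: coverage is taken as the share of resources from both data sets that appear in at least one generated link, and discrimination as the smaller number of distinct linked resources on either side divided by the number of links.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical layout: resource id -> property name -> set of values.
Dataset = Dict[str, Dict[str, Set[str]]]
# A candidate linkkey: a set of (property in data set 1, property in data set 2).
LinkKey = FrozenSet[Tuple[str, str]]
Link = Tuple[str, str]


def generated_links(d1: Dataset, d2: Dataset, linkkey: LinkKey) -> Set[Link]:
    """Link every pair of resources sharing a value for ALL property pairs."""
    if not linkkey:
        return set()  # an empty linkkey would otherwise link everything
    links: Set[Link] = set()
    for r1, props1 in d1.items():
        for r2, props2 in d2.items():
            if all(props1.get(p, set()) & props2.get(q, set())
                   for p, q in linkkey):
                links.add((r1, r2))
    return links


def coverage(links: Set[Link], d1: Dataset, d2: Dataset) -> float:
    """Assumed reading: share of all resources that occur in some link."""
    total = len(d1) + len(d2)
    covered = len({a for a, _ in links}) + len({b for _, b in links})
    return covered / total if total else 0.0


def discrimination(links: Set[Link]) -> float:
    """Assumed reading: 1.0 when the links form a one-to-one mapping."""
    if not links:
        return 0.0
    lefts = {a for a, _ in links}
    rights = {b for _, b in links}
    return min(len(lefts), len(rights)) / len(links)


# Toy data, invented for illustration only.
d1 = {
    "a1": {"name": {"Grenoble"}, "zip": {"38000"}},
    "a2": {"name": {"Lyon"}, "zip": {"69000"}},
}
d2 = {
    "b1": {"label": {"Grenoble"}, "postcode": {"38000"}},
    "b2": {"label": {"Paris"}, "postcode": {"75000"}},
}
key = frozenset({("name", "label"), ("zip", "postcode")})
links = generated_links(d1, d2, key)
print(links, coverage(links, d1, d2), discrimination(links))
```

On this toy data the only generated link is between `a1` and `b1`, so half of the resources are covered and the link set is one-to-one. The pairwise loop is quadratic in the number of resources; it is kept for readability and says nothing about how the candidate linkkeys are generated in the paper.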
Document type: Conference paper

Contributor: Jérôme Euzenat
Submitted on: Tuesday, July 21, 2015 - 5:51:36 PM
Last modification on: Tuesday, January 28, 2020 - 4:46:02 PM
Long-term archiving on: Thursday, October 22, 2015 - 11:10:54 AM


Manuel Atencia, Jérôme David, Jérôme Euzenat. Data interlinking through robust linkkey extraction. 21st European Conference on Artificial Intelligence (ECAI), Aug 2014, Praha, Czech Republic. pp. 15-20, ⟨10.3233/978-1-61499-419-0-15⟩. ⟨hal-01179166⟩


