Detecting Erroneous Identity Links on the Web using Network Metrics

Abstract: In the absence of a central naming authority on the Semantic Web, it is common for different datasets to refer to the same thing by different IRIs. Whenever multiple names denote the same thing, owl:sameAs statements are needed to link the data and foster reuse. Studies dating back as far as 2009 have observed that the owl:sameAs property is sometimes used incorrectly. In this paper, we show how network metrics, such as the community structure of the owl:sameAs graph, can be used to detect such possibly erroneous statements. One benefit of the approach presented here is that it can be applied to the network of owl:sameAs links itself and does not rely on any additional knowledge. To illustrate its ability to scale, the approach is evaluated on the largest collection of identity links to date, containing over 558M owl:sameAs links scraped from the LOD Cloud.
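To make the idea of judging links from the link network alone more concrete, here is a minimal Python sketch. It is not the method of the paper: instead of community detection, it uses a much simpler stand-in, flagging an owl:sameAs link when its two endpoints are each well connected but share no common neighbour. The function names, the `min_degree` threshold, and the `ex:` IRIs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(links):
    """Treat each owl:sameAs statement as an undirected edge."""
    neighbours = defaultdict(set)
    for subject, obj in links:
        if subject != obj:  # ignore reflexive statements
            neighbours[subject].add(obj)
            neighbours[obj].add(subject)
    return neighbours


def suspicious_links(links, min_degree=3):
    """Return links whose endpoints are well connected yet share no neighbour.

    Simplified proxy (an assumption of this sketch): inside a dense group of
    equivalent IRIs, two linked terms usually have neighbours in common,
    while a link joining two separate groups does not.
    """
    neighbours = build_graph(links)
    seen = set()
    flagged = []
    for subject, obj in links:
        edge = frozenset((subject, obj))
        if subject == obj or edge in seen:
            continue
        seen.add(edge)
        shared = neighbours[subject] & neighbours[obj]
        well_connected = (
            len(neighbours[subject]) >= min_degree
            and len(neighbours[obj]) >= min_degree
        )
        if well_connected and not shared:
            flagged.append(tuple(sorted(edge)))
    return sorted(flagged)


# Two densely linked groups of hypothetical IRIs, joined by a single link.
group_a = ["ex:paris_a", "ex:paris_b", "ex:paris_c"]
group_b = ["ex:hilton_a", "ex:hilton_b", "ex:hilton_c"]
links = list(combinations(group_a, 2)) + list(combinations(group_b, 2))
links.append(("ex:paris_a", "ex:hilton_a"))

print(suspicious_links(links))
```

In this toy input only the single link joining the two groups is flagged, and nothing beyond the links themselves is consulted. A community-based approach such as the one the abstract describes would replace the shared-neighbour test with a measure derived from detected communities.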

Cited literature: 24 references

https://hal.archives-ouvertes.fr/hal-01899407
Contributor: Joe Raad
Submitted on: Friday, October 19, 2018 - 2:32:41 PM
Last modification on: Thursday, February 20, 2020 - 7:22:39 PM
Long-term archiving on: Sunday, January 20, 2019 - 3:08:38 PM

File

ISWC 2018.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01899407, version 1

Citation

Joe Raad, Wouter Beek, Frank van Harmelen, Nathalie Pernelle, Fatiha Saïs. Detecting Erroneous Identity Links on the Web using Network Metrics. The Semantic Web – ISWC 2018. ISWC 2018. Lecture Notes in Computer Science, vol 11136. Springer, Cham, 2018. ⟨hal-01899407⟩
