
Fuzzy Semantic Matching in (Semi-)Structured XML Documents : Indexation of Noisy Documents

Arnaud Renard (1), Sylvie Calabretto (1), Béatrice Rumpler (1)
1 DRIM - Distribution, Recherche d'Information et Mobilité
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : Semantics is currently one of the greatest challenges in the evolution of IR systems, including the (semi-)structured IR systems considered here. Addressing this challenge usually requires an additional external semantic resource related to the document collection. Comparing concepts and, more broadly, working with semantic resources requires semantic similarity measures. These measures assume that the concepts related to the terms have been identified without ambiguity, so misspelled terms interfere with the term-to-concept matching process. Existing semantic-aware (semi-)structured IR systems rely on basic concept identification and do not account for uncertainty in term spelling. We choose to address this last aspect and suggest a way to detect and correct misspelled terms through a fuzzy semantic weighting formula that can be integrated into an IR system. To evaluate the expected gains, we have developed a prototype whose first results on small datasets seem interesting.
Document type :
Conference papers
Contributor : Équipe gestionnaire des publications SI LIRIS
Submitted on : Friday, October 14, 2016 - 2:45:48 PM
Last modification on : Thursday, February 10, 2022 - 9:18:03 AM


  • HAL Id : hal-01381449, version 1


Arnaud Renard, Sylvie Calabretto, Béatrice Rumpler. Fuzzy Semantic Matching in (Semi-)Structured XML Documents : Indexation of Noisy Documents. 6th International Conference on Web Information Systems and Technologies (WEBIST 2010), Apr 2010, Valencia, Spain. pp.253-260. ⟨hal-01381449⟩


