Conference paper, Year: 2007

SIRIUS XML IR System at INEX 2006: Approximate Matching of Structure and Textual Content

Abstract

In this paper we report on the retrieval approach taken by the VALORIA laboratory of the University of South-Brittany in the INEX 2006 ad-hoc track with the SIRIUS XML IR system. SIRIUS retrieves relevant XML elements by approximately matching both the content and the structure of the XML documents. A weighted edit distance on XML paths is used to approximately match the document structure, while the IDF of the query terms is used to rank the textual content of the retrieved elements. We briefly describe the approach and the extensions made to the SIRIUS XML IR system to address each of the four subtasks of the INEX 2006 ad-hoc track. Finally, we present and analyze the SIRIUS retrieval evaluation results. SIRIUS runs ranked first out of 77 submitted runs for the Best In Context task and obtained several top-ten results for both the Focused and All In Context tasks.
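
As an illustration of the two ingredients named in the abstract, the following is a minimal sketch of a weighted edit distance between XML paths and an IDF weight for query terms. The abstract does not give the actual SIRIUS cost scheme or scoring formula, so the function names, the default costs, the whitespace tokenization, and the `1 / (1 + distance)` similarity mapping are all assumptions made for this example.

```python
import math


def path_edit_distance(query_path, doc_path, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Weighted Levenshtein distance between two XML paths given as lists of tags.

    The three costs are illustrative assumptions, not the weights used by SIRIUS.
    """
    n, m = len(query_path), len(doc_path)
    # dist[i][j] = cheapest way to turn query_path[:i] into doc_path[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same_tag = query_path[i - 1] == doc_path[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,  # drop a tag from the query path
                dist[i][j - 1] + ins_cost,  # accept an extra tag in the document path
                dist[i - 1][j - 1] + (0.0 if same_tag else sub_cost),
            )
    return dist[n][m]


def structure_similarity(query_path, doc_path, **costs):
    """Map a distance to a similarity in (0, 1]; this mapping is an assumption."""
    return 1.0 / (1.0 + path_edit_distance(query_path, doc_path, **costs))


def idf(term, element_texts):
    """Inverse document frequency of a term over a collection of element texts."""
    df = sum(1 for text in element_texts if term in text.lower().split())
    return math.log(len(element_texts) / df) if df else 0.0


# A query path that omits the intermediate <body> tag still matches closely
# when insertions are cheap.
query = ["article", "section", "p"]
document = ["article", "body", "section", "p"]
print(path_edit_distance(query, document, ins_cost=0.5))  # 0.5
print(structure_similarity(query, document, ins_cost=0.5))
```

Making insertions cheaper than deletions or substitutions, as in the usage lines above, is one way to tolerate document paths that contain tags the query did not mention; whether SIRIUS weights its operations this way is not stated in the abstract.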
Main file
SIRIUSatINEX06_final.pdf (2.41 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-00493614, version 1 (08-07-2010)

Cite

Eugen Popovici, Gildas Ménier, Pierre-François Marteau. SIRIUS XML IR System at INEX 2006: Approximate Matching of Structure and Textual Content. INEX: Initiative for the Evaluation of XML Retrieval (INEX 2006), Dec 2006, Dagstuhl, Germany. pp.185-199, ⟨10.1007/978-3-540-73888-6⟩. ⟨hal-00493614⟩
48 views
91 downloads