Conference paper, Year: 2017

Search and Aggregation in XML Documents

Abdelmalek Habi
Brice Effantin
Hamamache Kheddouci

Abstract

Information retrieval is shifting from the traditional paradigm (returning a ranked list of results) to the aggregated search paradigm (grouping the most comprehensive and relevant answers into a single aggregated document). The Extensible Markup Language (XML) is now an important standard for information exchange and representation. Documents and queries are usually processed through their tree representations, which allows XML document retrieval to be treated as a tree matching problem between the document trees and the query tree. Several paradigms for retrieving XML documents have been proposed in the literature, but only a few of them try to aggregate a set of XML documents in order to provide more meaningful answers to a given query. In this paper, we propose and evaluate an aggregated search method that aims to obtain the most accurate and richest answers in XML fragment search. Our search method is based on the Top-k Approximate Subtree Matching (TASM) algorithm, and a new similarity function is proposed to improve the returned fragments. An aggregation process is then presented to generate a single aggregated response containing the most relevant, exhaustive and non-redundant information given by the fragments. The method is evaluated on two real-world datasets. Experiments show that it produces good results in terms of relevance and quality.
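
The two stages named in the abstract (ranking document fragments against a query tree, then merging the best fragments into one response) can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not give the TASM-based scoring or the proposed similarity function, so the sketch substitutes a simple Jaccard overlap of tags and text values for tree edit distance, and an exact-duplicate filter for the aggregation process. The helper names `label_bag`, `similarity`, `top_k_fragments` and `aggregate`, and the sample documents, are hypothetical.

```python
import heapq
import xml.etree.ElementTree as ET


def label_bag(node):
    """Collect the tags and non-empty text values found in a subtree."""
    labels = set()
    for elem in node.iter():
        labels.add(elem.tag)
        if elem.text and elem.text.strip():
            labels.add(elem.text.strip())
    return labels


def similarity(query, fragment):
    """Jaccard overlap of label sets (assumed stand-in, not tree edit distance)."""
    q, f = label_bag(query), label_bag(fragment)
    return len(q & f) / len(q | f)


def top_k_fragments(query, documents, k):
    """Score every subtree of every document and keep the k best."""
    scored = []
    for doc in documents:
        for fragment in doc.iter():
            scored.append((similarity(query, fragment), fragment))
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


def aggregate(fragments):
    """Merge the children of the fragments, skipping exact duplicates."""
    result = ET.Element("aggregate")
    seen = set()
    for fragment in fragments:
        for child in fragment:
            key = (child.tag, (child.text or "").strip())
            if key not in seen:
                seen.add(key)
                result.append(child)
    return result


# Hypothetical data: the query asks for a title, a year and an author,
# but each document holds only part of that information.
query = ET.fromstring(
    "<book><title>XML</title><year>2017</year><author>Habi</author></book>"
)
docs = [
    ET.fromstring("<book><title>XML</title><year>2017</year></book>"),
    ET.fromstring("<book><title>XML</title><author>Habi</author></book>"),
]
best = top_k_fragments(query, docs, k=2)
merged = aggregate(fragment for _, fragment in best)
print(ET.tostring(merged, encoding="unicode"))
```

In this toy case neither document answers the query alone, while the merged element carries the title, year and author once each, which is the intuition behind returning one aggregated response instead of a ranked list.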
File not deposited

Dates and versions

hal-01590605, version 1 (19-09-2017)

Identifiers

  • HAL Id: hal-01590605, version 1

Cite

Abdelmalek Habi, Brice Effantin, Hamamache Kheddouci. Search and Aggregation in XML Documents. 28th International Conference on Database and Expert Systems Applications, Aug 2017, Lyon, France. pp. 290-304. ⟨hal-01590605⟩