Interprétation vague des contraintes structurelles pour la RI dans des corpus de documents XML ; Evaluation d'une méthode approchée de RI structurée
Résumé
We propose specific data structures designed to the indexing and retrieval of information elements in heterogeneous XML data bases. The indexing scheme is well suited to the management of various contextual searches, expressed either at a structural level or at an information content level. The approximate search mechanisms are based on a modified Levenshtein editing distance and information fusion heuristics. The implementation described highlights the mixing of structured information presented as field/value instances and free text elements. The retrieval performances of the proposed approach are evaluated within the INEX 2005 evaluation campaign. The evaluation results rank the proposed approach among the best evaluated XML IR systems for the VVCAS task.
Origine : Fichiers éditeurs autorisés sur une archive ouverte