Combining Vector Space Model and Multi Word Term Extraction for Semantic Query Refinement.

Abstract: In this paper, we target document ranking in a highly technical field with the aim of approximating a ranking that is obtained through an existing ontology (knowledge structure). We test and combine symbolic and vector space models (VSM). Our symbolic approach relies on shallow NLP and on internal linguistic relations between Multi-Word Terms (MWTs). Documents are ranked based on the different semantic relations they share with the query terms, either directly or indirectly after clustering the MWTs using the identified lexico-semantic relations. The VSM approach consists of ranking documents with different functions, ranging from the classical tf.idf to more elaborate similarity functions. Results show that the ranking obtained by the symbolic approach performs better on most queries than the vector space model. However, the ranking obtained by combining both approaches outperforms, by a wide margin, the results obtained by either approach alone.
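
The classical tf.idf ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the idf variant log(N/df), the cosine similarity, the function names and the three toy documents are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf.idf vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumed idf variant: log(N / df).
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices sorted by decreasing similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection get a weight of 0.
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [cosine(q_vec, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy collection invented for this example.
docs = [
    "gene expression in breast cancer",
    "protein folding simulation",
    "breast cancer gene mutation",
]
ranking = rank("breast cancer gene", docs)
```

Here `ranking` lists the document indices from most to least similar to the query. A ranking of this kind treats terms as independent tokens, which is the baseline the abstract contrasts with the symbolic approach based on relations between MWTs.
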
Document type:
Conference paper
Zoubida Kedad et al. 12th International Conference on Applications of Natural Language to Information Systems (NLDB 2007), Jun 2007, Paris, France. Springer, 4592/2007, pp. 252-263, 2007, Lecture Notes in Computer Science. <10.1007/978-3-540-73351-5>


https://hal.archives-ouvertes.fr/hal-00636105
Contributor: Fidelia Ibekwe-Sanjuan
Submitted on: Wednesday, 2 November 2011 - 19:32:25
Last modified on: Wednesday, 23 March 2016 - 09:48:34
Document(s) archived on: Friday, 3 February 2012 - 02:21:08

File

NLDB-07-last-version.pdf
Files produced by the author(s)

Citation

Eric Sanjuan, Fidelia Ibekwe-Sanjuan, Torres-Moreno Juan-Manuel, Patricia Velazquez-Morales. Combining Vector Space Model and Multi Word Term Extraction for Semantic Query Refinement. Zoubida Kedad et al. 12th International Conference on Applications of Natural Language to Information Systems (NLDB 2007), Jun 2007, Paris, France. Springer, 4592/2007, pp. 252-263, 2007, Lecture Notes in Computer Science. <10.1007/978-3-540-73351-5>. <hal-00636105>
