Query Log Driven Web Search Results Clustering

Jose G Moreno 1 Gaël Dias 1 Guillaume Cleuziou 2
1 Equipe Hultech - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract: Studies in Web search results clustering have recently shown increasing performance, motivated by the use of external resources. Following this trend, we present a new algorithm called Dual C-Means, which provides a theoretical background for clustering in different representation spaces. Its originality lies in the fact that external resources can drive the clustering process as well as the labeling task in a single step. To validate our hypotheses, a series of experiments is conducted over different standard datasets and in particular over a new dataset built from the TREC Web Track 2012 to take query log information into account. The comprehensive empirical evaluation of the proposed approach demonstrates its significant advantages over traditional clustering and labeling techniques.
Document type:
Conference paper
37th Annual ACM SIGIR Conference (SIGIR 2014), Jul 2014, Gold Coast, Australia. 10p., 2014

https://hal.archives-ouvertes.fr/hal-01070306
Contributor: Greyc Référent
Submitted on: Wednesday, 1 October 2014 - 09:34:04
Last modified on: Thursday, 7 February 2019 - 16:44:40
Document(s) archived on: Friday, 2 January 2015 - 10:35:11

File

ACTI-MORENO-2014-2.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01070306, version 1

Citation

Jose G Moreno, Gaël Dias, Guillaume Cleuziou. Query Log Driven Web Search Results Clustering. 37th Annual ACM SIGIR Conference (SIGIR 2014), Jul 2014, Gold Coast, Australia. 10p., 2014. 〈hal-01070306〉

Metrics

Record views: 146

File downloads: 331