Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function

Charles-Edmond Bichot 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: This paper follows a word-document co-clustering model introduced independently in 2001 by several authors, including I.S. Dhillon, H. Zha and C. Ding. The model builds a bipartite graph from word frequencies in documents, whose vertices are both the documents and the words. This bipartite graph is then partitioned so as to minimize the normalized cut objective function, which yields the document clustering. The fusion-fission graph partitioning metaheuristic is applied to several document collections using this word-document co-clustering model. The results reveal a real problem with the model: the partitions found almost always have a lower normalized cut value than the original clustering of the document collection. Moreover, measures of solution quality appear to be largely independent of the normalized cut values of the partitions.
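To make the objective concrete, here is a minimal sketch of how the normalized cut of a co-clustering could be computed on a small word-document graph. It assumes the usual definition (for each part, the weight of edges leaving the part divided by the sum of the degrees of its vertices, summed over all parts), which the abstract names but does not spell out. The function name `normalized_cut`, the matrix `freq` and the label lists are hypothetical and are not taken from the paper.

```python
def normalized_cut(freq, word_labels, doc_labels):
    """Normalized cut of a co-clustering of a bipartite word-document graph.

    freq[w][d] is the weight (word frequency) of the edge between word w
    and document d. word_labels[w] and doc_labels[d] give the part each
    vertex is assigned to. All names here are illustrative assumptions.
    """
    parts = set(word_labels) | set(doc_labels)
    cut = dict.fromkeys(parts, 0.0)     # weight of edges leaving each part
    volume = dict.fromkeys(parts, 0.0)  # sum of vertex degrees in each part
    for w, row in enumerate(freq):
        for d, weight in enumerate(row):
            pw, pd = word_labels[w], doc_labels[d]
            volume[pw] += weight  # the edge adds to the degree of word w
            volume[pd] += weight  # and to the degree of document d
            if pw != pd:          # the edge crosses between two parts
                cut[pw] += weight
                cut[pd] += weight
    # Parts with zero volume (no incident edges) are skipped.
    return sum(cut[p] / volume[p] for p in parts if volume[p] > 0)


# Two words, two documents; one edge of weight 1 crosses the two parts.
freq = [[2, 1],
        [0, 3]]
value = normalized_cut(freq, word_labels=[0, 1], doc_labels=[0, 1])
print(round(value, 4))  # 1/5 + 1/7, about 0.3429
```

A partition with no crossing edges gives a value of 0. Comparing the value obtained for a computed partition with the value obtained for the reference clustering of a collection is the kind of comparison the abstract reports on.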
Document type:
Journal article
Journal of Mathematical Modelling and Algorithms, Springer Verlag, 2010, 2, 9, pp.131-147. 〈10.1007/s10852-010-9126-0〉

Cited literature: 31 references

https://hal.archives-ouvertes.fr/hal-01381475
Contributor: Équipe Gestionnaire Des Publications Si Liris
Submitted on: Monday, March 6, 2017 - 16:38:47
Last modified on: Friday, November 10, 2017 - 01:20:17
Document(s) archived on: Wednesday, June 7, 2017 - 15:32:00

File

Liris-4669.pdf
Files produced by the author(s)

Citation

Charles-Edmond Bichot. Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function. Journal of Mathematical Modelling and Algorithms, Springer Verlag, 2010, 2, 9, pp.131-147. 〈10.1007/s10852-010-9126-0〉. 〈hal-01381475〉
