HAL: hal-00492406, version 1

Simultaneous model-based clustering and visualization in the Fisher discriminative subspace
Charles Bouveyron 1, Camille Brunet 2
(2010-06-15)

Clustering in high-dimensional spaces is a recurrent problem in many scientific domains, but it remains difficult in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model, which models the data in a latent orthonormal discriminative subspace whose intrinsic dimension is lower than the dimension of the original space. Constraining the model parameters within and between groups yields a family of 8 parsimonious DLM models that can fit a variety of situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets show that the proposed approach outperforms existing clustering methods and provides a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.
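
To make the idea of alternating between subspace estimation and mixture estimation concrete, here is a minimal sketch in Python. It is not the authors' Fisher-EM algorithm and does not implement any of the 8 DLM models: it assumes a subspace dimension d = K - 1, full covariance matrices fitted only inside the subspace, a farthest-point initialization, and a plain Fisher-criterion eigenproblem followed by a QR orthonormalization. The function names `fisher_subspace` and `fisher_em_sketch` and the `ridge` parameter are hypothetical.

```python
import numpy as np


def fisher_subspace(X, T, d, ridge=1e-6):
    """Orthonormal basis (p x d) maximizing a soft Fisher criterion.

    X: data (n x p); T: soft cluster memberships (n x K).
    """
    n, p = X.shape
    nk = T.sum(axis=0) + 1e-12
    means = (T.T @ X) / nk[:, None]
    centered = means - X.mean(axis=0)
    # Soft between-cluster and within-cluster scatter matrices.
    S_B = (centered.T * nk) @ centered / n
    S_W = np.zeros((p, p))
    for k in range(T.shape[1]):
        R = X - means[k]
        S_W += (R.T * T[:, k]) @ R / n
    # Symmetric form of the eigenproblem S_W^{-1} S_B v = lambda v.
    vals, vecs = np.linalg.eigh(S_W + ridge * np.eye(p))
    W_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.T
    _, V = np.linalg.eigh(W_inv_sqrt @ S_B @ W_inv_sqrt)
    directions = W_inv_sqrt @ V[:, -d:]  # top-d discriminant directions
    U, _ = np.linalg.qr(directions)      # orthonormalize the basis
    return U


def log_gauss(Y, mu, Sigma):
    """Log-density of N(mu, Sigma) evaluated at each row of Y."""
    d = Y.shape[1]
    R = Y - mu
    _, logdet = np.linalg.slogdet(Sigma)
    maha = np.sum(R * np.linalg.solve(Sigma, R.T).T, axis=1)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def fisher_em_sketch(X, K, n_iter=20, ridge=1e-6):
    """Alternate subspace estimation and mixture updates (simplified)."""
    n, p = X.shape
    d = K - 1  # assumption: the classical Fisher bound on the dimension
    # Farthest-point initialization, then hard nearest-center memberships.
    centers = [X[0]]
    for _ in range(K - 1):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    dist = np.stack([np.sum((X - c) ** 2, axis=1) for c in centers], axis=1)
    T = np.eye(K)[np.argmin(dist, axis=1)]
    for _ in range(n_iter):
        # Subspace step: discriminative basis given current memberships.
        U = fisher_subspace(X, T, d, ridge)
        Y = X @ U
        nk = T.sum(axis=0) + 1e-12
        log_post = np.empty((n, K))
        for k in range(K):
            # Mixture parameters of cluster k inside the subspace.
            mu = T[:, k] @ Y / nk[k]
            R = Y - mu
            Sigma = (R.T * T[:, k]) @ R / nk[k] + ridge * np.eye(d)
            log_post[:, k] = np.log(nk[k] / n) + log_gauss(Y, mu, Sigma)
        # Membership update: posterior probabilities via log-sum-exp.
        log_post -= log_post.max(axis=1, keepdims=True)
        T = np.exp(log_post)
        T /= T.sum(axis=1, keepdims=True)
    return U, T
```

Calling `fisher_em_sketch(X, K)` on an (n x p) array returns the basis `U` and the membership matrix `T`; projecting with `X @ U` gives a low-dimensional view of the clustered data, which is the visualization aspect the abstract refers to. The sketch ignores the variance outside the subspace, which the actual DLM model family accounts for.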
1:  Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) (SAMM)
Université Paris I - Panthéon-Sorbonne
2:  Informatique, Biologie Intégrative et Systèmes Complexes (IBISC)
CNRS : FRE3190 – Université d'Evry-Val d'Essonne
Mathematics/Statistics
Statistics/Statistics Theory
High-dimensional clustering – Model-based clustering – Discriminative subspace – Fisher criterion – Visualization – Parsimonious models.
Attached file: article_FisherEM.pdf (715.2 KB)