Clustering of Variables for Mixed Data

Jerome Saracco; Marie Chavent

Book chapter. Year: 2016

Jerome Saracco

Role: Author
PersonId: 12086
IdHAL: jerome-saracco
ORCID: 0000-0003-4198-4002
IdRef: 131355201

Quality control and dynamic reliability

Institut de Mathématiques de Bordeaux

Ecole Nationale Supérieure de Cognitique

Marie Chavent

Role: Author
PersonId: 14809
IdHAL: marie-chavent
ORCID: 0000-0001-6774-597X
IdRef: 081889356

Quality control and dynamic reliability

Institut de Mathématiques de Bordeaux

Abstract

This chapter presents clustering of variables, whose aim is to group together strongly related variables. The proposed approach works on a mixed data set, i.e. a data set that contains both numerical and categorical variables. Two algorithms for clustering variables are described: a hierarchical clustering and a k-means type clustering. A brief description of the PCAmix method (a principal component analysis for mixed data) is provided, since the synthetic variables summarizing the obtained clusters of variables are computed with this multivariate method. Finally, the R packages ClustOfVar and PCAmixdata are illustrated on real mixed data. The PCAmix (resp. ClustOfVar) approach is first used for dimension reduction (step 1) before standard clustering of the individuals (step 2).
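As a rough illustration of how a synthetic variable can summarize a cluster of mixed variables, the following minimal NumPy sketch computes a PCAmix-style first principal component for one block of numerical and categorical columns. It is not the code of ClustOfVar or PCAmixdata: the function names, the toy data, and the weighting (standardized numerical columns, centered indicator columns scaled by the inverse square root of the level proportion) are assumptions based on the usual description of PCA for mixed data. Under these assumptions, the first eigenvalue equals the sum of squared correlations (numerical variables) and correlation ratios (categorical variables) between the variables and the synthetic variable.

```python
import numpy as np


def pcamix_first_component(X_num, X_cat):
    """Hypothetical helper: first PCAmix-style component of a mixed block.

    X_num: (n, p1) float array; X_cat: (n, p2) array of category labels.
    Returns the synthetic variable f (length n) and the first eigenvalue.
    """
    n = X_num.shape[0]
    # Numerical columns: center and scale by the population standard deviation.
    blocks = [(X_num - X_num.mean(axis=0)) / X_num.std(axis=0)]
    for j in range(X_cat.shape[1]):
        levels = np.unique(X_cat[:, j])
        G = (X_cat[:, [j]] == levels).astype(float)  # n x m_j indicator matrix
        p = G.mean(axis=0)                           # level proportions
        # Center each indicator column and weight it by 1 / sqrt(proportion).
        blocks.append((G - p) / np.sqrt(p))
    W = np.hstack(blocks) / np.sqrt(n)               # uniform row weights 1/n
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    f = np.sqrt(n) * s[0] * U[:, 0]                  # variance of f is s[0]**2
    return f, s[0] ** 2


def squared_correlation(f, x):
    """Squared Pearson correlation between f and a numerical variable."""
    return np.corrcoef(f, x)[0, 1] ** 2


def correlation_ratio(f, labels):
    """Between-group share of the variance of f for a categorical variable."""
    between = sum(
        np.sum(labels == g) * (f[labels == g].mean() - f.mean()) ** 2
        for g in np.unique(labels)
    )
    return between / np.sum((f - f.mean()) ** 2)


# Toy mixed data (made up for illustration): two numerical variables and
# one categorical variable that are related to each other.
rng = np.random.default_rng(0)
n = 60
group = rng.choice(np.array(["a", "b", "c"]), size=n)
x1 = rng.normal(size=n) + (group == "a")
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=n)
X_num = np.column_stack([x1, x2])
X_cat = group.reshape(-1, 1)

f, eigenvalue = pcamix_first_component(X_num, X_cat)
link = (
    sum(squared_correlation(f, X_num[:, j]) for j in range(X_num.shape[1]))
    + sum(correlation_ratio(f, X_cat[:, j]) for j in range(X_cat.shape[1]))
)
print(f"first eigenvalue: {eigenvalue:.4f}, summed link to f: {link:.4f}")
```

In a two-step workflow such as the one mentioned above, a score vector like `f` (one per cluster of variables, or the leading PCAmix components) would serve as the reduced numerical input to a standard clustering of the individuals.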

Domains

Statistics [stat]

Contributor: Jerome Saracco

https://hal.science/hal-01417442

Submitted on: Thursday, December 15, 2016, 16:12:07

Last modified on: Thursday, April 4, 2024, 03:07:58

Dates and versions

hal-01417442, version 1 (15-12-2016)

Identifiers

HAL Id: hal-01417442, version 1

Cite

Jerome Saracco, Marie Chavent. Clustering of Variables for Mixed Data. Statistics for Astrophysics: Clustering and Classification, 77, EDP Sciences, pp.91-119, 2016, EAS Publications Series, 978-2-7598-9001-9. ⟨hal-01417442⟩

Collections

CNRS INRIA IMB INRIA2
