Conference paper, Year: 2012

Sparse principal component analysis for multiblock data and its extension to sparse multiple correspondence analysis

Abstract

Two new methods for selecting groups of variables have been developed for multiblock data: "Group Sparse Principal Component Analysis" (GSPCA) for continuous variables and "Sparse Multiple Correspondence Analysis" (SMCA) for categorical variables. GSPCA is a compromise between the sparse PCA method of Zou, Hastie and Tibshirani and the "group Lasso" method of Yuan and Lin. PCA is formulated as a regression-type optimization problem, and the group Lasso constraints are imposed on the regression coefficients to produce modified principal components with sparse loadings. This reduces the number of nonzero coefficients, i.e. the number of selected groups. SMCA is a straightforward extension of GSPCA to groups of indicator variables, with the chi-square metric. Two real examples are used to illustrate the methods. The first is a data set of 25 trace elements measured in three tissues of 48 crabs (25 blocks of 3 variables). The second is a data set on 502 women, aimed at identifying genes affecting skin aging, with more than 370,000 blocks, each block corresponding to SNPs (Single Nucleotide Polymorphisms) coded into 3 categories.
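The group-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple rank-one alternating scheme in which the group Lasso penalty acts through block-wise soft-thresholding, and the names `group_soft_threshold`, `group_sparse_loading`, `groups` and `lam` are hypothetical. The point it shows is that a penalty on the Euclidean norm of each block sets whole groups of loadings to zero at once.

```python
import numpy as np


def group_soft_threshold(z, lam):
    """Proximal operator of lam * ||z||_2: shrinks the whole block, or zeroes it."""
    norm = np.linalg.norm(z)
    if norm <= lam:
        return np.zeros_like(z)
    return (1.0 - lam / norm) * z


def group_sparse_loading(X, groups, lam, n_iter=100):
    """Illustrative rank-one loading vector with entire groups set to zero.

    Assumption: a plain alternating scheme, not the procedure of the paper.
    X      : (n, p) data matrix
    groups : list of index arrays, one per block of variables
    lam    : group penalty; larger values remove more blocks
    """
    X = X - X.mean(axis=0)  # center the columns
    # Start from the ordinary first principal direction.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[0]
    for _ in range(n_iter):
        xv = X @ v
        norm_xv = np.linalg.norm(xv)
        if norm_xv == 0:  # every group was removed
            break
        u = xv / norm_xv  # unit-norm component scores
        z = X.T @ u       # unpenalized loadings for this u
        v = np.zeros_like(z)
        for idx in groups:
            v[idx] = group_soft_threshold(z[idx], lam)
    return v
```

For a layout like the crab example (25 blocks of 3 variables), and assuming the three columns of each block are stored contiguously, the blocks could be declared as `groups = [np.arange(3 * j, 3 * j + 3) for j in range(25)]`. The selected groups are then those whose block of `v` is nonzero.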
Main file
art_2625.pdf (493.72 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01126171, version 1 (24-03-2020)

Identifiers

  • HAL Id: hal-01126171, version 1

Cite

Anne Bernard, Christiane Guinot, Gilbert Saporta. Sparse principal component analysis for multiblock data and its extension to sparse multiple correspondence analysis. Compstat 2012, International Association for Statistical Computing, Aug 2012, Limassol, Cyprus. pp.99-106. ⟨hal-01126171⟩
184 views
194 downloads
