Archive ouverte HAL
Conference paper, 2013

From Sparse Regression to Sparse Multiple Correspondence Analysis

Abstract

High-dimensional data means that the number of variables p is far larger than the number of observations n. This talk starts from a survey of various solutions in linear regression. When p > n the OLS estimator does not exist. Since this is a case of forced multicollinearity, one may use regularized techniques such as ridge regression, principal component regression or PLS regression, which keep all the predictors. However, if p > n, combinations of all variables cannot be interpreted. Sparse solutions, i.e. with a large number of zero coefficients, are preferred. The lasso, the elastic net and sparse PLS perform regularization and variable selection simultaneously, thanks to non-quadratic penalties: L1, SCAD, etc. In PCA, the singular value decomposition shows that if we regress the principal components onto the input variables, the vector of regression coefficients is equal to the factor loadings. It therefore suffices to adapt sparse regression techniques to obtain sparse versions of PCA and of PCA with groups of variables. We conclude with a presentation of a sparse version of Multiple Correspondence Analysis and give several applications.
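The identity the abstract relies on (regressing a principal component onto the inputs recovers the loading vector) can be checked with a minimal numpy sketch; the synthetic data and variable names below are illustrative, not from the talk itself.

```python
import numpy as np

# Illustrative check: for centered X with full column rank, the OLS
# coefficients of a principal component regressed on X equal the
# corresponding PCA loading vector. Synthetic data for demonstration.
rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)  # center the columns

# PCA via SVD: X = U S V^T; scores Z = X V, loadings are the rows of V^T
U, S, Vt = np.linalg.svd(X, full_matrices=False)
scores = X @ Vt.T

# OLS regression of the first principal component onto X
beta, *_ = np.linalg.lstsq(X, scores[:, 0], rcond=None)

# beta recovers the first loading vector exactly (X has full column rank)
print(np.allclose(beta, Vt[0]))  # True
```

Replacing the least-squares fit with an L1-penalized regression (e.g. the lasso) zeroes out some coefficients, which is precisely how sparse loadings, and hence sparse PCA and sparse MCA, are obtained in the approach the abstract describes.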
Saporta_Luxembourg2.pdf (1.56 MB) Download the file
Origin: Files produced by the author(s)

Dates and versions

hal-01126275, version 1 (09-12-2020)

Identifiers

  • HAL Id: hal-01126275, version 1

Cite

Gilbert Saporta. From Sparse Regression to Sparse Multiple Correspondence Analysis. European Conference on Data Analysis, Jul 2013, Luxembourg, Luxembourg. pp.25. ⟨hal-01126275⟩