Journal article. Statistical Applications in Genetics and Molecular Biology. Year: 2008

A Sparse PLS for Variable Selection when Integrating Omics data

Abstract

Recent biotechnology advances allow multiple types of omics data, such as transcriptomic, proteomic or metabolomic data, to be collected and integrated. The problem of feature selection has been addressed several times in the context of classification, but it has to be handled in a specific manner when integrating data. In this study, we focus on the integration of two-block data sets that are measured on the same samples. Our goal is to combine integration and simultaneous variable selection on the two data sets in a one-step procedure, using a PLS variant, to facilitate the biologists' interpretation. A novel computational methodology called "sparse PLS" is introduced for predictive analysis to deal with these newly arisen problems. The sparsity of our approach is obtained by soft-thresholding penalization of the loading vectors during the singular value decomposition (SVD). Sparse PLS is shown to be effective and biologically meaningful. Comparisons with classical PLS are performed on simulated and real data sets, and a thorough biological interpretation of the results obtained on one data set is provided. We show that sparse PLS provides a valuable variable selection tool for high-dimensional data sets.
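
The soft-thresholding step mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it is a minimal example that assumes X and Y are column-centered, extracts only the first pair of loading vectors, and alternates soft-thresholded updates on the cross-product matrix X^T Y, starting from its leading singular vectors. The function names `soft_threshold` and `sparse_pls_component` and the penalty parameters `lam_u` and `lam_v` are hypothetical, chosen for illustration only.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry of a toward zero by lam; entries with |a| <= lam become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def sparse_pls_component(X, Y, lam_u, lam_v, n_iter=100, tol=1e-8):
    """Illustrative first pair of sparse loading vectors (u for X, v for Y).

    Assumption: X (n x p) and Y (n x q) are already column-centered.
    """
    M = X.T @ Y  # p x q cross-product matrix
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    u, v = U[:, 0], Vt[0]  # start from the leading singular vectors

    for _ in range(n_iter):
        # Update the X loadings: threshold, then rescale to unit norm.
        u_new = soft_threshold(M @ v, lam_u)
        if not u_new.any():
            raise ValueError("lam_u is too large: every X loading was set to zero")
        u_new = u_new / np.linalg.norm(u_new)

        # Update the Y loadings in the same way.
        v_new = soft_threshold(M.T @ u_new, lam_v)
        if not v_new.any():
            raise ValueError("lam_v is too large: every Y loading was set to zero")
        v_new = v_new / np.linalg.norm(v_new)

        converged = (np.linalg.norm(u_new - u) < tol
                     and np.linalg.norm(v_new - v) < tol)
        u, v = u_new, v_new
        if converged:
            break
    return u, v
```

Variables whose loading is exactly zero after thresholding drop out of the component, which is how a penalty of this kind performs variable selection on both blocks at once. Larger values of `lam_u` and `lam_v` select fewer variables.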

Main file
modifs.pdf (540.92 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00300204 , version 1 (17-07-2008)
hal-00300204 , version 2 (23-09-2008)

Cite

Kim-Anh Lê Cao, Debra Rossow, Christèle Robert-Granié, Philippe Besse. A Sparse PLS for Variable Selection when Integrating Omics data. Statistical Applications in Genetics and Molecular Biology, 2008, 7 (1), pp.35. ⟨10.2202/1544-6115.1390⟩. ⟨hal-00300204v2⟩