Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data

Caroline Bazzoli 1 Sophie Lambert-Lacroix 2
1 SVH - Statistique pour le Vivant et l’Homme
LJK - Laboratoire Jean Kuntzmann
2 TIMC-IMAG-BCM - Biologie Computationnelle et Mathématique
TIMC-IMAG - Techniques de l'Ingénierie Médicale et de la Complexité - Informatique, Mathématiques et Applications, Grenoble - UMR 5525
Abstract: Prediction from high-dimensional genomic data is an active field in today's medical research. Most proposed prediction methods use genomic data alone, without considering established clinical data that are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. In this paper we consider classification methods that use both types of variables simultaneously but apply dimension reduction only to the high-dimensional genomic ones. A usual way to handle this is a two-step approach. In step one, a dimensionality reduction technique is applied to the genomic dataset alone. In step two, the selected genomic variables are merged with the clinical variables to build a classification model on the combined dataset. However, the dimension reduction is then built without taking into account the link between the response variable and the clinical data. To address this issue, using Partial Least Squares (PLS) as the reduction technique, we propose a one-step approach based on three extensions of the LS-PLS (LS for Least Squares) method to the logistic regression context. We perform a simulation study to evaluate these approaches against methods using only the clinical data or only the genomic data. We then illustrate their performance by classifying two real data sets containing both clinical information and gene expression.

https://hal.archives-ouvertes.fr/hal-01405101
Contributor: Caroline Bazzoli
Submitted on: Monday, October 15, 2018 - 4:26:19 PM
Last modification on: Tuesday, September 24, 2019 - 4:22:05 PM
Long-term archiving on: Wednesday, January 16, 2019 - 3:45:07 PM

File

Bazzoli_et_al-2018-BMC_Bioinfo...
Publisher files allowed on an open archive

Citation

Caroline Bazzoli, Sophie Lambert-Lacroix. Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data. BMC Bioinformatics, BioMed Central, 2018, 19 (1), ⟨10.1186/s12859-018-2311-2⟩. ⟨hal-01405101v3⟩
