Conference papers

Discussion on Importance of Variable Selection in PLS1 Modeling

Jie Wang 1 Huiwen Wang 1 Gilbert Saporta 2
2 CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract : Multicollinearity among the independent variables is harmful to OLS regression. PLS regression (PLSR), introduced by Wold, was an important breakthrough for modeling under multicollinearity. PLSR makes modeling possible when the independent variables are multicollinear or when the sample size is smaller than the number of independent variables, and it allows all independent variables to enter the regression model. As a result, some researchers believe that variable selection and multicollinearity can be neglected when using PLS regression, and in some practical cases tens or even hundreds of variables are included in the model. This paper shows that multicollinearity among the independent variables in PLS1 can noticeably affect both the extraction of components and the regression parameters. It is therefore necessary to select independent variables carefully before building PLS1 models; otherwise, the regression model can still produce results that cannot be interpreted.
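The effect described in the abstract can be illustrated with a small synthetic sketch. This is not the authors' experiment: the data, the noise levels, and the helper names `standardize` and `pls1_first_component` are assumptions made for illustration only. The sketch computes the first PLS1 component on standardized predictors (weight vector proportional to X^T y) and compares a two-variable model with one in which two near-copies of the first variable have been added.

```python
import numpy as np

def standardize(a):
    # Center each column and scale it to unit variance.
    return (a - a.mean(axis=0)) / a.std(axis=0)

def pls1_first_component(X, y):
    # First PLS1 component on standardized X and centered y.
    Xs = standardize(X)
    yc = y - y.mean()
    w = Xs.T @ yc                 # weights proportional to cov(x_j, y)
    w = w / np.linalg.norm(w)
    t = Xs @ w                    # component scores
    q = (t @ yc) / (t @ t)        # regression of y on the component
    return w, t, w * q            # w * q: one-component coefficients on Xs

# Synthetic data (assumed): y depends equally on two independent variables.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = x1 + x2 + 0.1 * rng.normal(size=n)

X_small = np.column_stack([x1, x2])
# Add two near-duplicates of x1, creating strong multicollinearity.
copies = [x1 + 0.05 * rng.normal(size=n) for _ in range(2)]
X_big = np.column_stack([x1, *copies, x2])

w_s, t_s, b_s = pls1_first_component(X_small, y)
w_b, t_b, b_b = pls1_first_component(X_big, y)

# How much of x2 is still reflected in the first component?
corr_small = np.corrcoef(t_s, x2)[0, 1]
corr_big = np.corrcoef(t_b, x2)[0, 1]
print("corr(t1, x2) without copies:", round(corr_small, 3))
print("corr(t1, x2) with copies:   ", round(corr_big, 3))
print("coefficient on x2 without copies:", round(b_s[-1], 3))
print("coefficient on x2 with copies:   ", round(b_b[-1], 3))
```

In this setup the redundant copies of x1 pull the first component toward x1, so the component carries less of x2 and the one-component coefficient on x2 shrinks, even though the relationship between x2 and y has not changed.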


https://hal.archives-ouvertes.fr/hal-01125387
Contributor : Laboratoire Cedric
Submitted on : Wednesday, March 25, 2020 - 6:47:45 PM
Last modification on : Monday, March 30, 2020 - 10:50:09 AM

File

RC1312.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01125387, version 1

Citation

Jie Wang, Huiwen Wang, Gilbert Saporta. Discussion on Importance of Variable Selection in PLS1 Modeling. PLS'07 5th Int. Symp. on PLS and related methods, 2007, Oslo, Norway. ⟨hal-01125387⟩
