Conference papers

Non linear robust regression in high dimension

Emeline Perthame 1 Florence Forbes 1 Brice Olivier 1 Antoine Deleforge 2
1 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
2 PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
Abstract : The analysis of spectral data from proteomics studies in biology or infrared measurements in chemometrics often involves non-linear relationships between response variables and covariates. In this framework, non-linear regression is used to model such complex relations between the target and a possibly large number of features. As in linear regression, observations are often assumed to be normally distributed. Under this assumption, however, outliers are known to affect the stability of the results and can lead to misleading predictions. Robust approaches that remain tractable in high dimension are therefore needed to improve the accuracy of linear or non-linear regression methods in the presence of outliers. In the proposed method, non-linearity is handled via a mixture of regressions. Mixture models, and paradoxically also the so-called mixture of regressions models, are mostly used for clustering, and few articles use them for actual regression and prediction. Interestingly, it was shown in (Deleforge et al., 2015 [1]) that a prediction approach based on a mixture of regressions in a Gaussian setting was relevant. However, the method developed by these authors is not designed to perform robust regression. We therefore build on the work in [1] by considering mixtures of Student distributions, which are able to handle outliers. As in [1], we propose to handle high-dimensional data by using an inverse regression trick. In the Student mixture context, however, a joint modelling approach on both responses and regressors is necessary to guarantee the tractability of the inverse regression of interest. Furthermore, in the proposed model, referred to as SLLiM, the observed response variables can be complemented by latent variables, which are able to capture dependence among covariates and improve prediction accuracy.
For both instances of the model, parameter estimation can be performed via an EM algorithm, which remains numerically feasible when the number of variables exceeds the number of observations. Extensive simulations, on both illustrative and more complex examples in high dimension, demonstrate that the proposed model performs well in this setting. The application of SLLiM to real datasets is also illustrated.
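The abstract's two central ideas, Student distributions for robustness and EM for estimation, can be illustrated on a much simpler problem than SLLiM itself. The sketch below is not the authors' algorithm: it fits the location and scale of a single univariate Student distribution by EM, with the degrees of freedom `nu` held fixed. The function name `student_t_em`, the toy data and the choice `nu=3` are assumptions made for illustration only.

```python
def student_t_em(xs, nu=3.0, n_iter=500, tol=1e-12):
    """EM for the location and scale of a univariate Student distribution.

    Illustrative sketch only: nu is held fixed, and this is a toy
    one-dimensional model, not the SLLiM mixture described in the abstract.
    """
    n = len(xs)
    # Start from the Gaussian estimates (sample mean and variance).
    mu = sum(xs) / n
    sigma2 = max(sum((x - mu) ** 2 for x in xs) / n, 1e-12)
    weights = [1.0] * n
    for _ in range(n_iter):
        # E-step: expected latent precision weight of each observation.
        # Points far from mu (relative to sigma2) receive small weights.
        weights = [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in xs]
        # M-step: weighted mean and weighted scale.
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        new_sigma2 = sum(w * (x - new_mu) ** 2 for w, x in zip(weights, xs)) / n
        new_sigma2 = max(new_sigma2, 1e-12)  # guard against a zero scale
        converged = abs(new_mu - mu) < tol and abs(new_sigma2 - sigma2) < tol
        mu, sigma2 = new_mu, new_sigma2
        if converged:
            break
    # weights come from the last E-step that was run.
    return mu, sigma2, weights


# Toy data: six values near 10 and one gross outlier (hypothetical numbers).
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 50.0]
gaussian_mean = sum(data) / len(data)
robust_mean, robust_scale2, w = student_t_em(data)
print("sample mean:", gaussian_mean)
print("Student EM location:", robust_mean)
print("weight of the outlier:", w[-1])
```

On this toy data the sample mean is pulled towards the outlier, while the Student location estimate stays close to the bulk of the points because the outlier receives a very small weight in the E-step. The same reweighting mechanism is what makes Student mixtures less sensitive to outliers than Gaussian mixtures.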

Contributor : Florence Forbes
Submitted on : Friday, December 30, 2016 - 4:51:24 PM
Last modification on : Wednesday, November 3, 2021 - 7:49:47 AM
Long-term archiving on : Monday, March 20, 2017 - 10:05:05 PM




  • HAL Id : hal-01423622, version 1


Emeline Perthame, Florence Forbes, Brice Olivier, Antoine Deleforge. Non linear robust regression in high dimension. The XXVIIIth International Biometric Conference, Jul 2016, Victoria, Canada. ⟨hal-01423622⟩


