Choosing a penalty for model selection in heteroscedastic regression

Sylvain Arlot 1, 2
2 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract : We consider the problem of choosing between several models in least-squares regression with heteroscedastic data. We prove that any penalization procedure is suboptimal when the penalty is a function of the dimension of the model, at least for some typical heteroscedastic model selection problems. In particular, Mallows' Cp is suboptimal in this framework. On the contrary, optimal model selection is possible with data-driven penalties such as resampling or $V$-fold penalties. Therefore, it is worth estimating the shape of the penalty from data, even at the price of a higher computational cost. Simulation experiments illustrate the existence of a trade-off between statistical accuracy and computational complexity. As a conclusion, we sketch some rules for choosing a penalty in least-squares regression, depending on what is known about possible variations of the noise-level.
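Illustration (not part of the paper): the abstract refers to penalties that are "a function of the dimension of the model", with Mallows' Cp as the main example. The minimal Python sketch below shows what such a procedure looks like. The penalty is taken to be 2 * sigma2 * D / n, the usual form of Mallows' Cp for a model of dimension D. Everything else is an assumption made for illustration only: the function names, the dyadic histogram models on [0, 1), the sine regression function, the two noise levels, and the plug-in value sigma2 (here the average noise variance, treated as known).

```python
import math
import random


def bin_index(x, n_bins):
    """Index of the equal-width bin of [0, 1) that contains x."""
    return min(int(x * n_bins), n_bins - 1)


def fit_regressogram(xs, ys, n_bins):
    """Least-squares piecewise-constant fit: the mean of y in each bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = bin_index(x, n_bins)
        sums[b] += y
        counts[b] += 1
    # Empty bins get 0.0; they contain no data, so the empirical risk ignores them.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def empirical_risk(xs, ys, means):
    """Average squared residual of the fitted histogram on the sample."""
    n_bins = len(means)
    return sum((y - means[bin_index(x, n_bins)]) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_by_cp(xs, ys, dims, sigma2):
    """Pick the dimension minimizing empirical risk + 2 * sigma2 * D / n.

    The penalty depends on the model only through its dimension D.
    """
    n = len(xs)
    crit = {}
    for d in dims:
        means = fit_regressogram(xs, ys, d)
        crit[d] = empirical_risk(xs, ys, means) + 2.0 * sigma2 * d / n
    best = min(crit, key=crit.get)
    return best, crit


def noise_sd(x):
    """Hypothetical heteroscedastic noise level: low on the left half, high on the right."""
    return 0.1 if x < 0.5 else 1.0


rng = random.Random(0)
n = 500
xs = [rng.random() for _ in range(n)]
ys = [math.sin(2 * math.pi * x) + noise_sd(x) * rng.gauss(0.0, 1.0) for x in xs]

dims = [1, 2, 4, 8, 16, 32]          # nested dyadic histogram models (assumption)
sigma2 = (0.1 ** 2 + 1.0 ** 2) / 2   # average noise variance, assumed known here
best, crit = select_by_cp(xs, ys, dims, sigma2)
print("selected dimension:", best)
```

In this sketch the penalty cannot tell a model that spends its bins in the low-noise region from one that spends them in the high-noise region, since both have the same D. This is the kind of limitation the abstract attributes to dimension-based penalties. The resampling and V-fold penalties studied in the paper are not implemented here.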
Document type :
Preprints, Working Papers, ...
2010


https://hal.archives-ouvertes.fr/hal-00347811
Contributor : Sylvain Arlot
Submitted on : Thursday, June 3, 2010 - 7:24:45 PM
Last modification on : Thursday, September 29, 2016 - 1:22:04 AM
Document(s) archived on : Thursday, September 23, 2010 - 12:56:52 PM

Files

shape.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00347811, version 2
  • ARXIV : 0812.3141

Citation

Sylvain Arlot. Choosing a penalty for model selection in heteroscedastic regression. 2010. <hal-00347811v2>
