# Optimal selection of localized-basis models in heteroscedastic regression

Abstract: The concept of a localized basis is a useful tool in linear approximation theory, as it unifies histograms, piecewise polynomials and wavelets. We consider selecting the order of a linear expansion in a localized basis for the nonparametric estimation of a regression function, with random design and heteroscedastic noise. We establish oracle inequalities that prove the asymptotic optimality of the so-called slope heuristics and of a $V$-fold penalization procedure. Furthermore, we show that the classical $V$-fold cross-validation procedure is asymptotically suboptimal, as it produces an estimate that converges to the oracle corresponding to a fraction $(V-1)/V$ of the initial sample. We conclude with a simulation study on wavelets. In particular, we exhibit a gap between the asymptotic theoretical results and practice at finite sample sizes: $V$-fold cross-validation and $V$-fold penalization give similar results in our experiments, even though the asymptotic superiority of the penalization scheme is proved.
Conference papers
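
The two $V$-fold criteria compared in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several choices are assumptions made only for illustration: the histogram basis on $[0, 1]$ with equal bins (the paper's simulations use wavelets), the helper names `fit_histogram`, `risk` and `vfold_criteria`, the toy data, and the penalty form, which follows the usual $V$-fold penalization recipe with constant $V-1$.

```python
import numpy as np

def fit_histogram(x, y, d):
    """Least-squares fit on a histogram basis with d equal bins on [0, 1]."""
    bins = np.minimum((x * d).astype(int), d - 1)
    sums = np.bincount(bins, weights=y, minlength=d)
    counts = np.bincount(bins, minlength=d)
    # Empty bins get the value 0 (an arbitrary convention for this sketch).
    return np.divide(sums, counts, out=np.zeros(d), where=counts > 0)

def risk(coef, x, y):
    """Empirical quadratic risk of a fitted histogram on the sample (x, y)."""
    d = len(coef)
    bins = np.minimum((x * d).astype(int), d - 1)
    return np.mean((y - coef[bins]) ** 2)

def vfold_criteria(x, y, dims, V=5, seed=0):
    """Return (V-fold CV criterion, V-fold penalized criterion) for each d in dims."""
    n = len(x)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), V)
    cv = np.zeros(len(dims))
    penalized = np.zeros(len(dims))
    for i, d in enumerate(dims):
        pen = 0.0
        for held_out in folds:
            train = np.setdiff1d(np.arange(n), held_out)
            coef = fit_histogram(x[train], y[train], d)
            # Cross-validation: risk on the held-out block.
            cv[i] += risk(coef, x[held_out], y[held_out]) / V
            # Penalization: risk on the full sample minus risk on the training part.
            pen += risk(coef, x, y) - risk(coef, x[train], y[train])
        full_fit = fit_histogram(x, y, d)
        # Assumed penalty constant: (V - 1) / V times the summed gaps.
        penalized[i] = risk(full_fit, x, y) + (V - 1) / V * pen
    return cv, penalized

# Toy heteroscedastic data (assumed for illustration only).
rng = np.random.default_rng(1)
x = rng.uniform(size=500)
y = np.sin(2 * np.pi * x) + (0.1 + 0.4 * x) * rng.normal(size=500)
dims = [1, 2, 4, 8, 16, 32, 64]
cv, penalized = vfold_criteria(x, y, dims)
print("d selected by V-fold CV:          ", dims[int(np.argmin(cv))])
print("d selected by V-fold penalization:", dims[int(np.argmin(penalized))])
```

On a single toy sample the two criteria often select the same or neighbouring dimensions, which is in line with the finite-sample observation reported in the abstract; the asymptotic gap concerns the limiting behaviour, not any one run.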

### Citation

Fabien Navarro, Adrien Saumard. Sélection optimale de modèles à base localisée en régression hétéroscédastique. 48èmes Journées de Statistique de la SFdS, May 2016, Montpellier, France. ⟨hal-01383777⟩

