| HAL : hal-00125455, version 2 |
| arXiv : math.ST/0701542 |
|
|
| Available versions: | v1 (19-01-2007) | v2 (22-01-2007) |
|
| Model selection by resampling penalization |
|
|
| Sylvain Arlot 1 |
|
|
| (19/01/2007) |
|
|
| We present a new family of model selection algorithms based on resampling heuristics. These algorithms can be used in several frameworks, do not require any knowledge of the unknown law of the data, and may be seen as a generalization of local Rademacher complexities and $V$-fold cross-validation. In the case of least-squares regression on histograms, we prove oracle inequalities and show that these algorithms are naturally adaptive to both the smoothness of the regression function and the variability of the noise level. Then, interpreting $V$-fold cross-validation in terms of penalization, we shed light on the question of choosing $V$. Finally, a simulation study illustrates the strength of resampling penalization algorithms against some classical ones, in particular with heteroscedastic data. |
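To make the setting of the abstract concrete, here is a minimal sketch of plain $V$-fold cross-validation used to choose the number of bins of a regular histogram (regressogram) estimator in least-squares regression. It is not the resampling penalization procedure of the paper, only the classical baseline that the abstract relates it to. All function names (`histogram_fit`, `vfold_risk`, `select_n_bins`, `make_heteroscedastic_data`), the restriction of the design to $[0, 1]$, the empty-bin convention, and the simulated data are assumptions made for illustration.

```python
import numpy as np


def histogram_fit(x_train, y_train, n_bins):
    """Least-squares fit on a regular histogram of [0, 1]: one mean per bin."""
    bins = np.minimum((x_train * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins, weights=y_train, minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins)
    # Assumption of this sketch: an empty bin falls back to the global mean.
    return np.where(counts > 0, sums / np.maximum(counts, 1), y_train.mean())


def histogram_predict(means, x):
    """Evaluate the piecewise-constant estimator at the points x in [0, 1]."""
    n_bins = len(means)
    bins = np.minimum((x * n_bins).astype(int), n_bins - 1)
    return means[bins]


def vfold_risk(x, y, n_bins, folds):
    """V-fold estimate of the quadratic prediction risk for a given bin count."""
    n = len(x)
    errors = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        means = histogram_fit(x[train_mask], y[train_mask], n_bins)
        pred = histogram_predict(means, x[test_idx])
        errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(errors))


def select_n_bins(x, y, candidates, V=5, seed=0):
    """Return the candidate bin count with the smallest V-fold risk estimate."""
    rng = np.random.default_rng(seed)
    # The same random partition into V folds is reused for every candidate.
    folds = np.array_split(rng.permutation(len(x)), V)
    risks = {D: vfold_risk(x, y, D, folds) for D in candidates}
    best = min(risks, key=risks.get)
    return best, risks


def make_heteroscedastic_data(n, seed=0):
    """Toy data: smooth signal with a noise level that grows with x."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)
    sigma = 0.1 + 0.5 * x
    y = np.sin(2 * np.pi * x) + sigma * rng.normal(size=n)
    return x, y


if __name__ == "__main__":
    x, y = make_heteroscedastic_data(500)
    best, risks = select_n_bins(x, y, candidates=[1, 2, 4, 8, 16, 32], V=5)
    print("selected number of bins:", best)
```

Changing `V` in `select_n_bins` changes both the size of each training sample and the cost of the procedure, which is the trade-off behind the question of choosing $V$ mentioned in the abstract.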
|
| 1 : | Laboratoire de Mathématiques d'Orsay (LM-Orsay) |
| CNRS : UMR8628 – Université Paris XI - Paris Sud | |
|
| Domain | : | Mathematics/Statistics; Statistics/Theory |
|
|
| resampling – V-fold cross-validation – regression – model selection – oracle inequality – adaptivity – heteroscedastic data |
|
|
|
|
|
| hal-00125455, version 2 | |
| http://hal.archives-ouvertes.fr/hal-00125455 | |
| oai:hal.archives-ouvertes.fr:hal-00125455 | |
| Contributor: Sylvain Arlot | |
| Submitted on: Monday 22 January 2007, 11:41:45 | |
| Last modified on: Thursday 7 February 2008, 21:52:51 | |