Preprint, Working Paper. Year: 2008

Density estimation via cross-validation: Model selection point of view

Alain Celisse

Abstract

The problem of density estimation by cross-validation is addressed in the model selection framework. Although extensively used in practice, cross-validation remains poorly understood, especially in the non-asymptotic setting. Our analysis focuses mainly on a cross-validation based algorithm named leave-p-out. A better understanding of how leave-p-out depends on the cardinality p of the test set yields more insight into other cross-validation based algorithms. From a general point of view, cross-validation is devoted to estimating the risk of an estimator. Because of its prohibitive computational complexity, leave-p-out is usually considered intractable. However, we turn it into a fully effective procedure thanks to closed-form formulas for the risk estimator of a wide range of widespread estimators. Embedding leave-p-out in the model selection setting enables a new interpretation of this algorithm as a penalized criterion with a random penalty. Furthermore, the amount of overpenalization it provides turns out to increase with p. A theoretical assessment of the leave-p-out performance is provided through two oracle inequalities, applying respectively to bounded densities and to square-integrable ones. With different sieves such as piecewise constant functions or trigonometric and dyadic polynomials, the leave-p-out based strategy exhibits some adaptivity properties in the minimax sense with respect to Hölder as well as Besov spaces.
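To illustrate why a closed-form formula makes leave-p-out tractable, here is a minimal sketch for one simple case that is not spelled out in the abstract: a regular histogram on [0, 1) under the quadratic loss, where the quantity estimated is the risk up to the constant squared norm of the true density. Both assumptions, the function names, and the closed-form expression used here (obtained from the hypergeometric moments of the training bin counts) are illustrative and should not be read as the paper's exact statement. The brute-force version enumerates every split and serves only as a check on small samples.

```python
from itertools import combinations

import numpy as np


def bin_indices(x, n_bins):
    # Map each point of [0, 1) to its bin in a regular partition (assumed setting).
    return np.minimum((np.asarray(x) * n_bins).astype(int), n_bins - 1)


def lpo_risk_closed_form(x, n_bins, p):
    # Closed-form leave-p-out estimate for a regular histogram, cost O(n).
    n = len(x)
    h = 1.0 / n_bins  # common bin width
    counts = np.bincount(bin_indices(x, n_bins), minlength=n_bins)
    s = np.sum((counts / n) ** 2)  # sum of squared empirical bin frequencies
    return ((2 * n - p) - n * (n - p + 1) * s) / ((n - 1) * (n - p) * h)


def lpo_risk_brute_force(x, n_bins, p):
    # Average over all C(n, p) splits of:
    #   ||histogram fitted on the training set||^2
    #   - 2 * mean of that histogram evaluated at the p test points.
    n = len(x)
    h = 1.0 / n_bins
    idx = bin_indices(x, n_bins)
    total, n_splits = 0.0, 0
    for test in combinations(range(n), p):
        test = np.array(test)
        train_mask = np.ones(n, dtype=bool)
        train_mask[test] = False
        train_counts = np.bincount(idx[train_mask], minlength=n_bins)
        heights = train_counts / ((n - p) * h)  # histogram heights on the training set
        total += np.sum(heights ** 2) * h - 2.0 * np.mean(heights[idx[test]])
        n_splits += 1
    return total / n_splits
```

In this sketch, choosing a histogram would amount to evaluating `lpo_risk_closed_form` over a grid of `n_bins` values and keeping the minimizer; the brute-force routine becomes impractical very quickly because the number of splits grows combinatorially with n and p.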
Origin: Files produced by the author(s)

Dates and versions

hal-00337058, version 1 (05-11-2008)
hal-00337058, version 2 (14-04-2009)
hal-00337058, version 3 (30-03-2012)

Cite

Alain Celisse. Density estimation via cross-validation: Model selection point of view. 2008. ⟨hal-00337058v1⟩