HAL: hal-00337058, version 3
 arXiv: 0811.0802
 Available versions: v1 (2008-11-05) v2 (2009-04-14) v3 (2012-04-02)
 Optimal cross-validation in density estimation
 (2008-10-10)
 The performance of cross-validation (CV) is analyzed in two contexts: (i) risk estimation and (ii) model selection in the density estimation framework. The main focus is on one CV algorithm called leave-$p$-out (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are derived for the Lpo estimator of the risk of projection estimators, which makes V-fold cross-validation completely useless for these estimators. From a theoretical point of view, these closed-form expressions make it possible to study the performance of Lpo in terms of risk estimation. For instance, the optimality of leave-one-out (Loo), that is Lpo with $p=1$, is proved among CV procedures. Two model selection frameworks are also considered: estimation and identification. Unlike in risk estimation, Loo is proved to be suboptimal as a model selection procedure. In the estimation framework with finite sample size $n$, optimality is achieved for $p$ large enough (with $p/n =o(1)$) to balance overfitting. A link is also identified between the optimal $p$ and the structure of the model collection. These theoretical results are strongly supported by simulation experiments. When performing identification, model consistency is also proved for Lpo with $p/n\to 1$ as $n\to +\infty$.
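 As an illustration of what a closed-form Lpo expression looks like, the sketch below treats the special case of regular histograms on $[0,1]$, which are one family of projection estimators. The formula is not quoted from the paper: it comes from a direct computation with hypergeometric moments, assuming the quadratic contrast $\|s\|^2 - \frac{2}{p}\sum_{i \in \mathrm{test}} s(X_i)$ averaged over all $\binom{n}{p}$ splits. With $n_k$ points in bin $k$ and common bin width $\omega$, that computation gives $\frac{2n-p}{(n-1)(n-p)} \sum_k \frac{n_k}{n\omega} - \frac{n(n-p+1)}{(n-1)(n-p)} \sum_k \frac{1}{\omega}\left(\frac{n_k}{n}\right)^2$. The function names and the sample data are hypothetical, and the brute-force version is included only as a check on the closed form.

```python
from itertools import combinations
from math import comb


def bin_index(x, n_bins):
    """Index of the equal-width bin of [0, 1] containing x (x = 1 goes to the last bin)."""
    return min(int(x * n_bins), n_bins - 1)


def bin_counts(points, n_bins):
    """Number of points falling in each of the n_bins equal-width bins."""
    counts = [0] * n_bins
    for x in points:
        counts[bin_index(x, n_bins)] += 1
    return counts


def lpo_closed_form(data, n_bins, p):
    """Leave-p-out risk estimate of a regular histogram, computed without resampling.

    Assumes 1 <= p <= n - 1 and data in [0, 1]; cost is linear in n.
    """
    n = len(data)
    width = 1.0 / n_bins
    counts = bin_counts(data, n_bins)
    s1 = sum(c / (n * width) for c in counts)       # sum_k n_k / (n * omega)
    s2 = sum((c / n) ** 2 / width for c in counts)  # sum_k (n_k / n)^2 / omega
    return ((2 * n - p) * s1 - n * (n - p + 1) * s2) / ((n - 1) * (n - p))


def lpo_brute_force(data, n_bins, p):
    """Same quantity by enumerating every test set of size p (small n only)."""
    n = len(data)
    width = 1.0 / n_bins
    total = 0.0
    for test_idx in combinations(range(n), p):
        held_out = set(test_idx)
        train = [data[i] for i in range(n) if i not in held_out]
        # Histogram density fitted on the n - p training points.
        density = [c / ((n - p) * width) for c in bin_counts(train, n_bins)]
        squared_norm = sum(d * d * width for d in density)
        test_term = sum(density[bin_index(data[i], n_bins)] for i in test_idx)
        total += squared_norm - 2.0 * test_term / p
    return total / comb(n, p)


def select_n_bins(data, candidates, p):
    """Model selection: the candidate bin number with the smallest Lpo risk estimate."""
    return min(candidates, key=lambda d: lpo_closed_form(data, d, p))
```

 The brute-force version visits $\binom{n}{p}$ splits, while the closed form needs only the bin counts. That gap is the computational point of having closed-form expressions, and it is what makes minimizing over a model collection, as in `select_n_bins`, cheap for any $p$.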
 1: Laboratoire de Mathématiques Paul Painlevé CNRS : UMR8524 – Université Lille I - Sciences et technologies
 Subject: Mathematics/Statistics; Statistics/Statistics Theory
 Keyword(s): Cross-validation – leave-p-out – resampling – risk estimation – model selection – density estimation – oracle inequality – projection estimators – concentration inequalities
Files attached to this document:
 PDF: cvhistoAOS_HAL.pdf (485.5 KB)
 PS: cvhistoAOS_HAL.ps (1.5 MB)
 URL: http://hal.archives-ouvertes.fr/hal-00337058
 OAI: oai:hal.archives-ouvertes.fr:hal-00337058
 From: Alain Celisse
 Submitted on: Friday, 30 March 2012 17:08:28
 Updated on: Monday, 2 April 2012 23:25:55