Why V=5 is enough in V-fold cross-validation
Archive ouverte HAL. Preprint / working paper, 2014.

Abstract

This paper studies V-fold cross-validation for model selection in least-squares density estimation. The goal is to provide theoretical grounds for choosing V so as to minimize the least-squares loss of the selected estimator. We first prove a nonasymptotic oracle inequality for V-fold cross-validation and its bias-corrected version (V-fold penalization). In particular, this result implies that V-fold penalization is asymptotically optimal. We then compute the variance of V-fold cross-validation and related criteria, as well as the variance of key quantities for model selection performance. We show that these variances depend on V like 1 + 4/(V-1), at least in some particular cases, suggesting that performance improves considerably from V=2 to V=5 or 10 and is then almost constant. Overall, this explains the common advice to take V=5, at least in our setting and when computational power is limited, as confirmed by simulation experiments.
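The diminishing returns described in the abstract can be seen by tabulating the 1 + 4/(V-1) variance factor directly. A minimal sketch, assuming the factor can be treated as exact (the paper derives it only in some particular cases, so this is an illustration, not the authors' computation):

```python
def variance_factor(V: int) -> float:
    """Relative variance factor 1 + 4/(V-1) of V-fold criteria,
    as stated in the abstract (valid in particular cases)."""
    return 1.0 + 4.0 / (V - 1)

# The factor drops sharply from V=2 to V=5, then flattens out,
# which is the argument for taking V=5 under limited compute.
for V in (2, 5, 10, 20, 100):
    print(f"V={V:3d}  factor={variance_factor(V):.3f}")
```

For example, the factor is 5.0 at V=2, 2.0 at V=5, and about 1.44 at V=10: most of the variance reduction is already obtained at V=5.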
Main file: penvfreech6.pdf (1.18 MB). Origin: files produced by the author(s).

Dates and versions

hal-00743931 , version 1 (22-10-2012)
hal-00743931 , version 2 (17-07-2014)
hal-00743931 , version 3 (09-10-2015)

Cite

Sylvain Arlot, Matthieu Lerasle. Why V=5 is enough in V-fold cross-validation. 2014. ⟨hal-00743931v2⟩
803 views
495 downloads
