R. Adamczak, Moment inequalities for U-statistics, 2005.

M. Aerts, G. Claeskens, and J. D. Hart, Testing the Fit of a Parametric Function, pp.869-879, 1999.
DOI : 10.1093/biomet/77.3.642

H. Akaike, Information theory and an extension of the maximum likelihood principle, Second International Symposium on Information Theory (Tsahkadsor, 1971), pp.267-281, 1973.

D. M. Allen, The relationship between variable selection and data augmentation and a method for prediction, Technometrics, vol.16, pp.125-127, 1974.

E. Alpaydin, Combined 5 × 2 cv F Test for Comparing Supervised Classification Learning Algorithms, Neural Computation, vol.11, issue.8, pp.1885-1892, 1999.
DOI : 10.1162/089976698300017197

S. Arlot and P. Massart, Slope heuristics for heteroscedastic regression on a random design, 2008.

S. Arlot, Resampling and Model Selection, 2007.
URL : https://hal.archives-ouvertes.fr/tel-00198803

Y. Baraud, Model selection for regression on a fixed design, Probab. Theory Related Fields, pp.467-493, 2000.

S. Boucheron, O. Bousquet, G. Lugosi, and P. Massart, Moment inequalities for functions of independent random variables, The Annals of Probability, vol.33, issue.2, pp.514-560, 2005.
DOI : 10.1214/009117904000000856

URL : https://hal.archives-ouvertes.fr/hal-00101850

L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and regression trees, Wadsworth Statistics/Probability Series, 1984.

L. Birgé and P. Massart, Minimal penalties for Gaussian model selection, Probab. Theory Related Fields, 2006.

L. Breiman, Heuristics of instability and stabilization in model selection, The Annals of Statistics, vol.24, issue.6, pp.2350-2383, 1996.
DOI : 10.1214/aos/1032181158

P. Burman, A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods, Biometrika, vol.76, issue.3, pp.503-514, 1989.
DOI : 10.1093/biomet/76.3.503

URL : https://hal.archives-ouvertes.fr/hal-00819948

P. Burman, Estimation of optimal transformations using v-fold cross validation and repeated learning-testing methods, Sankhyā Ser. A, pp.314-345, 1990.

P. Burman, Estimation of equifrequency histograms, Statistics & Probability Letters, vol.56, issue.3, pp.227-238, 2002.
DOI : 10.1016/S0167-7152(01)00059-1

A. Celisse and S. Robin, Non-parametric density estimation by exact leave-p-out cross-validation, 2008.

P. Craven and G. Wahba, Smoothing noisy data with spline functions, Numerische Mathematik, vol.31, issue.4, pp.377-403, 1978.
DOI : 10.1007/BF01404567

T. G. Dietterich, Approximate statistical tests for comparing supervised classification learning algorithms, Neural Computation, vol.10, issue.7, pp.1895-1924, 1998.

D. L. Donoho and I. M. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, pp.1200-1224, 1995.

D. Dubhashi and D. Ranjan, Balls and bins: a study in negative dependence, 1998.

B. Efron, Bootstrap Methods: Another Look at the Jackknife, The Annals of Statistics, vol.7, issue.1, pp.1-26, 1979.
DOI : 10.1214/aos/1176344552

B. Efron, Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation, Journal of the American Statistical Association, vol.78, issue.382, pp.316-331, 1983.
DOI : 10.1080/01621459.1983.10477973

S. Efromovich and M. Pinsker, Sharp-optimal and adaptive estimation for heteroscedastic nonparametric regression, Statist. Sinica, vol.6, issue.4, pp.925-942, 1996.

M. Fromont, Model selection by bootstrap penalization for classification, Machine Learning, vol.17, issue.2, pp.165-207, 2007.
DOI : 10.1007/s10994-006-7679-y

URL : https://hal.archives-ouvertes.fr/hal-00457774

S. Geisser, The Predictive Sample Reuse Method with Applications, Journal of the American Statistical Association, vol.70, issue.350, pp.320-328, 1975.
DOI : 10.1080/01621459.1975.10479865

L. Györfi, M. Kohler, A. Krzyżak, and H. Walk, A distribution-free theory of nonparametric regression, Springer Series in Statistics, 2002.

E. Giné, R. Latała, and J. Zinn, Exponential and moment inequalities for U-statistics, High dimensional probability, pp.13-38, 1999.

L. Galtchouk and S. Pergamenshchikov, Efficient adaptive nonparametric estimation in heteroscedastic models.

T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning: Data mining, inference, and prediction, Springer Series in Statistics, 2001.

K. Joag-Dev and F. Proschan, Negative association of random variables, pp.286-295, 1983.

R. A. Lew, Bounds on negative moments, pp.728-731, 1976.

K.-C. Li, Asymptotic optimality for Cp, CL, cross-validation and generalized cross-validation: discrete index set, 1987.

C. L. Mallows, Some comments on Cp, Technometrics, vol.15, pp.661-675, 1973.

P. Massart, Concentration inequalities and model selection, Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, Lecture Notes in Mathematics, vol.1896, 2003.

D. M. Mason and M. A. Newton, A rank statistics approach to the consistency of a general bootstrap, Ann. Statist, vol.20, issue.3, pp.1611-1624, 1992.

A. M. Molinaro, R. Simon, and R. M. Pfeiffer, Prediction error estimation: a comparison of resampling methods, Bioinformatics, vol.21, issue.15, pp.3301-3307, 2005.
DOI : 10.1093/bioinformatics/bti499

J. Praestgaard and J. A. Wellner, Exchangeably Weighted Bootstraps of the General Empirical Process, The Annals of Probability, vol.21, issue.4, pp.2053-2086, 1993.
DOI : 10.1214/aop/1176989011

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

J. Shao, Linear Model Selection by Cross-validation, Journal of the American Statistical Association, vol.88, issue.422, pp.486-494, 1993.
DOI : 10.1080/03610927508827223

J. Shao, An asymptotic theory for linear model selection, Statist. Sinica, vol.7, issue.2, pp.221-264, 1997.

R. Shibata, An optimal selection of regression variables, Biometrika, vol.68, issue.1, pp.45-54, 1981.
DOI : 10.1093/biomet/68.1.45

M. Stone, Cross-validatory choice and assessment of statistical predictions (with discussion), pp.111-147, 1974.

C. J. Stone, An asymptotically optimal histogram selection rule, Proceedings of the Berkeley conference in honor of, pp.513-520, 1983.

M. J. van der Laan, S. Dudoit, and S. Keles, Asymptotic optimality of likelihood-based cross-validation, Stat. Appl. Genet. Mol. Biol., vol.3, 27 pp., 2004.

A. W. van der Vaart and J. A. Wellner, Weak convergence and empirical processes, 1996.

Y. Yang, Comparing learning methods for classification, Statist. Sinica, vol.16, issue.2, pp.635-657, 2006.

Y. Yang, Consistency of cross validation for comparing regression procedures, accepted by Annals of Statistics.

P. Zhang, Model selection via multifold cross validation, pp.299-313, 1993.

M. Žnidarič, Asymptotic expansions for inverse moments of binomial and Poisson distributions, 2005.