Quantitative bounds for concentration-of-measure inequalities and empirical regression: the independent case

Abstract: This paper studies the deviation of the (random) average $L^{2}$-error associated with the least-squares regressor over a family of functions ${\cal F}_{n}$ (with controlled complexity), obtained from $n$ independent but not necessarily identically distributed samples of explanatory and response variables, from the minimal (deterministic) average $L^{2}$-error associated with this family of functions, together with some of the consequences for the problem of consistency. In the i.i.d. case, this specializes to classical questions on least-squares regression, but in more general cases the setting permits a precise investigation of nonasymptotic errors for least-squares regression schemes in nonstationary settings, which we motivate by providing background and examples. More precisely, we first prove two nonasymptotic deviation inequalities that generalize and refine corresponding known results in the i.i.d. case. We then explore some consequences for nonasymptotic bounds of the error in both the weak and the strong senses. Finally, we exploit these estimates to shed new light on questions of consistency for least-squares regression schemes in the distribution-free, nonparametric setting. As an application to the classical theory, we provide in particular a result that generalizes the link between the problem of consistency and the Glivenko-Cantelli property. Applied to regression in the i.i.d. setting over nondecreasing families $({\cal F}_{n})_{n}$ of functions, this result makes it possible to construct a scheme that is strongly consistent in $L^{2}$ under the sole (necessary) assumption that $\cup_{n}{\cal F}_{n}$ contains functions arbitrarily close in $L^{2}$ to the corresponding regressor.
Document type:
Preprint, working paper
2019

https://hal.archives-ouvertes.fr/hal-01832195
Contributor: Emmanuel Gobet
Submitted on: Saturday, January 26, 2019 - 15:28:30
Last modified on: Friday, February 1, 2019 - 01:14:12

File

FinalVersionRevision_JoC_for_p...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01832195, version 2

Citation

David Barrera, Emmanuel Gobet. Quantitative bounds for concentration-of-measure inequalities and empirical regression: the independent case. 2019. 〈hal-01832195v2〉
