Preprint, Working Paper. Year: 2015

Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests

Abstract

In the context of ensemble learning, especially for random forest models, the out-of-bag (OOB) procedure uses the training set to produce an estimate of the generalization error. The OOB error serves the same purpose as the cross-validation error, but it has two specific properties. First, there exists an OOB classifier that produces the OOB evaluation. Second, the OOB classifier is embedded in the forest classifier. We show in this paper that these two intrinsic properties yield simple conditions under which the test error is bounded by the OOB error. The conditions require only the usual assumptions: i.i.d. data and the existence of first- and second-order moments. The main benefit is that the OOB error is explicitly known, so one needs only a training set, with no further assumption about the model behind the data. As a practical case, we use Random Uniform Forests (Ciss, 2015a), a variant of Random Forests (Breiman, 2001) that inherits all the properties of the latter, to show how OOB bounds apply. We also provide an R package, randomUniformForest, which allows one to experiment with all the arguments described in the paper.
Main file
OOBBoundsForTestError.pdf (439.15 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01110524, version 1 (28-01-2015)
hal-01110524, version 2 (16-02-2015)

Identifiers

  • HAL Id: hal-01110524, version 2

Cite

Saïp Ciss. Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests. 2015. ⟨hal-01110524v2⟩
254 views
2590 downloads
