Journal article, Journal of Multivariate Analysis, Year: 2010

On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification

Abstract

Let $X_1, \dots, X_n$ be identically distributed random vectors in $\mathbb{R}^d$, independently drawn according to some probability density. An observation $X_i$ is said to be a layered nearest neighbour (LNN) of a point $x$ if the hyperrectangle defined by $x$ and $X_i$ contains no other data points. We first establish consistency results on $L_n(x)$, the number of LNN of $x$. Then, given a sample $(X, Y), (X_1, Y_1), \dots, (X_n, Y_n)$ of independent identically distributed random vectors from $\mathbb{R}^d \times \mathbb{R}$, one may estimate the regression function $r(x) = \mathbb{E}[Y \mid X = x]$ by the LNN estimate $r_n(x)$, defined as an average over the $Y_i$'s corresponding to those $X_i$ which are LNN of $x$. Under mild conditions on $r$, we establish the consistency of $\mathbb{E}|r_n(x) - r(x)|^p$ towards 0 as $n \to \infty$, for almost all $x$ and all $p \geq 1$, and discuss the links between $r_n$ and the random forest estimates of Breiman (2001) [8]. We finally show the universal consistency of the bagged (bootstrap-aggregated) nearest neighbour method for regression and classification.
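The LNN definition in the abstract can be made concrete with a small brute-force sketch. This is only an illustration, not the authors' implementation. The function names (`in_rectangle`, `lnn_indices`, `lnn_estimate`) are hypothetical, the hyperrectangle is taken to be closed and axis-aligned with $x$ and $X_i$ as opposite corners, and the data points are assumed distinct, which holds with probability one when they are drawn from a density.

```python
from typing import List, Sequence

Point = Sequence[float]


def in_rectangle(z: Point, x: Point, xi: Point) -> bool:
    """True if z lies in the closed axis-aligned hyperrectangle
    whose opposite corners are x and xi (assumed convention)."""
    return all(min(a, b) <= c <= max(a, b) for a, b, c in zip(x, xi, z))


def lnn_indices(x: Point, data: Sequence[Point]) -> List[int]:
    """Indices i such that no other data point falls inside the
    hyperrectangle defined by x and data[i]. Brute force, O(n^2 d)."""
    return [
        i
        for i, xi in enumerate(data)
        if not any(
            j != i and in_rectangle(xj, x, xi) for j, xj in enumerate(data)
        )
    ]


def lnn_estimate(x: Point, data: Sequence[Point], responses: Sequence[float]) -> float:
    """Average of the responses whose covariates are LNN of x."""
    idx = lnn_indices(x, data)
    if not idx:
        # Can only happen for an empty sample or duplicated points.
        raise ValueError("no layered nearest neighbour found")
    return sum(responses[i] for i in idx) / len(idx)


# Toy data in the plane (hypothetical values, for illustration only).
data = [(1.0, 1.0), (2.0, 2.0), (-1.0, 0.5), (3.0, -1.0)]
responses = [1.0, 10.0, 3.0, 5.0]
query = (0.0, 0.0)

neighbours = lnn_indices(query, data)            # L_n(x) is len(neighbours)
estimate = lnn_estimate(query, data, responses)  # r_n(x)
```

In this toy sample the point (2, 2) is not an LNN of the origin because (1, 1) lies inside its hyperrectangle; the other three points are, so the estimate averages their three responses. Note that the construction involves no tuning parameter: the number of neighbours $L_n(x)$ is determined by the data alone.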

Dates and versions

hal-00559811, version 1 (26-01-2011)

Identifiers

Cite

Gérard Biau, L. Devroye. On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification. Journal of Multivariate Analysis, 2010, 101 (10), pp.2499-2518. ⟨10.1016/j.jmva.2010.06.019⟩. ⟨hal-00559811⟩