Grouped variable importance with random forests and application to multiple functional data analysis - HAL open archive
Preprint, Working Paper. Year: 2015

Grouped variable importance with random forests and application to multiple functional data analysis

Abstract

The selection of grouped variables using the random forest algorithm is considered. First, a new importance measure adapted to groups of variables is proposed, and theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed. Using a wavelet basis, it is proposed to group all of the wavelet coefficients of a given functional variable and to apply a wrapper selection algorithm to these groups. Various other groupings, which take advantage of the frequency and time localization of the wavelet basis, are also proposed. An extensive simulation study is performed to illustrate the use of the grouped importance measure in this context. The method is applied to a real-life problem from aviation safety.
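
The idea of scoring a whole group of variables at once can be illustrated with a minimal sketch. It assumes that the grouped measure follows the usual permutation scheme of random forest importance: all columns of a group are shuffled with one shared row permutation, and the resulting increase in prediction error is recorded. The abstract does not spell out this construction, so the function name `grouped_permutation_importance`, the toy data, and the fixed `predict` function (a hypothetical stand-in for a fitted forest, evaluated on plain held-out rows rather than out-of-bag samples) are all illustrative assumptions, not the authors' implementation.

```python
import random

def mse(predict, X, y):
    """Mean squared error of `predict` on rows X with targets y."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def grouped_permutation_importance(predict, X, y, groups, n_repeats=10, seed=0):
    """Average increase in MSE when all columns of a group are permuted jointly.

    groups: dict mapping a group name to a list of column indices.
    """
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importance = {}
    for name, cols in groups.items():
        increases = []
        for _ in range(n_repeats):
            # One shared row permutation for the whole group, so the
            # dependence between the group's columns is preserved.
            perm = list(range(len(X)))
            rng.shuffle(perm)
            X_perm = [list(row) for row in X]
            for i, j in enumerate(perm):
                for c in cols:
                    X_perm[i][c] = X[j][c]
            increases.append(mse(predict, X_perm, y) - baseline)
        importance[name] = sum(increases) / n_repeats
    return importance

# Toy data (assumed): only the columns in group "A" influence the response.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2.0 * row[0] + row[1] + rng.gauss(0, 0.1) for row in X]

def predict(row):
    # Hypothetical stand-in for a fitted model; a forest's predict would go here.
    return 2.0 * row[0] + row[1]

groups = {"A": [0, 1], "B": [2, 3]}
importance = grouped_permutation_importance(predict, X, y, groups)
```

In this toy setting, group "B" receives an importance of exactly zero because the stand-in predictor ignores its columns, while group "A" receives a clearly positive score. For functional data, each entry of `groups` could hold the column indices of the wavelet coefficients of one functional variable, or of one scale or time window.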
Main file
Grouped_Variable_Importance_revised.pdf (544.2 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01084301, version 1 (18-11-2014)
hal-01084301, version 2 (09-04-2015)

Identifiers

  • HAL Id: hal-01084301, version 2

Cite

Baptiste Gregorutti, Bertrand Michel, Philippe Saint-Pierre. Grouped variable importance with random forests and application to multiple functional data analysis. 2015. ⟨hal-01084301v2⟩
370 views
1458 downloads
