Advances in Feature Selection with Mutual Information

Abstract: Selecting the features that are relevant to a prediction or classification problem is an important task in many domains involving high-dimensional data. Feature selection helps fight the curse of dimensionality, improve the performance of prediction or classification methods, and interpret the application. In a nonlinear context, mutual information is widely used as a relevance criterion for features and sets of features. Nevertheless, it suffers from at least three major limitations: mutual information estimators depend on smoothing parameters, there is no theoretically justified stopping criterion in the greedy feature selection procedure, and the estimation itself suffers from the curse of dimensionality. This chapter shows how to deal with these problems. The first two are addressed by using resampling techniques that provide a statistical basis for selecting the estimator parameters and for stopping the search procedure. The third is addressed by modifying the mutual information criterion into a measure of how complementary (and not only informative) features are for the problem at hand.
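As an illustration of the resampling idea described in the abstract, the following is a minimal sketch of a greedy forward search that stops when a permutation test no longer finds the best candidate significant. It is not the chapter's implementation: the plug-in estimator for discrete data, the function names (`mutual_information`, `joint_with`, `permutation_p_value`, `forward_select`, `toy_data`), the significance level, and the synthetic data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def joint_with(columns, selected, extra):
    """Pair each row of the already selected columns with the value of `extra`."""
    rows = [tuple(columns[j][i] for j in selected) for i in range(len(extra))]
    return list(zip(rows, extra))


def permutation_p_value(columns, selected, candidate, target, n_perm, rng):
    """Permutation p-value of the joint MI, shuffling only the candidate column."""
    observed = mutual_information(
        joint_with(columns, selected, columns[candidate]), target
    )
    shuffled = list(columns[candidate])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between candidate and target
        null_mi = mutual_information(joint_with(columns, selected, shuffled), target)
        if null_mi >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


def forward_select(columns, target, alpha=0.05, n_perm=200, seed=0):
    """Greedy forward search; stops when the best candidate is not significant."""
    rng = random.Random(seed)
    selected, remaining = [], list(range(len(columns)))
    while remaining:
        best = max(
            remaining,
            key=lambda j: mutual_information(
                joint_with(columns, selected, columns[j]), target
            ),
        )
        p = permutation_p_value(columns, selected, best, target, n_perm, rng)
        if p >= alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_data(n=400, seed=1):
    """Hypothetical data: columns 0 and 1 determine the target, column 2 is noise."""
    rng = random.Random(seed)
    f0 = [i % 2 for i in range(n)]
    f1 = [(i // 2) % 2 for i in range(n)]
    f2 = [rng.randint(0, 1) for _ in range(n)]
    target = [a + 2 * b for a, b in zip(f0, f1)]
    return [f0, f1, f2], target


columns, target = toy_data()
print(forward_select(columns, target))
```

In this toy setting the two informative columns pass the test and the noise column does not, so the search halts without a hand-tuned threshold. Because the null distribution is built with the same estimator and the same already selected columns, any bias of the estimator affects the observed and the permuted values alike.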
Document type:
Book chapter
Villmann, Th.; Biehl, M.; Hammer, B.; Verleysen, M. Similarity-Based Clustering, Springer Berlin / Heidelberg, pp.52-69, 2009, Lecture Notes in Computer Science, <10.1007/978-3-642-01805-3_4>


https://hal.archives-ouvertes.fr/hal-00413154
Contributor: Fabrice Rossi
Submitted on: Thursday, September 3, 2009 - 12:37:03
Last modified on: Thursday, February 9, 2017 - 15:20:05
Document(s) archived on: Tuesday, June 15, 2010 - 23:07:50

Files

paperMV.pdf
Files produced by the author(s)


Citation

Michel Verleysen, Fabrice Rossi, Damien François. Advances in Feature Selection with Mutual Information. Villmann, Th.; Biehl, M.; Hammer, B.; Verleysen, M. Similarity-Based Clustering, Springer Berlin / Heidelberg, pp.52-69, 2009, Lecture Notes in Computer Science, <10.1007/978-3-642-01805-3_4>. <hal-00413154>
