Unsupervised Feature Selection with Ensemble Learning

Haytham Elghazel, Alex Aussem
1 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: In this paper, we show that the way internal estimates are used to measure variable importance in Random Forests is also applicable to feature selection in unsupervised learning. We propose a new method, called Random Cluster Ensemble (RCE for short), that estimates the out-of-bag feature importance from an ensemble of partitions. Each partition is constructed using a different bootstrap sample and a random subset of the features. We provide empirical results on nineteen benchmark data sets indicating that RCE, boosted with a recursive feature elimination (RFE) scheme, can lead to significant improvements in clustering accuracy over several state-of-the-art supervised and unsupervised algorithms, with a very limited subset of features. The method shows promise for dealing with very large domains. All results, datasets and algorithms are available online.
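To illustrate the general idea described in the abstract (bootstrap resampling, random feature subsets, and out-of-bag permutation importance estimated from an ensemble of partitions), here is a minimal sketch in Python. It uses k-means as the base clusterer and an out-of-bag assignment-disagreement score as the importance signal; the function name `rce_oob_importance`, the choice of k-means, and all parameter values are illustrative assumptions, not the authors' exact RCE procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def rce_oob_importance(X, n_estimators=50, n_clusters=3, feat_frac=0.5, seed=0):
    """Sketch of an ensemble-based, out-of-bag feature importance for clustering.

    Each ensemble member is a k-means partition fit on a bootstrap sample
    restricted to a random feature subset. A feature's importance is the
    average fraction of out-of-bag points whose cluster assignment changes
    when that feature is permuted (permutation importance). Illustrative
    approximation only, not the exact RCE procedure from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    importance = np.zeros(d)
    counts = np.zeros(d)

    for _ in range(n_estimators):
        boot = rng.integers(0, n, size=n)                 # in-bag (bootstrap) indices
        oob = np.setdiff1d(np.arange(n), boot)            # out-of-bag indices
        if oob.size == 0:
            continue
        feats = rng.choice(d, size=max(1, int(feat_frac * d)), replace=False)

        km = KMeans(n_clusters=n_clusters, n_init=5,
                    random_state=int(rng.integers(1_000_000)))
        km.fit(X[np.ix_(boot, feats)])
        base = km.predict(X[np.ix_(oob, feats)])          # reference OOB assignments

        for j_idx, j in enumerate(feats):
            X_perm = X[np.ix_(oob, feats)].copy()
            X_perm[:, j_idx] = rng.permutation(X_perm[:, j_idx])  # break feature j
            perm = km.predict(X_perm)
            importance[j] += np.mean(base != perm)        # disagreement caused by permutation
            counts[j] += 1

    return importance / np.maximum(counts, 1)
```

In the paper, an importance estimate of this kind is combined with recursive feature elimination: the least important features are discarded and the ensemble is rebuilt on the surviving features, which the sketch above does not implement.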
Document type: Journal article
Machine Learning, 2015, 98 (1-2), pp. 157-180. ⟨10.1007/s10994-013-5337-8⟩

https://hal.archives-ouvertes.fr/hal-01339161
Contributor: Équipe Gestionnaire Des Publications Si Liris
Submitted on: Wednesday, 29 June 2016 - 15:47:18
Last modified on: Thursday, 19 April 2018 - 14:38:06

Identifiers
HAL Id: hal-01339161
DOI: 10.1007/s10994-013-5337-8

Citation

Haytham Elghazel, Alex Aussem. Unsupervised Feature Selection with Ensemble Learning. Machine Learning, 2015, 98 (1-2), pp. 157-180. ⟨10.1007/s10994-013-5337-8⟩. ⟨hal-01339161⟩
