Efficient semi-supervised feature selection by an ensemble approach

Mohammed Hindawi 1 Haytham Elghazel 1 Khalid Benabdeslem 1
1 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : Constrained Laplacian Score (CLS) is a recently proposed method for semi-supervised feature selection. It has been shown to outperform other state-of-the-art methods because it exploits both the unsupervised and the supervised parts of the data when selecting the most relevant features. However, the choice of the small amount of supervision information (represented by pairwise constraints) remains a critical issue. In fact, constraints have been shown to contain noise that may degrade learning performance. In this paper we try to counteract any negative effects of the constraint set by varying its sources. This is done by an ensemble technique that combines resampling of the data (bagging) with a random subspace strategy. The proposed approach generates a global ranking of features by aggregating multiple Constrained Laplacian Scores computed on different views of the available labeled and unlabeled data. We validate our approach by empirical experiments on high-dimensional datasets and compare it with other representative methods.
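The ensemble idea in the abstract (bagging plus random subspaces, then rank aggregation) can be illustrated with a minimal sketch. It is not the authors' implementation. As a loudly stated assumption, it substitutes the plain unsupervised Laplacian Score for CLS, so pairwise constraints are not used at all, and it aggregates by mean normalized rank, which is only one possible choice. The function names, parameters, and default values are all hypothetical.

```python
import numpy as np

def laplacian_score(X, k=5):
    """Plain (unsupervised) Laplacian Score per feature; lower is better.
    Stand-in for CLS: pairwise constraints are NOT used here (assumption)."""
    n = X.shape[0]
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    t = np.mean(d2) + 1e-12              # heat-kernel width (heuristic choice)
    np.fill_diagonal(d2, np.inf)         # exclude self from the neighbours
    idx = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbours of each row
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, idx.ravel()] = np.exp(-d2[rows, idx.ravel()] / t)
    W = np.maximum(W, W.T)               # symmetrize the kNN graph
    d = W.sum(axis=1)                    # node degrees
    L = np.diag(d) - W                   # graph Laplacian
    Xc = X - (d @ X) / d.sum()           # remove the degree-weighted mean
    num = np.sum(Xc * (L @ Xc), axis=0)
    den = np.sum(Xc * (d[:, None] * Xc), axis=0)
    ok = den > 1e-12
    # Constant features get the worst possible score
    return np.where(ok, num / np.where(ok, den, 1.0), np.inf)

def ensemble_ranking(X, n_rounds=20, subspace_frac=0.5, k=5, seed=0):
    """Aggregate per-view feature ranks into one global ranking (best first)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = max(1, int(subspace_frac * p))
    rank_sum = np.zeros(p)
    counts = np.zeros(p)
    for _ in range(n_rounds):
        rows = rng.integers(0, n, size=n)            # bootstrap sample (bagging)
        cols = rng.choice(p, size=m, replace=False)  # random feature subspace
        scores = laplacian_score(X[np.ix_(rows, cols)], k=k)
        ranks = scores.argsort().argsort()           # 0 = best within this view
        rank_sum[cols] += ranks / max(m - 1, 1)      # normalize ranks to [0, 1]
        counts[cols] += 1
    # Features never drawn into any subspace are placed last
    mean_rank = np.where(counts > 0, rank_sum / np.maximum(counts, 1), np.inf)
    return np.argsort(mean_rank)
```

Calling `ensemble_ranking(X)` on an `(n_samples, n_features)` array returns feature indices ordered from most to least relevant under this stand-in score. Replacing `laplacian_score` with a constraint-aware scorer, and resampling the constraints alongside the rows, would bring the sketch closer to the approach described in the abstract.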
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01339249
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Wednesday, June 29, 2016 - 3:50:16 PM
Last modification on : Thursday, November 21, 2019 - 2:21:42 AM

Identifiers

  • HAL Id : hal-01339249, version 1

Citation

Mohammed Hindawi, Haytham Elghazel, Khalid Benabdeslem. Efficient semi-supervised feature selection by an ensemble approach. International Workshop on Complex Machine Learning Problems with Ensemble Methods COPEM@ECML/PKDD'13, Sep 2013, Prague, Czech Republic. pp.41-55. ⟨hal-01339249⟩
