Multiclass classification and gene selection with a stochastic algorithm - HAL open archive
Preprint, Working Paper Year: 2008

Multiclass classification and gene selection with a stochastic algorithm

Abstract

Microarray technology allows the expression of thousands of genes to be monitored under various biological conditions, but most of these genes are irrelevant for classifying those conditions. Feature selection is therefore needed to reduce the dimension of the variable space. Starting from the application of the stochastic meta-algorithm "Optimal Feature Weighting" (OFW) to feature selection in various classification problems, the focus here is on the multiclass problem, which wrapper methods rarely handle. From a computational point of view, one of the main difficulties is the class imbalance that is common in microarray data. From a theoretical point of view, very few methods have been developed to minimize a classification criterion in the multiclass setting, compared to the 2-class situation (e.g. SVM, ℓ0 SVM, RFE...). The OFW approach is developed to handle multiclass problems using CART and one-vs-one SVM as classifiers. The results are then compared with those obtained with other multiclass selection algorithms (Random Forests and the F-test filter method) on five public microarray data sets of varying complexity. The statistical relevance of the results is assessed by measuring and comparing the performance of these different approaches. The aim of this study is to evaluate heuristically which method is best at selecting genes that classify the minority classes. An application and its biological interpretation are then given for a pig folliculogenesis study.
Main file
ReSubmit.pdf (279.24 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00323848, version 1 (23-09-2008)

Identifiers

  • HAL Id: hal-00323848, version 1

Cite

Kim-Anh Lê Cao, Agnès Bonnet, Sébastien Gadat. Multiclass classification and gene selection with a stochastic algorithm. 2008. ⟨hal-00323848⟩