A Hybrid Data Mining Approach for the Identification of Biomarkers in Metabolomic Data

Dhouha Grissa 1, 2 Blandine Comte 1 Estelle Pujos-Guillot 1 Amedeo Napoli 2
2 ORPAILLEUR - Knowledge representation, reasoning
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : In this paper, we introduce an approach for analyzing complex biological data obtained from metabolomic analytical platforms. Such platforms generate massive and complex data that require appropriate methods for discovering meaningful biological information. The datasets to analyze consist of a limited set of individuals and a large set of attributes (variables). In this study, we mine metabolomic data to identify predictive biomarkers of metabolic diseases, such as type 2 diabetes. Our experiments show that a combination of numerical methods, e.g. Support Vector Machines (SVM), Random Forests (RF), and ANOVA, with a symbolic method such as Formal Concept Analysis (FCA), can be successfully used for discovering the best combination of predictive features. Our results suggest that RF and ANOVA are the best-suited methods for feature selection and discovery. We then use FCA to visualize the markers in a suggestive and interpretable concept lattice. The output of our experiments is a short list of the 10 best potential predictive biomarkers.
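The FCA step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy formal context in which the objects are candidate features (the hypothetical names m1 to m4) and the attributes are the selection methods that retained them; neither the data nor the brute-force enumeration is taken from the paper, and a dedicated FCA tool would normally be used on real data.

```python
from itertools import combinations

# Hypothetical toy context (not from the paper): each candidate feature is
# mapped to the set of selection methods that ranked it among their top picks.
CONTEXT = {
    "m1": {"RF", "ANOVA", "SVM"},
    "m2": {"RF", "ANOVA"},
    "m3": {"SVM"},
    "m4": {"RF"},
}


def common_attributes(objects, context, all_attributes):
    """Return the attributes shared by every object in `objects`."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attributes, context):
    """Return the objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every attribute subset.

    Brute force: exponential in the number of attributes, so only suitable
    for very small contexts such as this illustration.
    """
    all_attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(all_attributes) + 1):
        for subset in combinations(sorted(all_attributes), size):
            extent = objects_having(set(subset), context)
            intent = common_attributes(extent, context, all_attributes)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(CONTEXT)

# Print concepts from the largest extent (top of the lattice) downwards.
for extent, intent in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair is one node of the concept lattice: a maximal group of features together with the full set of methods that agree on all of them. Features appearing in concepts with large intents are those selected consistently across methods, which is the kind of reading a concept lattice is meant to support.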
Document type :
Conference papers

Contributor : Dhouha Grissa
Submitted on : Wednesday, December 21, 2016 - 2:03:22 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Long-term archiving on : Monday, March 20, 2017 - 4:26:50 PM


Files produced by the author(s)


  • HAL Id : hal-01421015, version 1


Dhouha Grissa, Blandine Comte, Estelle Pujos-Guillot, Amedeo Napoli. A Hybrid Data Mining Approach for the Identification of Biomarkers in Metabolomic Data. Concept Lattices and Their Applications, Jul 2016, Moscow, Russia. ⟨hal-01421015⟩


