Conference papers

Utilization of Machine-Learning Methodologies in Order to Understand Complex Evolutionary and Functional Links among Bacterial Genomes

Olivier Poirion 1 Bénédicte Lafay 1, *
* Corresponding author
Abstract : We are searching for evolutionary trends among genome maintenance-related genes present on the replicon sets (i.e., chromosomes and plasmids) of bacterial genomes. Traditional bioinformatic and phylogenetic methods are poorly suited to large-scale, high-dimensional studies. We therefore developed a semi-supervised analytical pipeline relying on data-mining methodologies. Generic unsupervised (SOM, K-means, Bayesian networks) and supervised (SVM, decision trees, boosting) classification methods were combined with specific bioinformatic algorithms based on sequence homology search (BLAST). This approach allowed important evolutionary processes to be characterized among genome-integrated plasmids and chromosomes. Here we report on the inherent difficulties (input data bias, high-dimensional analysis, noise) and on the applied methodology, and we conclude on the significance of data-mining methodology for knowledge discovery.
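To make the idea of combining unsupervised grouping with a small amount of labelled data more concrete, here is a minimal sketch of a "cluster-then-label" scheme. It is not the authors' pipeline: the feature vectors, the labels, the starting centroids and the helper names (kmeans, label_clusters) are all hypothetical, and plain K-means stands in for the broader set of methods named in the abstract.

```python
from collections import Counter
from math import dist


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means; returns (assignments, centroids)."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignments, centroids


def label_clusters(assignments, known_labels):
    """Give each cluster the majority label among its labelled members."""
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(assignments[index], Counter())[label] += 1
    return {k: c.most_common(1)[0][0] for k, c in votes.items()}


# Hypothetical 2-D feature vectors, one per replicon (made-up values).
replicons = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15),
             (0.1, 0.9), (0.2, 0.8), (0.15, 0.95)]
# Assumption: only two replicons carry a trusted annotation.
known = {0: "chromosome", 3: "plasmid"}

assignments, _ = kmeans(replicons, centroids=[replicons[0], replicons[3]])
cluster_label = label_clusters(assignments, known)
predicted = [cluster_label.get(a, "unknown") for a in assignments]
print(predicted)
```

In this toy setting the two annotated replicons are enough to name both clusters, so every unlabelled replicon inherits a label from its cluster. Real data would call for the noise and bias handling that the abstract lists among its difficulties.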

https://hal.archives-ouvertes.fr/hal-00999074
Contributor : Bénédicte Lafay
Submitted on : Tuesday, June 3, 2014 - 11:12:19 AM
Last modification on : Monday, September 13, 2021 - 2:44:03 PM
Long-term archiving on : Wednesday, September 3, 2014 - 11:07:24 AM

File

IFCS_2013-Poirion_Lafay.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00999074, version 1

Citation

Olivier Poirion, Bénédicte Lafay. Utilization of Machine-Learning Methodologies in Order to Understand Complex Evolutionary and Functional Links among Bacterial Genomes. 2013 conference of the International Federation of Classification Societies (IFCS) : 'United through Ordination and Classification', Jul 2013, Tilburg, Netherlands. p. 203. ⟨hal-00999074⟩
