Genome evolution aware gene trees - HAL open archive
Preprint, Working Paper Year: 2015

Genome evolution aware gene trees

Abstract

A gene family tree is traditionally inferred from a multiple alignment of homologous sequences according to a model of sequence evolution. Trees for several gene families are thus constructed independently of each other. They often contain unresolved or poorly resolved parts. Information for their full resolution may lie in the poorly exploited dependencies between gene families, each of which brings information for the resolution of the others. We propose to use several kinds of such dependencies in the construction of gene trees: information from a species tree through a model of gene content evolution by duplication, speciation and loss; information from extant synteny through ortholog predictions; and information from ancestral synteny through a model of gene neighborhood evolution. We develop several "correction" techniques, yielding a software package called "RefineTree". We report some tests on simulated data and an application to the full set of gene families from the Ensembl database. We perform a genome-wide analysis of duplication and loss patterns over the history of 65 eukaryote species, including ancestral genes and gene orders of all ancestors along this phylogeny. We show that, according to several measures including running time, likelihood, stability of genome content and linearity of ancestral chromosomes, trees corrected by RefineTree are arguably more plausible than the ones stored by Ensembl. We finally discuss the quality criteria in the light of the definition of a gene as a sequence or as a locus, and extract some cases where a "true" gene tree should depend on this definition. The RefineTree web interface is available at:
Gene phylogenies on a whole-genome scale are potentially useful for reconstructing ancestral genomes, including genes and gene orders, along with patterns of genome evolution. But the current standard gene trees stored in databases are not yet of sufficient quality for this application. We propose a method to correct them, test it extensively, and apply it to the reconstruction of 64 eukaryote ancestral genomes (genes and gene orders when the evolutionary signal allows it).
Main file
genetrees_hal.pdf (1.35 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01162963, version 1 (11-06-2015)
hal-01162963, version 2 (21-08-2015)
hal-01162963, version 3 (09-04-2016)
hal-01162963, version 4 (17-08-2016)

Identifiers

  • HAL Id: hal-01162963, version 1

Cite

Emmanuel Noutahi, Magali Semeria, Manuel Lafond, Jonathan Seguin, Bastien Boussau, et al. Genome evolution aware gene trees. 2015. ⟨hal-01162963v1⟩
536 views
566 downloads
