Computational analysis of cis-regulatory elements in Drosophila and mammalian genomes

Abstract : Cellular differentiation and tissue specification depend in part on the establishment of specific transcriptional programs of gene expression. These programs result from the interpretation of genomic regulatory information by sequence-specific transcription factors (TFs). Decoding this information in sequenced genomes is a key issue. In the first part, we study the interaction between TFs and the DNA sequences they bind to, called Transcription Factor Binding Sites (TFBSs). Using a Potts model inspired by spin glass physics, together with high-throughput binding data for a variety of Drosophila and mammalian TFs, we show that TFBSs exhibit correlations among nucleotides and that accounting for the contribution of these correlations to the binding energy greatly improves the prediction of genomic TFBSs. We then present Imogene, an extension to mammalian genomes of a Bayesian, phylogeny-based algorithm that was previously applied to Drosophila regulation and is designed to computationally identify the Cis-Regulatory Modules (CRMs) controlling gene expression in a set of co-regulated genes. Starting with a small number of CRMs in a reference species as a training set, but with no a priori knowledge of the factors acting in trans, the algorithm uses the over-representation and conservation of TFBSs among related species to predict putative regulatory elements along with the genomic CRMs underlying co-regulation. We present several applications of this algorithm in both Drosophila and vertebrates. We also present an extension of the algorithm to pattern recognition, showing that CRMs with different expression patterns can be distinguished solely on the basis of their DNA motif content. Finally, we present applications of these modeling tools to real biological cases: trichome differentiation in Drosophila and skeletal muscle differentiation in the mouse. In both cases, predictions were experimentally validated in joint work with biology teams, and they point towards great flexibility in cis-regulatory processes.
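
To illustrate the first part of the abstract, the sketch below contrasts an independent-site binding energy with a Potts-style energy that adds pairwise nucleotide couplings. It is a minimal illustration only, not the thesis's implementation: the function names, the sign convention (lower energy means stronger binding), and the toy parameters `h` and `J` are all assumptions made for this example.

```python
from itertools import combinations

def independent_energy(site, h):
    """Energy with independent positions: minus the sum of per-position terms h[i][base]."""
    return -sum(h[i][base] for i, base in enumerate(site))

def potts_energy(site, h, J):
    """Independent energy minus pairwise couplings J[(i, j)][(a, b)] for positions i < j."""
    pair_sum = sum(
        J.get((i, j), {}).get((site[i], site[j]), 0.0)
        for i, j in combinations(range(len(site)), 2)
    )
    return independent_energy(site, h) - pair_sum

# Toy (hypothetical) parameters for a 3-nucleotide site.
h = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.5, "G": 0.5, "T": 0.0},
    {"A": 0.0, "C": 0.5, "G": 0.5, "T": 0.0},
]
# Positions 1 and 2 are coupled: the pairs C-G and G-C are favoured.
J = {(1, 2): {("C", "G"): 1.0, ("G", "C"): 1.0}}

for candidate in ("ACG", "ACC"):
    print(candidate, independent_energy(candidate, h), potts_energy(candidate, h, J))
```

With these toy values the independent model cannot separate the two candidate sites, while the coupled model assigns a lower energy to the site whose last two nucleotides are correlated.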

https://tel.archives-ouvertes.fr/tel-00865159
Contributor : Marc Santolini
Submitted on : Tuesday, September 24, 2013 - 9:20:39 AM
Last modification on : Sunday, March 31, 2019 - 1:22:35 AM

Identifiers

  • HAL Id : tel-00865159, version 1

Citation

Marc Santolini. Analyse computationnelle des éléments cis-régulateurs dans les génomes des drosophiles et des mammifères. Analyse de données, Statistiques et Probabilités [physics.data-an]. Université Paris-Diderot - Paris VII, 2013. Français. ⟨tel-00865159⟩
