Book sections

Forests of latent tree models to decipher genotype-phenotype associations

Abstract : Genome-wide association studies have revolutionized the search for genetic influences on common genetic diseases such as diabetes, obesity, asthma, cardiovascular diseases and some cancers. In particular, together with concerns about population aging, increasing health care costs require further investigation to design scalable and efficient tools. The high dimensionality and complexity of genetic data hinder the detection of genetic associations. To decrease the risks of missing the causal factor and of discovering spurious associations, machine learning offers an attractive alternative framework to classical statistical approaches. A novel class of probabilistic graphical models (PGMs), the forest of latent tree models (FLTM), has recently been proposed to reach a trade-off between faithful modeling of data dependences and tractability. In this chapter, we assess the potential of this model to detect genotype-phenotype associations. The FLTM-based contribution is first put into the perspective of PGM-based works meant to model the dependences in genetic data; the contribution is then considered from the technical viewpoint of LTM learning, with the vital objective of scalability in mind. We then present the systematic and comprehensive evaluation conducted to assess the ability of the FLTM model to detect genetic associations through latent variables. Realistic simulations were performed under various controlled conditions. In this context, we present a procedure tailored to correct for multiple testing. We also show and discuss results obtained on real data. Besides guaranteeing data dimension reduction through latent variables, the FLTM model is shown empirically to capture indirect genetic associations with the disease: strong associations are evidenced between the disease and the ancestor nodes of the causal genetic marker node in the forest; in contrast, very weak associations are obtained for other latent variables.
Finally, we discuss the prospects of the model for association detection at genome scale.
https://hal.archives-ouvertes.fr/hal-00915532
Contributor : Christine Sinoquet
Submitted on : Sunday, December 8, 2013 - 11:22:46 PM
Last modification on : Thursday, January 17, 2019 - 10:40:04 AM

Citation

Christine Sinoquet, Raphaël Mourad, Philippe Leray. Forests of latent tree models to decipher genotype-phenotype associations. J. Gariel, J. Schier, S. Van Huffel, E. Conchon, C. Correia, A. Fred and H. Gamboa. Biomedical Engineering Systems and Technologies, Communications in Computer and Information Science 357, Springer Berlin Heidelberg, pp.113-134, 2013, 978-3-642-38255-0. ⟨10.1007/978-3-642-38256-7_8⟩. ⟨hal-00915532⟩
