Conference papers

Modeling of genotype data with forests of latent trees to detect genetic causes of diseases

Abstract : Population aging and rising health care costs make it increasingly important to understand the causal basis of common genetic diseases. The high dimensionality and complexity of genetic data hamper the detection of genetic causal factors. Machine learning offers an appealing alternative to standard statistical approaches. A novel class of probabilistic graphical models, the forest of latent tree models, has been proposed to obtain a trade-off between faithful modeling of data dependences and tractability. The work reported here evaluates the soundness of our model in the context of association genetics. We performed intensive tests, under various controlled conditions, on realistic simulated data. Besides guaranteeing data dimension reduction through latent variables, the model is shown empirically to capture indirect genetic associations with the disease. In the model, strong associations are observed between the disease and the ancestor nodes of the causal genetic marker node. In contrast, very weak associations are obtained for other nodes.
Contributor : Christine Sinoquet
Submitted on : Monday, December 9, 2013 - 12:16:32 AM
Last modification on : Thursday, January 17, 2019 - 10:40:04 AM


  • HAL Id : hal-00915538, version 1



Christine Sinoquet, Raphaël Mourad, Philippe Leray. Modeling of genotype data with forests of latent trees to detect genetic causes of diseases. Ado2013 (Machine Learning and Omics Data), Dec 2013, Lille, France. 6 p. ⟨hal-00915538⟩


