Journal articles

Fast Inference of Individual Admixture Coefficients Using Geographic Data

Kevin Caye (1), Flora Jay (2, 3, 4), Olivier Michel (5), Olivier François (1)
1 TIMC-IMAG-BCM - Biologie Computationnelle et Mathématique
TIMC-IMAG - Techniques de l'Ingénierie Médicale et de la Complexité - Informatique, Mathématiques et Applications, Grenoble - UMR 5525
2 BioInfo - LRI - Bioinformatique (LRI)
LRI - Laboratoire de Recherche en Informatique
4 TAU - TAckling the Underspecified
LRI - Laboratoire de Recherche en Informatique, Inria Saclay - Ile de France
Abstract: Accurately evaluating the distribution of genetic ancestry across geographic space is one of the main questions addressed by evolutionary biologists. This question has commonly been addressed through Bayesian estimation programs that allow their users to estimate individual admixture proportions and allele frequencies among putative ancestral populations. Following the explosion of high-throughput sequencing technologies, several algorithms have been proposed to cope with the computational burden generated by the massive data in those studies. In this context, incorporating geographic proximity in ancestry estimation algorithms is an open statistical and computational challenge. In this study, we introduce new algorithms that use geographic information to estimate ancestry proportions and ancestral genotype frequencies from population genetic data. Our algorithms combine matrix factorization methods and spatial statistics to provide estimates of ancestry matrices based on least-squares approximation. We demonstrate the benefit of using spatial algorithms through extensive computer simulations, and we provide an example application of our new algorithms to a set of spatially referenced samples for the plant species Arabidopsis thaliana. Without loss of statistical accuracy, the new algorithms exhibit runtimes that are much shorter than those observed for previously developed spatial methods. Our algorithms are implemented in the R package tess3r.
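
To make the idea of combining matrix factorization with spatial statistics more concrete, the following is a minimal Python sketch of a spatially regularized least-squares factorization. It is not the authors' algorithm and not the tess3r API. The function names, the Gaussian kernel on geographic distances, the parameters `alpha` and `sigma`, the 0/1 genotype encoding, and the projected-gradient updates are all assumptions chosen for illustration. The sketch approximates a genotype matrix X by the product Q G, where each row of Q holds ancestry proportions (non-negative, summing to 1) and G holds ancestral frequencies in [0, 1], while a graph Laplacian penalty encourages geographically close individuals to have similar ancestry.

```python
import numpy as np


def graph_laplacian(coords, sigma=1.0):
    """Laplacian of a Gaussian-kernel graph on geographic coordinates (assumed kernel)."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W


def project_simplex(Q):
    """Euclidean projection of each row of Q onto the probability simplex."""
    n, K = Q.shape
    U = np.sort(Q, axis=1)[:, ::-1]            # each row sorted in decreasing order
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, K + 1)
    rho = (U - css / idx > 0).sum(axis=1)      # size of the support, at least 1
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(Q - theta[:, None], 0.0)


def spatial_factorization(X, coords, K, alpha=0.1, n_iter=200, seed=0):
    """Alternating projected-gradient steps on
    0.5 * ||X - Q G||_F^2 + 0.5 * alpha * trace(Q^T Lap Q).
    Illustrative sketch only; all names and defaults are hypothetical.
    """
    rng = np.random.default_rng(seed)
    n, L = X.shape
    Lap = graph_laplacian(coords)
    lap_norm = np.linalg.norm(Lap, 2)
    Q = project_simplex(rng.random((n, K)))
    G = rng.random((K, L))
    for _ in range(n_iter):
        # G step: gradient step with step size 1/Lipschitz, then clip to [0, 1]
        step_g = 1.0 / (np.linalg.norm(Q.T @ Q, 2) + 1e-12)
        G = np.clip(G - step_g * (Q.T @ (Q @ G - X)), 0.0, 1.0)
        # Q step: gradient of data fit plus spatial penalty, then project rows
        step_q = 1.0 / (np.linalg.norm(G @ G.T, 2) + alpha * lap_norm + 1e-12)
        grad_q = (Q @ G - X) @ G.T + alpha * (Lap @ Q)
        Q = project_simplex(Q - step_q * grad_q)
    return Q, G
```

Calling `spatial_factorization(X, coords, K=2)` with an individuals-by-loci matrix `X` and an individuals-by-2 coordinate array `coords` returns the estimated `Q` and `G`. Setting `alpha=0` drops the geographic term and reduces the sketch to a plain constrained least-squares factorization, which is one way to see what the spatial penalty contributes.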

https://hal.archives-ouvertes.fr/hal-01676712
Contributor: Flora Jay
Submitted on: Saturday, January 6, 2018 - 2:35:13 AM
Last modification on: Friday, February 4, 2022 - 3:22:42 AM
Long-term archiving on: Saturday, April 7, 2018 - 12:49:47 PM

File

AOAS1610-012R2A0.pdf
Files produced by the author(s)

Citation

Kevin Caye, Flora Jay, Olivier Michel, Olivier François. Fast Inference of Individual Admixture Coefficients Using Geographic Data. Annals of Applied Statistics, Institute of Mathematical Statistics, 2018, 12 (1), pp.586-608. ⟨10.1214/17-AOAS1106⟩. ⟨hal-01676712⟩
