| HAL: hal-00343945, version 2 |
| arXiv: 0812.1388 |
|
|
| Available versions: | v1 (2008-12-07) | v2 (2008-12-30) |
|
| Model-Based Clustering using multi-allelic loci data with loci selection |
|
|
Wilson Toussile 1, Elisabeth Gassiat 1 |
|
|
| (2008-11-30) |
|
|
| We propose a Model-Based Clustering (MBC) method combined with loci selection using multi-allelic loci genetic data. Loci selection is treated as a model selection problem, and the competing models are compared using the Bayesian Information Criterion (BIC). The resulting procedure selects the subset of clustering loci and the number of clusters, and estimates the proportion of each cluster and the allelic frequencies within each cluster. We prove that the selected model converges in probability to the true model under a single realistic assumption as the sample size tends to infinity. The proposed method, named MixMoGenD (Mixture Model using Genetic Data), was implemented in C++. Numerical experiments on simulated data sets were conducted to highlight the benefit of the proposed loci selection procedure. |
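The BIC comparison described in the abstract can be illustrated with a minimal sketch. It is not the MixMoGenD implementation (which is in C++), and it rests on assumptions not stated in this record: that a candidate model is a pair (number of clusters, subset of clustering loci), that a clustering locus has cluster-specific allele frequencies while every other locus has a single shared set of frequencies, and that BIC takes the form "maximised log-likelihood minus half the number of free parameters times log n". The function names `free_parameters`, `bic` and `select_model` are hypothetical, and the maximised log-likelihoods are assumed to be supplied by a separate fitting step such as EM.

```python
import math


def free_parameters(n_clusters, alleles_per_locus, selected):
    """Count free parameters of a candidate model (assumed parameterisation).

    n_clusters        : number of mixture components K
    alleles_per_locus : number of distinct alleles observed at each locus
    selected          : set of indices of loci assumed relevant for clustering
    """
    dim = n_clusters - 1  # mixing proportions sum to one
    for locus, n_alleles in enumerate(alleles_per_locus):
        if locus in selected:
            # one allele-frequency vector per cluster
            dim += n_clusters * (n_alleles - 1)
        else:
            # a single allele-frequency vector shared by all clusters
            dim += n_alleles - 1
    return dim


def bic(log_likelihood, dim, n_samples):
    """BIC in the 'larger is better' convention (an assumption of this sketch)."""
    return log_likelihood - 0.5 * dim * math.log(n_samples)


def select_model(candidates, alleles_per_locus, n_samples):
    """Return (score, n_clusters, selected_loci) for the best-scoring candidate.

    candidates: iterable of (n_clusters, selected_loci, max_log_likelihood),
    where max_log_likelihood comes from a fitting routine not shown here.
    """
    best = None
    for n_clusters, selected, log_likelihood in candidates:
        dim = free_parameters(n_clusters, alleles_per_locus, set(selected))
        score = bic(log_likelihood, dim, n_samples)
        if best is None or score > best[0]:
            best = (score, n_clusters, tuple(selected))
    return best
```

For example, `select_model([(1, [], -120.0), (2, [0, 1], -100.0)], [3, 4, 2], 100)` compares a single-population model with a two-cluster model that uses the first two loci. The log-likelihood values here are made up purely for illustration.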
|
| 1: | Laboratoire de Mathématiques d'Orsay (LM-Orsay) |
| CNRS : UMR8628 – Université Paris XI - Paris Sud | |
|
| Subject | : | Statistics/Applications – Mathematics/Statistics – Statistics/Statistics Theory |
|
|
| Model-Based Clustering – Model Selection – Variable Selection – Bayesian Information Criterion – Population Genetics |
|
|
|
|
|
| hal-00343945, version 2 | |
| http://hal.archives-ouvertes.fr/hal-00343945 | |
| oai:hal.archives-ouvertes.fr:hal-00343945 | |
| From: Wilson Toussile | |
| Submitted on: Tuesday, 30 December 2008 12:03:21 | |
| Updated on: Tuesday, 30 December 2008 18:08:25 | |