Alternative Methods for H1 Simulations in Genome-Wide Association Studies

Vittorio Perduca; Christine Sinoquet; Raphaël Mourad; Gregory Nuel

doi:10.1159/000336194

Journal article, Human Heredity, Year: 2012

Vittorio Perduca

Role: Corresponding author
PersonId: 11396
IdHAL: vittorio-perduca
ORCID: 0000-0003-0339-0473

Mathématiques Appliquées Paris 5

Christine Sinoquet

Role: Author
PersonId: 171562
IdHAL: christine-sinoquet
IdRef: 033239029

Laboratoire d'Informatique de Nantes Atlantique

Raphaël Mourad

Role: Author
PersonId: 967866
ORCID: 0000-0001-6700-5728

Laboratoire d'Informatique de Nantes Atlantique

Gregory Nuel

Role: Author
PersonId: 969781
IdHAL: gregory-nuel
ORCID: 0000-0001-9910-2354
IdRef: 117402117

Mathématiques Appliquées Paris 5

Abstract

Objective: Assessing the statistical power to detect susceptibility variants plays a critical role in genome-wide association (GWA) studies, from both the prospective and the retrospective points of view. Power is estimated empirically by simulating phenotypes under a disease model H1. For this purpose, the gold standard consists in simulating genotypes given the phenotypes (e.g. Hapgen). We introduce here an alternative approach for simulating phenotypes under H1 that does not require generating new genotypes for each simulation.

Methods: In order to simulate phenotypes with a fixed total number of cases and under a given disease model, we suggest 3 algorithms: (1) a simple rejection algorithm; (2) a numerical Markov chain Monte Carlo (MCMC) approach, and (3) an exact and efficient backward sampling algorithm. We validated the 3 algorithms both on a simulated dataset and by comparing them with Hapgen on a more realistic dataset. As an application, we then conducted a simulation study on a 1000 Genomes Project dataset consisting of 629 individuals (314 cases) and 8,048 SNPs from chromosome X. We arbitrarily defined an additive disease model with two susceptibility SNPs and an epistatic effect.

Results: The 3 algorithms are consistent, but backward sampling is dramatically faster than the other two. Our approach also gives results consistent with Hapgen. Using our application data, we showed that our limited design requires prior biological knowledge to limit the investigated region. We also showed that epistatic effects can play a significant role even when simple marker statistics (e.g. trend) are used. Finally, we showed that the overall performance of a GWA study strongly depends on the prevalence of the disease: the larger the prevalence, the better the power.

Conclusions: Our approach is a valid alternative to Hapgen-type methods; it is not only dramatically faster but has 2 main advantages: (1) there is no need for sophisticated genotype models (e.g. haplotype frequencies or recombination rates), and (2) the choice of the disease model is completely unconstrained (number of SNPs involved, gene-environment interactions, hybrid genetic models, etc.). Our 3 algorithms are available in an R package called 'waffect' ('double-u affect', for weighted affectations).
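To make the sampling problem in the abstract concrete, here is a minimal Python sketch of two of the three ideas: a rejection sampler and a backward (dynamic programming) sampler that draws phenotypes conditional on a fixed number of cases. It is an illustration under stated assumptions, not the code of the 'waffect' package. The function names, the logistic form of `case_probability`, and its coefficients are all hypothetical; the abstract does not give the exact disease model used in the paper.

```python
import math
import random

def case_probability(g1, g2, beta0=-1.0, beta1=0.5, beta2=0.5, beta12=0.3):
    """Hypothetical disease model (an assumption, not the paper's): logistic
    link with additive effects of two SNPs plus an epistatic interaction."""
    eta = beta0 + beta1 * g1 + beta2 * g2 + beta12 * g1 * g2
    return 1.0 / (1.0 + math.exp(-eta))

def rejection_sample(p, n_cases, rng, max_tries=100_000):
    """Draw independent Bernoulli phenotypes until exactly n_cases are cases."""
    for _ in range(max_tries):
        y = [1 if rng.random() < pi else 0 for pi in p]
        if sum(y) == n_cases:
            return y
    raise RuntimeError("no draw with the requested number of cases")

def backward_sample(p, n_cases, rng):
    """Exact draw of phenotypes y conditional on sum(y) == n_cases."""
    n = len(p)
    # Backward pass: B[i][k] = P(y_i + ... + y_{n-1} == k) when the y_j are
    # independent Bernoulli(p_j) variables.
    B = [[0.0] * (n_cases + 1) for _ in range(n + 1)]
    B[n][0] = 1.0
    for i in range(n - 1, -1, -1):
        for k in range(n_cases + 1):
            B[i][k] = (1.0 - p[i]) * B[i + 1][k]
            if k > 0:
                B[i][k] += p[i] * B[i + 1][k - 1]
    if B[0][n_cases] == 0.0:
        raise ValueError("requested number of cases has probability zero")
    # Forward pass: sample each y_i given how many cases are still needed.
    y, k = [], n_cases
    for i in range(n):
        prob_case = p[i] * B[i + 1][k - 1] / B[i][k] if k > 0 else 0.0
        yi = 1 if rng.random() < prob_case else 0
        y.append(yi)
        k -= yi
    return y

# Usage: six individuals with made-up genotypes (0, 1 or 2 minor alleles).
rng = random.Random(42)
genotypes = [(0, 0), (1, 0), (2, 1), (0, 2), (1, 1), (2, 2)]
p = [case_probability(g1, g2) for g1, g2 in genotypes]
y_backward = backward_sample(p, 3, rng)
y_rejection = rejection_sample(p, 3, rng)
# Both vectors contain exactly 3 cases.
```

The backward sampler needs one pass of cost proportional to (number of individuals) x (number of cases) and never discards a draw. The rejection sampler may need many attempts when the requested case count is unlikely under the model. For large cohorts the entries of `B` can become very small, so a production version would probably work with logarithms or rescaling.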

Keywords

ROC, Receiver operating characteristic, Statistical power, Disease model, Rejection, Markov chain Monte Carlo, MCMC, Backward sampling, Area under the curve, AUC

Domains

Probability [math.PR]; Bioinformatics, Systems Biology [q-bio.QM]; Bioinformatics [q-bio.QM]

Contributor: Grégory Nuel

https://hal.science/hal-00686364

Submitted on: Tuesday, 10 April 2012, 09:18:24

Last modified on: Thursday, 11 April 2024, 13:16:13

Dates and versions

hal-00686364, version 1 (10-04-2012)

Identifiers

HAL Id: hal-00686364, version 1
arXiv: 1201.5046
DOI: 10.1159/000336194
PubMed: 22472690

Cite

Vittorio Perduca, Christine Sinoquet, Raphaël Mourad, Gregory Nuel. Alternative Methods for H1 Simulations in Genome-Wide Association Studies. Human Heredity, 2012, 73 (2), pp.95-104. ⟨10.1159/000336194⟩. ⟨hal-00686364⟩

Collections

UNIV-NANTES CNRS LINA LINA-COD MAP5 LINA-DUKE LS2N UP-SCIENCES NANTES-UNIVERSITE
