Fast estimation of posterior probabilities in change-point models through a constrained hidden Markov model

The Minh Luong; Yves Rozenholc; Gregory Nuel

Conference paper, Year: 2012


The Minh Luong

Role: Author

Yves Rozenholc

Role: Author
PersonId : 2258
IdHAL : yvesrozenholc

Mathématiques Appliquées Paris 5

Gregory Nuel

Role: Author
PersonId : 969781
IdHAL : gregory-nuel
ORCID : 0000-0001-9910-2354
IdRef : 117402117

Mathématiques Appliquées Paris 5

Abstract

The detection of change-points in heterogeneous sequences is a statistical challenge with applications across a wide variety of fields. In bioinformatics, a large body of methodology has been developed to identify an ideal set of change-points for detecting Copy Number Variation (CNV), and numerous efficient algorithms are available for finding the best segmentation of CNV data. However, relatively few approaches address the important problem of assessing the uncertainty of the change-point locations. Because they have quadratic complexity, these approaches are typically intractable for large datasets of tens of thousands of points or more. In this paper, we assess uncertainty through a constrained hidden Markov model with a fixed number of segments, each a run of contiguous observations sharing the same distribution. Forward-backward algorithms for this model estimate the posterior probabilities of interest with linear complexity. The methods are implemented in the R package postCP, which uses the results of a given change-point detection algorithm to estimate the probability that each observation is a change-point. We present the results of the package on a sequence of Poisson-generated data and on a smaller, publicly available CNV data set (n=120). Due to its frequentist framework, postCP obtains less conservative confidence intervals than previously published Bayesian methods, with linear rather than quadratic complexity. On a high-resolution data set (n=14,241), the implementation ran in less than one second on a mid-range laptop computer.
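
To make the idea of a constrained hidden Markov model concrete, the following Python sketch runs a forward-backward pass over a left-to-right chain of K segment states. It is an illustration only and is not the postCP implementation or its API: the function names `segment_means` and `change_point_posteriors` are hypothetical, the emissions are assumed Poisson with strictly positive means estimated from an initial segmentation, and every segmentation into K segments is assumed equally likely a priori (unit transition weights for "stay" and "advance").

```python
import math

def segment_means(x, ends):
    """Mean of each segment; `ends` lists the last index of every segment but the final one."""
    bounds = [0] + [e + 1 for e in ends] + [len(x)]
    return [sum(x[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]

def log_poisson(x, lam):
    # Log of the Poisson probability mass function; assumes lam > 0.
    return x * math.log(lam) - lam - math.lgamma(x + 1)

def logsumexp2(a, b):
    # Numerically stable log(exp(a) + exp(b)), tolerating -inf inputs.
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def change_point_posteriors(x, means):
    """Return post[k][i] = P(segment k ends at index i | x), for k = 0..K-2.

    Assumption: the hidden state starts in segment 0, ends in segment K-1,
    and at each step either stays or advances by one, with equal weight.
    Cost is O(n * K), i.e. linear in the sequence length n.
    """
    n, K = len(x), len(means)
    NEG = -math.inf
    e = [[log_poisson(xi, lam) for lam in means] for xi in x]

    # Forward: F[i][k] = log weight of x[0..i] with observation i in segment k.
    F = [[NEG] * K for _ in range(n)]
    F[0][0] = e[0][0]
    for i in range(1, n):
        for k in range(K):
            advance = F[i - 1][k - 1] if k > 0 else NEG
            F[i][k] = logsumexp2(F[i - 1][k], advance) + e[i][k]

    # Backward: B[i][k] = log weight of x[i+1..n-1] given observation i in
    # segment k, restricted to paths that finish in the last segment.
    B = [[NEG] * K for _ in range(n)]
    B[n - 1][K - 1] = 0.0
    for i in range(n - 2, -1, -1):
        for k in range(K):
            stay = B[i + 1][k] + e[i + 1][k]
            advance = B[i + 1][k + 1] + e[i + 1][k + 1] if k + 1 < K else NEG
            B[i][k] = logsumexp2(stay, advance)

    log_z = F[n - 1][K - 1]  # log of the total weight of all valid segmentations
    post = []
    for k in range(K - 1):
        row = []
        for i in range(n - 1):
            lw = F[i][k] + e[i + 1][k + 1] + B[i + 1][k + 1] - log_z
            row.append(math.exp(lw))
        post.append(row)
    return post

# Usage: one change-point, initial guess after index 4.
x = [1, 2, 1, 0, 2, 9, 11, 10, 8, 12]
means = segment_means(x, [4])
post = change_point_posteriors(x, means)
best = max(range(len(post[0])), key=post[0].__getitem__)
print(best, round(post[0][best], 3))
```

Each row of `post` is a probability distribution over the location of one change-point, so an interval covering a chosen share of that mass gives one way to express location uncertainty. The two passes each visit every observation once per state, which is where the linear cost in n comes from.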

Domains

Applications [stat.AP]


https://hal.science/hal-00712343

Submitted on: Wednesday, June 27, 2012, 00:33:53

Last modified on: Thursday, April 11, 2024, 13:16:13

Dates and versions

hal-00712343, version 1 (27-06-2012)

Identifiers

HAL Id: hal-00712343, version 1

Cite

The Minh Luong, Yves Rozenholc, Gregory Nuel. Fast estimation of posterior probabilities in change-point models through a constrained hidden Markov model. JOBIM: Journée Ouverte en Biologie, Informatique et Mathématiques, Jul 2012, Rennes, France. ⟨hal-00712343⟩


Collections

CNRS MAP5 UP-SCIENCES
