Journal article, Algorithms for Molecular Biology, Year: 2014

Segmentor3IsBack : an R package for the fast and exact segmentation of Seq-data

Alice Cleynen
Michel Koskas
Guillem Rigaill
Stephane Robin

Abstract

Background: Change point problems arise in many genomic analyses, such as the detection of copy number variations or of transcribed regions. Expanding Next Generation Sequencing technologies now make it possible to locate change points at nucleotide resolution.

Results: We propose to use the Pruned Dynamic Programming (PDP) algorithm on Seq-experiment outputs, because its complexity is almost linear in the sequence length when the maximal number of segments is constant, and because its performance has already been acknowledged on microarray data. This requires adapting the algorithm to the negative binomial distribution with which we model the data. We show that the PDP algorithm can be used if the dispersion of the signal is known, and we provide an estimator for this dispersion. We describe a compression framework that reduces the time complexity without modifying the accuracy of the segmentation. We propose to estimate the number of segments via a penalized likelihood criterion. We illustrate the performance of the proposed methodology on RNA-Seq data.

Conclusions: We illustrate the results of our approach on a real dataset and show its good performance. Our algorithm is available as an R package on the CRAN repository.
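To make the abstract's ingredients concrete (a negative binomial model with known dispersion, an exact dynamic-programming segmentation, and a penalized choice of the number of segments), here is a minimal Python sketch. It is not the package's implementation: it uses the plain O(K n²) dynamic program rather than the pruned algorithm, the dispersion `phi` is simply assumed known, and the BIC-style penalty `beta * k * log(n)` is a stand-in for the paper's criterion. The names `nb_cost`, `segment`, and `select_k` are hypothetical.

```python
import math

def nb_cost(total, length, phi):
    """Negative log-likelihood of one segment under NB(phi, p), with p at its
    maximum-likelihood value. Terms that do not depend on p are dropped,
    since they are identical for every segmentation."""
    if total == 0:
        return 0.0  # p_hat = 1, so both log terms vanish
    p_hat = phi * length / (phi * length + total)
    return -(length * phi * math.log(p_hat) + total * math.log(1.0 - p_hat))

def segment(counts, k_max, phi):
    """Exact segmentation by plain (unpruned) dynamic programming.
    Returns {k: (cost, segment_end_indices)} for k = 1..k_max."""
    n = len(counts)
    prefix = [0]
    for y in counts:
        prefix.append(prefix[-1] + y)  # prefix sums give O(1) segment totals

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k_max + 1)]
    arg = [[0] * (n + 1) for _ in range(k_max + 1)]
    best[0][0] = 0.0
    for k in range(1, k_max + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):  # last segment covers counts[s:t]
                c = best[k - 1][s] + nb_cost(prefix[t] - prefix[s], t - s, phi)
                if c < best[k][t]:
                    best[k][t] = c
                    arg[k][t] = s

    results = {}
    for k in range(1, min(k_max, n) + 1):
        ends, t = [], n
        for j in range(k, 0, -1):  # backtrack through the stored arguments
            ends.append(t)
            t = arg[j][t]
        results[k] = (best[k][n], sorted(ends))
    return results

def select_k(results, n, beta=1.0):
    """Pick k minimizing cost + beta * k * log(n) (an assumed BIC-style penalty)."""
    return min(results, key=lambda k: results[k][0] + beta * k * math.log(n))

# Toy counts with two visible level changes (illustrative data, not from the paper).
counts = [2, 1, 3, 2, 2, 50, 48, 53, 51, 3, 2, 2]
fits = segment(counts, k_max=5, phi=5.0)
k_hat = select_k(fits, len(counts))
print(k_hat, fits[k_hat][1])
```

Each entry of `fits` holds the optimal cost for a given number of segments together with the exclusive end index of every segment, so the same table serves both the fixed-K segmentation and the penalized model selection step.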
Main file
2014_Cleynen_Algorithms for Molecular Biology_1.pdf (1004.33 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01197619, version 1 (27-05-2020)

License

Attribution

Identifiers

Cite

Alice Cleynen, Michel Koskas, Guillem Rigaill, Stephane Robin. Segmentor3IsBack : an R package for the fast and exact segmentation of Seq-data. Algorithms for Molecular Biology, 2014, 9, 11 p. ⟨10.1186/1748-7188-9-6⟩. ⟨hal-01197619⟩
