Parallelising the Computation of Minimal Absent Words

Abstract : An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their computation also provides a fast alternative for measuring approximation in sequence comparison. There exists an O(n)-time and O(n)-space algorithm for computing all minimal absent words on a fixed-size alphabet, based on the construction of the suffix array (Barton et al., 2014). An implementation of this algorithm was also provided by the authors and is currently the fastest available. In this article, we present a new O(n)-time and O(n)-space algorithm for computing all minimal absent words; it has the desirable property that, once the indexing data structure has been built, the computation of minimal absent words can be executed in parallel. Experimental results show that a multiprocessing implementation of this algorithm can accelerate the overall computation by more than a factor of two compared to state-of-the-art approaches. When the construction time of the indexing data structure is excluded, the implementation achieves near-optimal speed-ups.
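
To make the definition above concrete, here is a minimal brute-force sketch in Python. It is not the O(n)-time algorithm of the paper and makes no use of a suffix array; it only illustrates what is being computed. The function names `factors` and `minimal_absent_words` and the optional `alphabet` parameter are assumptions introduced for this illustration. It relies on the observation that a word aub (a and b letters) is a minimal absent word exactly when au and ub occur in y but aub does not.

```python
def factors(y):
    """Return the set of all factors (substrings) of y, including the empty word."""
    return {y[i:j] for i in range(len(y) + 1) for j in range(i, len(y) + 1)}


def minimal_absent_words(y, alphabet=None):
    """Naively compute the minimal absent words of y.

    `alphabet` is a hypothetical parameter for this sketch; it defaults to
    the set of letters that occur in y.
    """
    sigma = sorted(set(alphabet) if alphabet is not None else set(y))
    occurring = factors(y)

    # A letter absent from y is minimal: its only proper factor is the empty word.
    maws = {c for c in sigma if c not in occurring}

    # For every factor u, extend by one letter on each side.
    # a+u+b is minimal absent iff a+u and u+b occur in y but a+u+b does not.
    for u in occurring:
        for a in sigma:
            if a + u not in occurring:
                continue
            for b in sigma:
                if u + b in occurring and a + u + b not in occurring:
                    maws.add(a + u + b)
    return maws


if __name__ == "__main__":
    print(sorted(minimal_absent_words("abaab")))
```

Each factor u is examined independently of the others in this sketch, which gives an informal sense of why such a computation can be split across workers; the enumeration itself is far from linear time and is suitable only for short words.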

https://hal.archives-ouvertes.fr/hal-01255489
Contributor : Alice Heliou
Submitted on : Wednesday, January 13, 2016 - 5:39:15 PM
Last modification on : Friday, May 10, 2019 - 8:22:07 AM
Long-term archiving on : Friday, November 11, 2016 - 4:50:22 AM

File

CP55.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Carl Barton, Alice Héliou, Laurent Mouchard, Solon P. Pissis. Parallelising the Computation of Minimal Absent Words. PPAM 2015, Sep 2015, Kraków, Poland. ⟨10.1007/978-3-319-32152-3_23⟩. ⟨hal-01255489⟩
