Using cascading Bloom filters to improve the memory usage for de Brujin graphs

Abstract : De Bruijn graphs are widely used in bioinformatics for processing next-generation sequencing (NGS) data. Because NGS datasets are very large, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. In this work, we show how to reduce the memory required by the algorithm of [3], which represents de Bruijn graphs using Bloom filters. Our method requires 30% to 40% less memory than the method of [3], with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to [3]. This is, to our knowledge, the best practical representation for de Bruijn graphs.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-00824697
Contributor : Gregory Kucherov
Submitted on : Wednesday, May 22, 2013 - 12:50:08 PM
Last modification on : Thursday, March 21, 2019 - 2:51:28 PM

Identifiers

  • HAL Id : hal-00824697, version 1
  • ARXIV : 1302.7278

Citation

Kamil Salikhov, Gustavo Sacomoto, Gregory Kucherov. Using cascading Bloom filters to improve the memory usage for de Brujin graphs. Workshop on Algorithms in Bioinformatics, Sep 2013, Sophia Antipolis, France. 13p. ⟨hal-00824697⟩
