GATB: Toolbox for developing efficient NGS software

Erwan Drezen (1), G Rizk (1), R Chikhi (2), Charles Deltel (1), C Lemaitre (1), P Peterlongo (1), D Lavenier (1)
1 GenScale - Scalable, Optimized and Parallel Algorithms for Genomics
IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE, Inria Rennes – Bretagne Atlantique
Abstract: The analysis of NGS data remains a time- and space-consuming task. Many efforts have been made to provide efficient data structures for indexing the terabytes of data generated by fast sequencing machines (suffix array, Burrows-Wheeler transform, Bloom filter, etc.). Mappers, genome assemblers, SNP callers, and similar tools make intensive use of these data structures to keep their memory footprint as low as possible. The overall efficiency of NGS software comes from a smart combination of how data are represented in computer memory and how they are processed by the processing units available inside a processor. Developing such software is thus a real challenge, as it requires a broad range of skills, from high-level data structure and algorithm concepts to fine implementation details. The GATB software toolbox aims to ease the design of NGS algorithms. It offers a set of high-level, optimized building blocks to speed up the development of NGS tools related to genome assembly and/or genome analysis. The underlying data structure is the de Bruijn graph, and the general parallelism model is multithreading. The GATB library targets standard computing resources such as current multicore processors (laptop computer, small server) with a few GB of memory. Using a high-level C++ API, designers of NGS software can rapidly build their own tools based on state-of-the-art algorithms and data structures of the domain.
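To make the central data structure concrete, here is a minimal sketch of a node-centric de Bruijn graph in standard C++. It does not use the GATB API: the names `KmerSet`, `collect_kmers`, and `successors` are hypothetical and chosen only for illustration. It also makes simplifying assumptions (k-mers stored as plain strings in a `std::set`, DNA alphabet ACGT, no reverse-complement handling, single thread), none of which reflect how GATB represents the graph.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical illustration type: the graph nodes are the distinct k-mers.
using KmerSet = std::set<std::string>;

// Collect every k-mer (substring of length k) that occurs in the reads.
KmerSet collect_kmers(const std::vector<std::string>& reads, std::size_t k) {
    assert(k > 0);
    KmerSet kmers;
    for (const std::string& read : reads) {
        // A read shorter than k contributes no k-mer.
        for (std::size_t i = 0; i + k <= read.size(); ++i) {
            kmers.insert(read.substr(i, k));
        }
    }
    return kmers;
}

// Edges are implicit: a node's successors are the k-mers in the set whose
// first k-1 characters equal the node's last k-1 characters.
std::vector<std::string> successors(const KmerSet& kmers,
                                    const std::string& node) {
    assert(!node.empty());
    const std::string alphabet = "ACGT";
    const std::string suffix = node.substr(1);
    std::vector<std::string> out;
    for (char c : alphabet) {
        const std::string candidate = suffix + c;
        if (kmers.count(candidate) != 0) {
            out.push_back(candidate);
        }
    }
    return out;
}
```

For example, calling `collect_kmers` on the single read `ACGTAC` with `k = 3` yields the four nodes `ACG`, `CGT`, `GTA`, and `TAC`, and `successors` of `ACG` is the single node `CGT`. Because edges are derived from set membership rather than stored, the node set is the only thing that has to fit in memory, which is why compact membership structures such as the Bloom filter mentioned above pair naturally with this graph.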

https://hal.archives-ouvertes.fr/hal-01088828
Contributor: Dominique Lavenier
Submitted on: Thursday, December 4, 2014 - 11:10:55 AM
Last modification on: Thursday, August 22, 2019 - 12:04:02 PM
Long-term archiving on: Monday, March 9, 2015 - 5:55:02 AM

File

BSB_Poster.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01088828, version 1

Citation

Erwan Drezen, G Rizk, R Chikhi, Charles Deltel, C Lemaitre, et al. GATB: Toolbox for developing efficient NGS software. 9th Brazilian Symposium on Bioinformatics, BSB 2014, Oct 2014, Belo Horizonte, Brazil. ⟨hal-01088828⟩