Conference paper, Year: 2017

Exploration of de Bruijn Graph Filtering for de novo Assembly Using GraphLab

Abstract

The emergence of next-generation DNA sequencers has raised interest in short-read de novo assembly of whole genomes. Although numerous frameworks have been developed in the field, the presence of errors in reads and the increasing size of datasets call for scalable preprocessing methods for noise filtering. In this paper we present a filtering algorithm that aims to determine the valid k-mers in a de Bruijn graph built from short reads. Such preprocessing helps increase accuracy and reduce the memory footprint of subsequent assembly procedures by removing erroneous k-mers from the datasets at an early stage. The algorithm leverages GraphLab, a scalable graph processing framework not previously used in traditional assembly toolchains. Its accuracy was evaluated on synthetic datasets exhibiting various error rates, and it was shown to determine large parts of the de Bruijn graph even on datasets whose error level is greater than that of real-life datasets. The implementation is executed on a distributed cluster; a study of its scalability and operating performance exhibits interesting scaling properties, demonstrating the relevance of GraphLab in such a context.
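To make the notion of filtering k-mers concrete, here is a minimal Python sketch. It is not the GraphLab algorithm of the paper, whose details the abstract does not give. It uses a plain abundance threshold instead, a simple and common heuristic that treats k-mers seen too rarely as likely sequencing errors. The function names, the toy reads, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def filter_kmers(counts, min_count):
    """Keep k-mers seen at least min_count times (hypothetical threshold)."""
    return {kmer for kmer, c in counts.items() if c >= min_count}


def de_bruijn_edges(kmers):
    """Map each retained k-mer to an edge from its (k-1)-prefix to its (k-1)-suffix."""
    return sorted((kmer[:-1], kmer[1:]) for kmer in kmers)


# Toy input (assumed): the third read differs from the others by one substitution.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
valid = filter_kmers(counts, min_count=2)
edges = de_bruijn_edges(valid)
```

On this toy input, the two k-mers introduced by the substitution each occur once and are dropped, so the graph is built only from the k-mers supported by at least two reads.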

Dates and versions

hal-01679193, version 1 (09-01-2018)

Cite

Julien Collet, Tanguy Sassolas, Yves Lhuillier, Renaud Sirdey, Jacques Carlier. Exploration of de Bruijn Graph Filtering for de novo Assembly Using GraphLab. 31st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2017), May 2017, Orlando, United States. pp.530-539, ⟨10.1109/IPDPSW.2017.102⟩. ⟨hal-01679193⟩