SpecTrees: an efficient without a priori data structure for MS/MS spectra identification

Abstract: Tandem Mass Spectrometry (or MS/MS) is the most common strategy used to identify unknown proteins present in a mixture. It generates thousands of MS/MS spectra per sample, each of which has to be compared to a large reference database from which artificial spectra are produced. The goal is to map each experimental spectrum to an artificial one, so as to identify the proteins they come from. However, this comparison step is highly time-consuming. Thus, in order to reduce computation time, most methods filter the reference database a priori. This tends to discard potential candidates and leads to frequent errors and missed identifications. We have developed an original alternative method, efficient in terms of both memory and computation time, that allows pairwise comparison of spectra without any a priori filtering. The core of our method is SpecTrees, a data structure designed for this purpose, which stores all the input spectra without any filtering. It is designed to be easy to implement, and is also highly scalable and incremental. Once SpecTrees is built, users can run their own identification process by extracting from SpecTrees any information of interest, including pairwise spectra comparisons. In this paper, we first present SpecTrees, its main features and how to implement it. We then evaluate our method on two sets of experimental spectra from the ISB standard 18 proteins mixture, thereby showing its speed and its ability to make identifications that other software tools miss.
Document type:
Conference paper
Martin Frith and Christian N. S. Pedersen. 16th Workshop on Algorithms in Bioinformatics (WABI 2016), Aug 2016, Aarhus, Denmark. Springer-Verlag, Lecture Notes in Bioinformatics. 〈http://conferences.au.dk/algo16/wabi/〉

https://hal.archives-ouvertes.fr/hal-01337288
Contributor: Guillaume Fertin
Submitted on: Friday, 24 June 2016 - 18:34:43
Last modified on: Thursday, 5 April 2018 - 10:36:49

Identifiers

  • HAL Id: hal-01337288, version 1

Citation

Matthieu David, Guillaume Fertin, Dominique Tessier. SpecTrees: an efficient without a priori data structure for MS/MS spectra identification. Martin Frith and Christian N. S. Pedersen. 16th Workshop on Algorithms in Bioinformatics (WABI 2016), Aug 2016, Aarhus, Denmark. Springer-Verlag, Lecture Notes in Bioinformatics. 〈http://conferences.au.dk/algo16/wabi/〉. 〈hal-01337288〉
