Ordered Index Seed Algorithm for Intensive DNA Sequence Comparison

Dominique Lavenier 1
1 SYMBIOSE - Biological systems and models, bioinformatics and sequences
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This paper presents a seed-based algorithm for intensive DNA sequence comparison. The novelty lies in the way seeds are used to efficiently generate small ungapped alignments, or HSPs (High Scoring Pairs), in the first stage of the search. W-nt words are first indexed, and all 4^W possible seeds are then enumerated in a strict order that ensures fast generation of unique HSPs. A prototype, written in C, has been implemented and tested on large DNA banks. Speed-ups over BLASTN range from a factor of 5 to a factor of 28, with comparable sensitivity.
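
The abstract does not spell out the seed order or the extension procedure, so the following is only a minimal sketch of the general idea under stated assumptions: seeds are enumerated in lexicographic order (A < C < G < T), extension is restricted to exact matches rather than scored ungapped alignment, and an extension is abandoned as soon as it runs over a word that ranks earlier in the enumeration. The function names `index_words`, `extend` and `ordered_seed_hsps` are hypothetical, and Python is used for brevity even though the prototype is written in C.

```python
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"  # assumed seed order: lexicographic, A < C < G < T


def index_words(seq, w):
    """Map each w-nt word to the list of positions where it occurs in seq."""
    index = defaultdict(list)
    for pos in range(len(seq) - w + 1):
        index[seq[pos:pos + w]].append(pos)
    return index


def extend(seq1, seq2, i, j, w, seed):
    """Extend the seed hit (i, j) along its diagonal by exact matches.

    Returns (start1, start2, length), or None when the extension covers a
    word that is enumerated before `seed` (or an equal word further left),
    because that other hit is the one responsible for reporting this HSP.
    """
    start1, start2 = i, j
    while start1 > 0 and start2 > 0 and seq1[start1 - 1] == seq2[start2 - 1]:
        start1 -= 1
        start2 -= 1
        # Word newly covered on the left: ties go to the leftmost hit.
        if seq1[start1:start1 + w] <= seed:
            return None
    end1, end2 = i + w, j + w  # exclusive ends
    while end1 < len(seq1) and end2 < len(seq2) and seq1[end1] == seq2[end2]:
        end1 += 1
        end2 += 1
        # Word newly covered on the right: only a strictly smaller word wins.
        if seq1[end1 - w:end1] < seed:
            return None
    return (start1, start2, end1 - start1)


def ordered_seed_hsps(seq1, seq2, w):
    """Enumerate all 4**w seeds in order and collect each HSP exactly once."""
    idx1, idx2 = index_words(seq1, w), index_words(seq2, w)
    hsps = []
    for letters in product(ALPHABET, repeat=w):
        seed = "".join(letters)
        for i in idx1.get(seed, ()):
            for j in idx2.get(seed, ()):
                hsp = extend(seq1, seq2, i, j, w, seed)
                if hsp is not None:
                    hsps.append(hsp)
    return hsps


if __name__ == "__main__":
    for hsp in ordered_seed_hsps("ACGTACGT", "TTACGTACGG", 3):
        print(hsp)
```

In this simplified setting every maximal exact match of length at least W is reported by exactly one seed hit (the leftmost occurrence of its smallest word), so no table of already-seen HSPs is needed to filter duplicates. A real implementation would replace the exact-match test with a scored ungapped extension.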

Cited literature [15 references]

https://hal.archives-ouvertes.fr/hal-00322696
Contributor : Dominique Lavenier
Submitted on : Thursday, September 18, 2008 - 2:24:38 PM
Last modification on : Friday, November 16, 2018 - 1:24:01 AM
Long-term archiving on : Friday, June 4, 2010 - 11:32:55 AM

File

lavenier_1569087275.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00322696, version 1

Citation

Dominique Lavenier. Ordered Index Seed Algorithm for Intensive DNA Sequence Comparison. HiCOMB 2008 : Seventh IEEE International Workshop on High Performance Computational Biology, Apr 2008, Miami, United States. Online proceedings : http://www.hicomb.org/HiCOMB2008/. ⟨hal-00322696⟩
