Conference paper, Year: 2003

Pattern discovery allowing gaps, substitution matrices and multiple score functions

Abstract

Pattern discovery has many applications in finding functionally or structurally important regions in biological sequences (binding sites, regulatory sites, protein signatures, etc.). In this paper we present a new pattern discovery algorithm with the following features:
- it finds, in exactly the same manner and without any prior specification, both contiguous patterns and patterns with fixed-length gaps (i.e. runs of one or more consecutive wildcards);
- it allows the use of any pairwise score function, thus offering multiple ways to define or constrain the type of pattern searched for; in particular, one can use substitution matrices (PAM, BLOSUM) to compare amino acids, exact matching to compare nucleotides, or equivalence sets in both cases.
We describe the algorithm, compare it with other algorithms, and report test results on discovering binding sites for the DNA-binding proteins ArgR, LexA, PurR and TyrR in E. coli, and promoter sites in a set of dicot plants.
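To make the idea of a pluggable pairwise score function concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it only scores an already known pattern, with wildcards standing for fixed-length gaps, against windows of a sequence. All names (`exact_match`, `equivalence_score`, `matrix_score`, `window_score`, `occurrences`), the `.` wildcard symbol, the threshold semantics and the toy matrix values are assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# A pairwise score function maps two symbols to a number (assumed interface).
Score = Callable[[str, str], float]
WILDCARD = "."  # hypothetical symbol for one gap position


def exact_match(a: str, b: str) -> float:
    """Score 1 for identical symbols, 0 otherwise (e.g. for nucleotides)."""
    return 1.0 if a == b else 0.0


def equivalence_score(classes: List[FrozenSet[str]]) -> Score:
    """Build a score function that treats symbols in the same class as equal."""
    def score(a: str, b: str) -> float:
        if a == b:
            return 1.0
        return 1.0 if any(a in c and b in c for c in classes) else 0.0
    return score


def matrix_score(matrix: Dict[Tuple[str, str], float], default: float = 0.0) -> Score:
    """Build a score function from a symmetric substitution table."""
    def score(a: str, b: str) -> float:
        return matrix.get((a, b), matrix.get((b, a), default))
    return score


def window_score(pattern: str, window: str, score: Score) -> float:
    """Sum pairwise scores over non-wildcard positions of the pattern."""
    if len(pattern) != len(window):
        raise ValueError("pattern and window must have the same length")
    return sum(score(p, w) for p, w in zip(pattern, window) if p != WILDCARD)


def occurrences(pattern: str, sequence: str, score: Score, threshold: float) -> List[int]:
    """Return start positions whose window scores at least `threshold`."""
    n = len(pattern)
    return [
        i
        for i in range(len(sequence) - n + 1)
        if window_score(pattern, sequence[i:i + n], score) >= threshold
    ]


# Toy values for illustration only; these are NOT real PAM or BLOSUM entries.
toy_matrix = {("L", "L"): 4.0, ("I", "I"): 4.0, ("L", "I"): 2.0}
purine_pyrimidine = equivalence_score([frozenset("AG"), frozenset("CT")])

print(occurrences("TG..CA", "ATGAACATGTTCA", exact_match, 4))
```

Swapping `exact_match` for `purine_pyrimidine` or `matrix_score(toy_matrix)` changes what counts as a match without touching `window_score` or `occurrences`, which is the kind of flexibility the abstract describes for the score function.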
Main file
PatternW.pdf (199.67 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00487245, version 1 (28-05-2010)

Identifiers

  • HAL Id: hal-00487245, version 1

Cite

Alban Mancheron, Irena Rusu. Pattern discovery allowing gaps, substitution matrices and multiple score functions. Workshop on Algorithms in BioInformatics (WABI), Sep 2003, Budapest, Hungary. pp.129-145. ⟨hal-00487245⟩
