Journal article, International Journal of Computer Applications, Year: 2013

The Weighted Factors Automaton : A Tool for DNA Sequences Analysis

Abstract

Many computing tools are used for analyzing DNA sequences, such as trees, automata, and dictionaries, each one reserved for a particular problem. A. Blumer et al. proposed a more general computing tool: the smallest automaton recognizing the subwords of a text (DAWG). In this paper we propose the concept of a "weighted factors automaton" producing every occurrence of any factor. Its transitions are labeled by the letter read and weighted by the set of indices at which factors begin. The factors are obtained by concatenating the letters read, and the indices of the factor beginnings are obtained by computing the intersection of the weighting sets while advancing from the initial state to a final state. We think that this automaton can be processed more easily than the DAWG, and we present a comparison between the two: the set of beginning indices of a factor and the factor's frequency are more easily obtained with our automaton, and restricting our automaton to the factors of length k preserves the automaton structure, whereas the DAWG cannot easily be restricted. The applications are numerous: by selecting factors of length 1 we obtain the coding regions, and with factors of length 3 we obtain the expression level of some gene. The "weighted factors automaton" allows us to find pattern matches and to study homology, with the FASTA and BLAST algorithms being significantly simplified.
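The intersection step described in the abstract can be illustrated with a small sketch. This is only one possible reading, not the authors' construction: it assumes a layered automaton in which the state reached after reading j letters is simply the level j, and in which the transition leaving level j on a letter is weighted by the set of start indices i such that the text carries that letter at position i + j. The names `build_weights` and `occurrences` are hypothetical, and indices are 0-based here, which may differ from the paper's convention.

```python
from collections import defaultdict


def build_weights(text, k):
    """Build the weighting sets for factors of length at most k.

    Assumption (not from the paper): weights[j][a] is the set of start
    indices i such that text[i + j] == a, i.e. the weight of the
    transition leaving level j on letter a.
    """
    weights = [defaultdict(set) for _ in range(k)]
    for i in range(len(text)):
        for j in range(k):
            if i + j >= len(text):
                break
            weights[j][text[i + j]].add(i)
    return weights


def occurrences(weights, factor):
    """Return the sorted start indices of `factor` in the text.

    The indices are obtained by intersecting the weighting sets met
    along the path spelled by the factor, from level 0 onward.
    """
    if not factor or len(factor) > len(weights):
        raise ValueError("factor length must be between 1 and k")
    result = None
    for j, letter in enumerate(factor):
        w = weights[j].get(letter, set())
        result = set(w) if result is None else result & w
        if not result:
            break  # empty intersection: not a factor of the text
    return sorted(result)


weights = build_weights("ACGTACGA", 3)
print(occurrences(weights, "ACG"))  # [0, 4]
print(len(occurrences(weights, "A")))  # frequency of "A": 3
```

Under this reading, the frequency of a factor is the size of the final intersection, and restricting the structure to factors of length k amounts to keeping the first k levels.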
Main file
hespel-IJCA2013.pdf (227.72 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00880081 , version 1 (05-11-2013)

Identifiers

  • HAL Id : hal-00880081 , version 1

Cite

Christiane Hespel, Farida Benmakrouha, Danielle Quichaud. The Weighted Factors Automaton : A Tool for DNA Sequences Analysis. International Journal of Computer Applications, 2013, 70 (4), pp.0975 - 8887. ⟨hal-00880081⟩
