Journal article, Journal of Applied Probability, Year: 1999

Exact distribution of word occurrences in a random sequence of letters

Abstract

The distribution of the distance between words in a random sequence of letters is of interest for applications in genome sequence analysis. In this paper we give the exact probability distribution and cumulative distribution function of the distance between two successive occurrences of a given word, and between the nth and the (n + m)th occurrences, under three models for generating the letters: i.i.d. with the same probability for each letter, i.i.d. with different probabilities, and a Markov process. The generating function and the first two moments are also given. The point of studying distances rather than the counting process is that we learn not only about the frequency of a word but also about its longitudinal distribution in the sequence.
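As an illustration of the quantity the abstract describes, the following is a minimal sketch, not the authors' derivation, that computes the distribution of the distance between two successive occurrences of a word numerically for the i.i.d. model. It assumes that overlapping occurrences count and that the distance is measured between the end positions of consecutive occurrences; the paper's exact conventions may differ. The function names `build_automaton` and `distance_distribution` are hypothetical, and the technique used here is a standard prefix (KMP-style) automaton combined with dynamic programming.

```python
def build_automaton(word, alphabet):
    """Return delta[state][letter] for the prefix automaton of `word`.

    A state is the length of the longest prefix of `word` that is a
    suffix of the text read so far (state len(word) = word just seen).
    """
    m = len(word)
    # fail[i] = length of the longest proper border of word[:i]
    fail = [0] * (m + 1)
    k = 0
    for i in range(1, m):
        while k and word[i] != word[k]:
            k = fail[k]
        if word[i] == word[k]:
            k += 1
        fail[i + 1] = k

    def step(state, letter):
        k = state
        while True:
            if k < m and word[k] == letter:
                return k + 1
            if k == 0:
                return 0
            k = fail[k]

    return [{a: step(s, a) for a in alphabet} for s in range(m + 1)]


def distance_distribution(word, probs, max_d):
    """P(D = d) for d = 0..max_d, where D is the number of letters read
    after one occurrence of `word` until the next occurrence ends.

    `probs` maps each letter to its probability (i.i.d. model).
    """
    m = len(word)
    delta = build_automaton(word, list(probs))
    mass = [0.0] * (m + 1)
    mass[m] = 1.0  # start immediately after an occurrence
    pmf = [0.0] * (max_d + 1)
    for d in range(1, max_d + 1):
        new = [0.0] * (m + 1)
        for s, p in enumerate(mass):
            if p == 0.0:
                continue
            for a, pa in probs.items():
                new[delta[s][a]] += p * pa
        pmf[d] = new[m]  # paths that complete the word at step d
        new[m] = 0.0     # stop those paths at the first new occurrence
        mass = new
    return pmf


# Example: word "AA" over a fair two-letter alphabet.
pmf = distance_distribution("AA", {"A": 0.5, "B": 0.5}, max_d=200)
print(pmf[1:4])  # [0.5, 0.0, 0.125]
mean = sum(d * p for d, p in enumerate(pmf))
print(mean)      # close to 4, which is 1 / P("AA") in this example
```

Passing unequal letter probabilities in `probs` covers the second i.i.d. model in the same way. The Markov model would need the last letter read to be tracked alongside the automaton state, which this sketch does not attempt.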
No file deposited

Dates and versions

hal-01222427, version 1 (29-10-2015)

Identifiers

  • HAL Id: hal-01222427, version 1
  • PRODINRA: 326429
  • WOS: 000080676100016

Cite

Stephane Robin, Jean-Jacques Daudin. Exact distribution of word occurrences in a random sequence of letters. Journal of Applied Probability, 1999, 36 (1), pp.179-193. ⟨hal-01222427⟩
