Joint String Complexity for Markov Sources: Small Data Matters *
Journal article, Discrete Mathematics and Theoretical Computer Science, Year: 2020

Philippe Jacquet
Dimitris Milioris
Wojciech Szpankowski

Abstract

String complexity is defined as the cardinality of the set of all distinct words (factors) of a given string. For two strings, we introduce the joint string complexity as the cardinality of the set of words that are common to both strings. String complexity has a number of applications, from capturing the richness of a language to finding similarities between two genome sequences. In this paper we analyze the joint string complexity when both strings are generated by Markov sources. We prove that the joint string complexity grows linearly (in terms of the string lengths) when both sources are statistically indistinguishable and sublinearly when the sources are statistically distinguishable. Precise analysis of the joint string complexity requires subtle singularity analysis and the saddle point method over infinitely many saddle points, leading to novel oscillatory phenomena with single and double periodicities. To overcome these challenges, we apply analytic techniques such as multivariate generating functions, multivariate depoissonization and the Mellin transform, spectral matrix analysis, and complex asymptotic methods.
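
The two definitions in the abstract can be illustrated with a minimal brute-force sketch. It is not the analytic method of the paper. The function names are hypothetical, and the choice to count only non-empty factors is an assumption of this sketch.

```python
def factors(s: str) -> set:
    """Return the set of all distinct non-empty factors (substrings) of s.

    Assumption of this sketch: the empty word is not counted.
    """
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


def string_complexity(s: str) -> int:
    """Cardinality of the set of distinct factors of s."""
    return len(factors(s))


def joint_string_complexity(x: str, y: str) -> int:
    """Cardinality of the set of factors common to both x and y."""
    return len(factors(x) & factors(y))


if __name__ == "__main__":
    # "abab" has the factors a, b, ab, ba, aba, bab, abab.
    print(string_complexity("abab"))
    # The factors shared by "abab" and "abba" are a, b, ab, ba.
    print(joint_string_complexity("abab", "abba"))
```

This enumeration is quadratic in the string length, so it is only suitable for short strings; a suffix tree or suffix automaton would be the usual choice for longer inputs.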
Main file
small_dat5a.pdf (484.42 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03129901, version 1 (03-02-2021)

Identifiers

  • HAL Id: hal-03129901, version 1

Cite

Philippe Jacquet, Dimitris Milioris, Wojciech Szpankowski. Joint String Complexity for Markov Sources: Small Data Matters *. Discrete Mathematics and Theoretical Computer Science, 2020. ⟨hal-03129901⟩