Conference paper. Year: 2006

Genomes containing Duplicates are Hard to compare

Abstract

In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G1 and G2. The first model, which we call the matching model, consists in making a one-to-one correspondence between genes of G1 and genes of G2, in such a way that M is optimized. The second model, called the exemplar model, consists in keeping in G1 (resp. G2) exactly one copy of each gene, thus deleting all the other copies, in such a way that M is optimized. We present here different results concerning the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, basically showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates. Indeed, we show that for common intervals, MAD and SAD, the problem is NP-complete when genes are duplicated in genomes, in both the exemplar and matching models. In the case of MAD and SAD, we actually prove that, under both models, both problems are APX-hard.
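To make the exemplar model concrete, here is a minimal brute-force sketch for the common-intervals measure. It is not the paper's method. The function names are hypothetical, and it rests on several assumptions: genes are unsigned, both genomes contain the same gene families, and a common interval is a set of at least two genes that is contiguous in both exemplarized genomes. The enumeration is exponential in the number of duplicated families, which is in line with the hardness results stated above.

```python
from itertools import product


def count_common_intervals(p, q):
    """Count gene sets of size >= 2 that are contiguous in both p and q.

    p and q are duplicate-free sequences over the same genes.
    """
    pos = {gene: i for i, gene in enumerate(q)}
    count = 0
    for i in range(len(p)):
        lo = hi = pos[p[i]]
        for j in range(i + 1, len(p)):
            lo = min(lo, pos[p[j]])
            hi = max(hi, pos[p[j]])
            # p[i..j] is contiguous in q iff its positions span exactly j - i
            if hi - lo == j - i:
                count += 1
    return count


def exemplarizations(genome):
    """Yield every way of keeping exactly one copy of each gene family."""
    positions = {}
    for i, gene in enumerate(genome):
        positions.setdefault(gene, []).append(i)
    for choice in product(*positions.values()):
        # Keep the chosen copies in their original genome order
        yield tuple(genome[i] for i in sorted(choice))


def best_exemplar_common_intervals(g1, g2):
    """Exhaustively maximize the number of common intervals (exemplar model)."""
    return max(
        count_common_intervals(p, q)
        for p in exemplarizations(g1)
        for q in exemplarizations(g2)
    )


if __name__ == "__main__":
    g1 = ["a", "b", "c", "a"]  # gene family "a" is duplicated
    g2 = ["a", "c", "b"]
    print(best_exemplar_common_intervals(g1, g2))
```

In this toy instance, keeping the second copy of "a" in G1 gives a better score than keeping the first, which illustrates why the choice of exemplar has to be optimized rather than fixed in advance.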
Main file
DuplicatesIWBRA06.pdf (113.48 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00418260, version 1 (17-09-2009)

Identifiers

  • HAL Id: hal-00418260, version 1

Cite

Cedric Chauve, Guillaume Fertin, Romeo Rizzi, Stéphane Vialette. Genomes containing Duplicates are Hard to compare. International Workshop on Bioinformatics Research and Applications (IWBRA 2006), 2006, Reading, United Kingdom. pp.783-790. ⟨hal-00418260⟩