Genomes containing Duplicates are Hard to compare

Cedric Chauve; Guillaume Fertin; Romeo Rizzi; Stéphane Vialette

Conference paper, Year: 2006

Cedric Chauve

Role: Author
PersonId: 846009
ORCID: 0000-0001-9837-1878

Department of Mathematics [Burnaby]

Guillaume Fertin

Role: Corresponding author
PersonId: 11485
IdHAL: guillaume-fertin
ORCID: 0000-0002-8251-2012
IdRef: 095050612

Laboratoire d'Informatique de Nantes Atlantique

Romeo Rizzi

Role: Author
PersonId: 761704
ORCID: 0000-0002-2387-0952

Dipartimento di Matematica e Informatica - Universita Udine

Stéphane Vialette

Role: Author
PersonId: 3062
IdHAL: stephane-vialette
ORCID: 0000-0003-2308-6970
IdRef: 061620734

Algorithmics

Abstract

In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G1 and G2. The first model, which we call the matching model, consists in establishing a one-to-one correspondence between genes of G1 and genes of G2, in such a way that M is optimized. The second model, called the exemplar model, consists in keeping in G1 (resp. G2) exactly one copy of each gene, thus deleting all the other copies, in such a way that M is optimized. We present here different results concerning the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates. Indeed, we show that for common intervals, MAD and SAD, the problem is NP-complete when genes are duplicated in genomes, in both the exemplar and matching models. In the case of MAD and SAD, we actually prove that, under both models, the problems are APX-hard.
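To make the exemplar model concrete, here is a minimal brute-force sketch for the common-intervals measure. It is not the authors' method and only illustrates the problem statement on tiny inputs. Several points are assumptions of this illustration rather than definitions from the paper: genes are unsigned strings, both genomes contain the same gene families, a common interval is a set of genes that occupies consecutive positions in both duplicate-free genomes, and singletons and the whole genome are counted. The function names are hypothetical. The search enumerates every exemplarization, so its running time is exponential in the number of duplicated families, which is consistent with the hardness results stated in the abstract.

```python
from itertools import product


def exemplarizations(genome):
    """Yield every way to keep exactly one copy of each gene, preserving order."""
    positions = {}
    for idx, gene in enumerate(genome):
        positions.setdefault(gene, []).append(idx)
    # Choose one occurrence per gene family, then restore genome order.
    for choice in product(*positions.values()):
        yield [genome[i] for i in sorted(choice)]


def count_common_intervals(p1, p2):
    """Count gene sets that are contiguous in both duplicate-free genomes.

    Assumed convention: singletons and the full gene set are counted.
    """
    pos2 = {gene: i for i, gene in enumerate(p2)}
    count = 0
    for i in range(len(p1)):
        lo = hi = pos2[p1[i]]
        for j in range(i, len(p1)):
            lo = min(lo, pos2[p1[j]])
            hi = max(hi, pos2[p1[j]])
            # p1[i..j] is contiguous in p2 iff its positions span j - i.
            if hi - lo == j - i:
                count += 1
    return count


def best_exemplar_common_intervals(g1, g2):
    """Maximize the number of common intervals over all exemplarizations."""
    if set(g1) != set(g2):
        raise ValueError("this sketch assumes identical gene families")
    return max(
        count_common_intervals(a, b)
        for a in exemplarizations(g1)
        for b in exemplarizations(g2)
    )


if __name__ == "__main__":
    # Gene "a" is duplicated in the first genome only.
    print(best_exemplar_common_intervals(list("abca"), list("abc")))
```

In this toy instance, keeping the first copy of `a` yields the genome `a b c`, identical to the second genome, so every interval is common. Keeping the last copy instead yields `b c a`, which shares fewer intervals with `a b c`.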

Subject areas

Bioinformatics [q-bio.QM]; Bioinformatics, Systems Biology [q-bio.QM]; Computational Complexity [cs.CC]; Data Structures and Algorithms [cs.DS]

Main file

DuplicatesIWBRA06.pdf (113.48 KB)

Origin: Files produced by the author(s)

Contributor: Guillaume Fertin

https://hal.science/hal-00418260

Submitted on: Thursday, 17 September 2009, 16:41:28

Last modified on: Friday, 19 April 2024, 16:18:58

Long-term archiving on: Tuesday, 15 June 2010, 23:51:16

Dates and versions

hal-00418260, version 1 (17-09-2009)

Identifiers

HAL Id: hal-00418260, version 1

Cite

Cedric Chauve, Guillaume Fertin, Romeo Rizzi, Stéphane Vialette. Genomes containing Duplicates are Hard to compare. International Workshop on Bioinformatics Research and Applications (IWBRA 2006), 2006, Reading, United Kingdom. pp.783-790. ⟨hal-00418260⟩

Collections

UNIV-NANTES ENS-PARIS EC-PARIS CNRS INRIA LINA LINA-COMBI UMR8623 PSL LS2N UNIV-PARIS-SACLAY NANTES-UNIVERSITE

224 views

203 downloads
