On the Approximability of Comparing Genomes with Duplicates

Abstract : A central problem in comparative genomics consists in computing a (dis-)similarity measure between two genomes, e.g. in order to construct a phylogeny. All the existing measures are defined on genomes without duplicates. However, we know that genes can be duplicated within the same genome. One possible approach to overcome this difficulty is to establish a one-to-one correspondence (i.e. a matching) between genes of both genomes, where the correspondence is chosen in order to optimize the studied measure. In this paper, we are interested in three measures (number of breakpoints, number of common intervals and number of conserved intervals) and three models of matching (exemplar, intermediate and maximum matching models). We prove that, for each model and each measure M, computing a matching between two genomes that optimizes M is APX-hard. We also study the complexity of the following problem: is there an exemplarization (resp. an intermediate/maximum matching) that induces no breakpoint? We prove the problem to be NP-Complete in the exemplar model for a new class of instances, and we show that the problem is in P in the maximum matching model. We also focus on a fourth measure: the number of adjacencies, for which we give several approximation algorithms in the maximum matching model, in the case where genomes contain the same number of duplications of each gene.
Liste complète des métadonnées

Cited literature [25 references]  Display  Hide  Download

https://hal.archives-ouvertes.fr/hal-00285511
Contributor : Guillaume Fertin <>
Submitted on : Friday, June 6, 2008 - 9:30:11 AM
Last modification on : Wednesday, May 23, 2018 - 3:44:02 PM
Document(s) archivé(s) le : Friday, May 28, 2010 - 9:16:33 PM

Files

article.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00285511, version 1
  • ARXIV : 0806.1103

Citation

Sébastien Angibaud, Guillaume Fertin, Irena Rusu, Annelyse Thevenin, Stéphane Vialette. On the Approximability of Comparing Genomes with Duplicates. 2008. ⟨hal-00285511⟩

Share

Metrics

Record views

679

Files downloads

241