On the Approximability of Comparing Genomes with Duplicates

Sébastien Angibaud; Guillaume Fertin; Irena Rusu

Preprint, Working Paper. Year: 2007


Sébastien Angibaud

Role: Author
PersonId : 841085

Laboratoire d'Informatique de Nantes Atlantique

Guillaume Fertin

Role: Author
PersonId : 11485
IdHAL : guillaume-fertin
ORCID : 0000-0002-8251-2012
IdRef : 095050612

Laboratoire d'Informatique de Nantes Atlantique

Irena Rusu

Role: Author
PersonId : 16772
IdHAL : irena
IdRef : 095050671

Laboratoire d'Informatique de Nantes Atlantique

Abstract

A central problem in comparative genomics is to compute a (dis-)similarity measure between two genomes. Many such measures have been proposed in the recent past: breakpoints, common intervals, SAD, etc. In their initial definitions, all these measures assume that genomes contain no duplicates. However, we now know that genes can be duplicated within the same genome. One possible approach to overcome this difficulty is to establish a matching between the genes of both genomes so as to optimize the studied measure. Then, after relabeling the genes according to this matching and deleting the unmatched signed genes, two genomes without duplicates are obtained and the measure can be computed. In this paper, we are interested in three measures (number of breakpoints, of common intervals, or of conserved intervals) and three models of matching (exemplar, maximum, and non-maximum matching). We prove that, for each model and each measure, computing a matching between two genomes that optimizes the measure is APX-hard. We show that this result remains true even for two genomes G1 and G2 such that G1 contains no duplicates and no gene of G2 appears more than twice. Finally, we propose a 4-approximation algorithm for a fourth measure, the number of adjacencies, under the maximum matching model, in the case where both genomes contain the same number of duplications of each gene.
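To make the matching-then-measure pipeline concrete, here is a minimal Python sketch of the exemplar model with the breakpoint measure. It is an illustration under stated assumptions, not the construction used in the paper: genomes are encoded as lists of nonzero signed integers, both genomes are assumed to contain the same gene families, a consecutive pair (a, b) of G1 counts as a breakpoint when neither (a, b) nor (-b, -a) is consecutive in G2, and the function names `breakpoints`, `exemplarizations`, and `exemplar_breakpoints` are hypothetical.

```python
from itertools import product


def breakpoints(g1, g2):
    """Count breakpoints between two duplicate-free signed genomes.

    Assumes g1 and g2 are signed permutations of the same gene set.
    """
    adjacent = set()
    for a, b in zip(g2, g2[1:]):
        adjacent.add((a, b))
        adjacent.add((-b, -a))  # same adjacency read on the reverse strand
    return sum((a, b) not in adjacent for a, b in zip(g1, g1[1:]))


def exemplarizations(genome):
    """Yield every way of keeping exactly one copy of each gene family."""
    positions = {}
    for i, gene in enumerate(genome):
        positions.setdefault(abs(gene), []).append(i)
    for choice in product(*positions.values()):
        keep = set(choice)
        yield [gene for i, gene in enumerate(genome) if i in keep]


def exemplar_breakpoints(g1, g2):
    """Exhaustive search: minimum number of breakpoints over all exemplar choices."""
    return min(
        breakpoints(a, b)
        for a in exemplarizations(g1)
        for b in exemplarizations(g2)
    )


# G1 has no duplicates; gene 3 appears twice in G2.
G1 = [1, 2, 3, 4]
G2 = [1, 3, 2, 3, 4]
print(exemplar_breakpoints(G1, G2))
```

On this input, keeping the second copy of gene 3 in G2 yields [1, 2, 3, 4] and no breakpoints, whereas keeping the first copy yields [1, 3, 2, 4] and three breakpoints. The search enumerates every exemplar choice, so its cost grows exponentially with the number of duplicated families and it is only suitable for tiny inputs.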

Domains

Bioinformatics [q-bio.QM]

Main file

angibaudfertinrusu.pdf (221.79 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-00159893

Submitted on: Wednesday, July 4, 2007, 14:13:47

Last modified on: Friday, January 5, 2024, 03:25:27

Long-term archiving on: Thursday, April 8, 2010, 22:32:17

Dates and versions

hal-00159893 , version 1 (04-07-2007)

Identifiers

HAL Id : hal-00159893 , version 1

Cite

Sébastien Angibaud, Guillaume Fertin, Irena Rusu. On the Approximability of Comparing Genomes with Duplicates. 2007. ⟨hal-00159893⟩


Collections

UNIV-NANTES CNRS LINA LINA-COMBI LS2N NANTES-UNIVERSITE

