A 2-Approximation for the Minimum Duplication Speciation Problem
Résumé
We consider the following problem: given a set of gene family trees, spanning a given set of species, find a first speciation which splits these species into two subsets and minimizes the number of gene duplications that happened before this speciation. We call this problem the Minimum Duplication Bipartition Problem. Using a generalization of the Minimum Edge-Cut Problem, we propose a polynomial time 2-approximation algorithm for the Minimum Duplication Bipartition Problem. We apply this algorithm to the inference of species trees on synthetic datasets and on two datasets of eukaryotic species.