P. Harvey and M. Pagel, The Comparative Method in Evolutionary Biology, 1991.

C. Dunn, A. Hejnol, D. Matus, K. Pang, and W. Browne, Broad phylogenomic sampling improves resolution of the animal tree of life, Nature, vol.452, issue.7188, pp.745-749, 2008.
DOI : 10.1038/nature06614

Y. Bao, P. Bolotov, D. Dernovoy, B. Kiryutin, and L. Zaslavsky, The Influenza Virus Resource at the National Center for Biotechnology Information, Journal of Virology, vol.82, issue.2, pp.596-601, 2008.
DOI : 10.1128/JVI.02005-07

R. Kuipers, H. Joosten, W. van Berkel, N. Leferink, and E. Rooijen, 3DM: Systematic analysis of heterogeneous superfamily data to discover protein functionalities, Proteins: Structure, Function, and Bioinformatics, vol.78, issue.9, pp.2101-2113, 2010.
DOI : 10.1002/prot.22725

S. Singh, R. Tokhunts, V. Baubet, J. Goetz, and Z. Huang, Sonic hedgehog mutations identified in holoprosencephaly patients can act in a dominant negative manner, Human Genetics, vol.106, issue.Pt 23, pp.95-103, 2009.
DOI : 10.1007/s00431-004-1459-0

J. Zhang, X. Chen, M. Kent, C. Rodriguez, and X. Chen, Establishment of a Dog Model for the p53 Family Pathway and Identification of a Novel Isoform of p21 Cyclin-Dependent Kinase Inhibitor, Molecular Cancer Research, vol.7, issue.1, pp.67-78, 2009.
DOI : 10.1158/1541-7786.MCR-08-0347

M. Eaton, A. Martin, J. Thorbjarnarson, and G. Amato, Species-level diversification of African dwarf crocodiles (Genus Osteolaemus): A geographic and phylogenetic perspective, Molecular Phylogenetics and Evolution, vol.50, issue.3, pp.496-506, 2009.
DOI : 10.1016/j.ympev.2008.11.009

A. Levasseur, P. Pontarotti, O. Poch, and J. Thompson, Strategies for Reliable Exploitation of Evolutionary Concepts in High Throughput Biology, Evolutionary Bioinformatics, vol.4, pp.121-137, 2008.
DOI : 10.4137/EBO.S597

URL : https://hal.archives-ouvertes.fr/inserm-00368022

K. Wong, M. Suchard, and J. Huelsenbeck, Alignment Uncertainty and Genomic Analysis, Science, vol.319, issue.5862, pp.473-476, 2008.
DOI : 10.1126/science.1151532

A. Löytynoja and N. Goldman, Phylogeny-Aware Gap Placement Prevents Errors in Sequence Alignment and Evolutionary Analysis, Science, vol.320, issue.5883, pp.1632-1635, 2008.
DOI : 10.1126/science.1158395

D. Brown, N. Krishnamurthy, and K. Sjölander, Automated Protein Subfamily Identification and Classification, PLoS Computational Biology, vol.3, issue.8, p.e160, 2007.
DOI : 10.1371/journal.pcbi.0030160

URL : http://doi.org/10.1371/journal.pcbi.0030160

B. Brandt, K. Feenstra, and J. Heringa, Multi-Harmony: detecting functional specificity from sequence alignment, Nucleic Acids Research, vol.38, issue.Web Server, pp.W35-W40, 2010.
DOI : 10.1093/nar/gkq415

URL : https://academic.oup.com/nar/article-pdf/38/suppl_2/W35/3872358/gkq415.pdf

A. Rausell, D. Juan, F. Pazos, and A. Valencia, Protein interactions and ligand binding: From protein subfamilies to functional specificity, Proceedings of the National Academy of Sciences, vol.107, issue.5, pp.1995-2000, 2010.
DOI : 10.1002/1097-0134(20010101)42:1<108::AID-PROT110>3.0.CO;2-O

URL : http://www.pnas.org/content/107/5/1995.full.pdf

D. Feng and R. Doolittle, Progressive sequence alignment as a prerequisite to correct phylogenetic trees, Journal of Molecular Evolution, vol.25, issue.4, pp.351-360, 1987.
DOI : 10.1007/BF02603120

J. Thompson, F. Plewniak, and O. Poch, BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs, Bioinformatics, vol.15, issue.1, pp.87-88, 1999.
DOI : 10.1093/bioinformatics/15.1.87

P. Gardner, A. Wilm, and S. Washietl, A benchmark of multiple sequence alignment programs upon structural RNAs, Nucleic Acids Research, vol.33, issue.8, pp.2433-2439, 2005.
DOI : 10.1093/nar/gki541

O. Gotoh, Significant Improvement in Accuracy of Multiple Protein Sequence Alignments by Iterative Refinement as Assessed by Reference to Structural Alignments, Journal of Molecular Biology, vol.264, issue.4, pp.823-838, 1996.
DOI : 10.1006/jmbi.1996.0679

S. Eddy, Profile hidden Markov models, Bioinformatics, vol.14, issue.9, pp.755-763, 1998.
DOI : 10.1093/bioinformatics/14.9.755

C. Notredame and D. Higgins, SAGA: sequence alignment by genetic algorithm, Nucleic Acids Research, vol.24, issue.8, pp.1515-1524, 1996.
DOI : 10.1093/nar/24.8.1515

URL : https://academic.oup.com/nar/article-pdf/24/8/1515/6964040/24-8-1515.pdf

J. Thompson, F. Plewniak, and O. Poch, A comprehensive comparison of multiple sequence alignment programs, Nucleic Acids Research, vol.27, issue.13, pp.2682-2690, 1999.
DOI : 10.1093/nar/27.13.2682

G. Blackshields, I. Wallace, M. Larkin, and D. Higgins, Analysis and comparison of benchmarks for multiple sequence alignment, In Silico Biol, vol.6, pp.321-339, 2006.

I. Wallace, O. O'Sullivan, D. Higgins, and C. Notredame, M-Coffee: combining multiple sequence alignment methods with T-Coffee, Nucleic Acids Research, vol.34, issue.6, pp.1692-1699, 2006.
DOI : 10.1093/nar/gkl091

URL : https://academic.oup.com/nar/article-pdf/34/6/1692/7128257/gkl091.pdf

K. Katoh and H. Toh, Recent developments in the MAFFT multiple sequence alignment program, Briefings in Bioinformatics, vol.9, issue.4, pp.286-298, 2008.
DOI : 10.1093/bib/bbn013

R. Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Research, vol.32, issue.5, pp.1792-1797, 2004.
DOI : 10.1093/nar/gkh340

URL : https://academic.oup.com/nar/article-pdf/32/5/1792/7055030/gkh340.pdf

C. Do, M. Mahabhashyam, M. Brudno, and S. Batzoglou, ProbCons: Probabilistic consistency-based multiple sequence alignment, Genome Research, vol.15, issue.2, pp.330-340, 2005.
DOI : 10.1101/gr.2821705

URL : http://genome.cshlp.org/content/15/2/330.full.pdf

O. O'Sullivan, K. Suhre, C. Abergel, D. Higgins, and C. Notredame, 3DCoffee: Combining Protein Sequences and Structures within Multiple Sequence Alignments, Journal of Molecular Biology, vol.340, issue.2, pp.385-395, 2004.
DOI : 10.1016/j.jmb.2004.04.058

S. Chakrabarti, C. Lanczycki, A. Panchenko, T. Przytycka, and P. Thiessen, Refining multiple sequence alignments with conserved core regions, Nucleic Acids Research, vol.34, issue.9, pp.2598-2606, 2006.
DOI : 10.1093/nar/gkl274

URL : https://academic.oup.com/nar/article-pdf/34/9/2598/7129280/gkl274.pdf

C. Lee, C. Grasso, and M. Sharlow, Multiple sequence alignment using partial order graphs, Bioinformatics, vol.18, issue.3, pp.452-464, 2002.
DOI : 10.1093/bioinformatics/18.3.452

URL : https://academic.oup.com/bioinformatics/article-pdf/18/3/452/648375/180452.pdf

G. Raghava, S. Searle, P. Audley, J. Barber, and G. Barton, OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy, BMC Bioinformatics, vol.4, issue.1, p.47, 2003.
DOI : 10.1186/1471-2105-4-47

C. Dessimoz and M. Gil, Phylogenetic assessment of alignments reveals neglected tree signal in gaps, Genome Biology, vol.11, issue.4, p.37, 2010.
DOI : 10.1186/gb-2010-11-4-r37

M. Aniba, O. Poch, and J. Thompson, Issues in bioinformatics benchmarking: the case study of multiple sequence alignment, Nucleic Acids Research, vol.38, issue.21, pp.7353-7363, 2010.
DOI : 10.1093/nar/gkq625

E. Koonin, Darwinian evolution in the light of genomics, Nucleic Acids Research, vol.37, issue.4, pp.1011-1034, 2009.
DOI : 10.1093/nar/gkp089

P. Bakke, N. Carney, W. Deloache, M. Gearing, and K. Ingvorsen, Evaluation of Three Automated Genome Annotations for Halorhabdus utahensis, PLoS ONE, vol.4, issue.7, p.6291, 2009.
DOI : 10.1371/journal.pone.0006291

O. Keller, F. Odronitz, M. Stanke, M. Kollmar, and S. Waack, Scipio: Using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species, BMC Bioinformatics, vol.9, issue.1, p.278, 2008.
DOI : 10.1186/1471-2105-9-278

R. Guigo, P. Flicek, J. Abril, A. Reymond, and J. Lagarde, EGASP: the human ENCODE Genome Annotation Assessment Project, Genome Biol, vol.7, 2006.

E. Mardis, The impact of next-generation sequencing technology on genetics, Trends in Genetics, vol.24, issue.3, pp.133-141, 2008.
DOI : 10.1016/j.tig.2007.12.007

M. Pop and S. Salzberg, Bioinformatics challenges of new sequencing technology, Trends in Genetics, vol.24, issue.3, pp.142-149, 2008.
DOI : 10.1016/j.tig.2007.12.006

URL : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2680276/pdf

A. Dunker, C. Oldfield, J. Meng, P. Romero, and J. Yang, The unfoldomics decade: an update on intrinsically disordered proteins, BMC Genomics, vol.9, issue.Suppl 2, p.1, 2008.
DOI : 10.1186/1471-2164-9-S2-S1

URL : https://bmcgenomics.biomedcentral.com/track/pdf/10.1186/1471-2164-9-S2-S1?site=bmcgenomics.biomedcentral.com

W. Wong, S. Maurer-Stroh, and F. Eisenhaber, More Than 1,001 Problems with Protein Domain Databases: Transmembrane Regions, Signal Peptides and the Issue of Sequence Homology, PLoS Computational Biology, vol.6, issue.7, p.e1000867, 2010.
DOI : 10.1371/journal.pcbi.1000867

J. Thompson, F. Plewniak, R. Ripp, J. Thierry, and O. Poch, Towards a reliable objective function for multiple sequence alignments, Journal of Molecular Biology, vol.314, issue.4, pp.937-951, 2001.
DOI : 10.1006/jmbi.2001.5187

L. Bianchetti, J. Thompson, O. Lecompte, F. Plewniak, and O. Poch, vALId: validation of protein sequence quality based on multiple alignment data, Journal of Bioinformatics and Computational Biology, vol.3, issue.4, pp.929-947, 2005.
DOI : 10.1093/nar/29.1.255

URL : https://hal.archives-ouvertes.fr/hal-00187446

L. Krause, N. Diaz, D. Bartels, R. Edwards, and A. Pühler, Finding novel genes in bacterial communities isolated from the environment, Bioinformatics, vol.22, issue.14, pp.281-289, 2006.
DOI : 10.1093/bioinformatics/btl247

D. Huson, A. Auch, J. Qi, and S. Schuster, MEGAN analysis of metagenomic data, Genome Research, vol.17, issue.3, pp.377-386, 2007.
DOI : 10.1101/gr.5969107

URL : http://genome.cshlp.org/content/17/3/377.full.pdf

C. Chica, A. Labarga, C. Gould, R. López, and T. Gibson, A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences, BMC Bioinformatics, vol.9, issue.1, p.229, 2008.
DOI : 10.1186/1471-2105-9-229

S. Sankararaman and K. Sjölander, INTREPID: INformation-theoretic TREe traversal for Protein functional site IDentification, Bioinformatics, vol.24, issue.21, pp.2445-2452, 2008.
DOI : 10.1093/bioinformatics/btn474

URL : https://academic.oup.com/bioinformatics/article-pdf/24/21/2445/16883513/btn474.pdf

P. Amaral, M. Dinger, T. Mercer, and J. Mattick, The Eukaryotic Genome as an RNA Machine, Science, vol.319, issue.5871, pp.1787-1789, 2008.
DOI : 10.1093/hmg/ddm352

Y. Koh and N. Rountree, Rare Association Rule Mining And Knowledge Discovery: Technologies For Infrequent And Critical Event Detection, 2009.
DOI : 10.4018/978-1-60566-754-6

V. Simossis and J. Heringa, PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information, Nucleic Acids Research, vol.33, issue.Web Server, pp.W289-W294, 2005.
DOI : 10.1093/nar/gki390

URL : https://academic.oup.com/nar/article-pdf/33/suppl_2/W289/7622741/gki390.pdf

J. Pei and N. Grishin, PROMALS: towards accurate multiple sequence alignments of distantly related proteins, Bioinformatics, vol.23, issue.7, pp.802-808, 2007.
DOI : 10.1093/bioinformatics/btm017

URL : https://academic.oup.com/bioinformatics/article-pdf/23/7/802/16861558/btm017.pdf

J. Thompson, P. Koehl, R. Ripp, and O. Poch, BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark, Proteins: Structure, Function, and Bioinformatics, vol.61, issue.1, pp.127-136, 2005.
DOI : 10.1002/prot.20527

URL : https://hal.archives-ouvertes.fr/hal-00187785

A. Schäffer, L. Aravind, T. Madden, S. Shavirin, and J. Spouge, Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements, Nucleic Acids Research, vol.29, issue.14, pp.2994-3005, 2001.
DOI : 10.1093/nar/29.14.2994

H. Berman, The Protein Data Bank: a historical perspective, Acta Crystallographica Section A Foundations of Crystallography, vol.64, issue.1, pp.88-95, 2008.
DOI : 10.1107/S0108767307035623

URL : http://journals.iucr.org/a/issues/2008/01/00/sc5004/sc5004.pdf

W. Taylor, Protein Structure Comparison Using SAP, Methods Mol Biol, vol.143, pp.19-32, 2000.
DOI : 10.1385/1-59259-368-2:19

F. Plewniak, J. Thompson, and O. Poch, Ballast: Blast post-processing based on locally conserved segments, Bioinformatics, vol.16, issue.9, pp.750-759, 2000.
DOI : 10.1093/bioinformatics/16.9.750

URL : https://academic.oup.com/bioinformatics/article-pdf/16/9/750/873628/160750.pdf

J. Thompson, F. Plewniak, J. Thierry, and O. Poch, DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches, Nucleic Acids Research, vol.28, issue.15, pp.2919-2926, 2000.
DOI : 10.1093/nar/28.15.2919

URL : https://academic.oup.com/nar/article-pdf/28/15/2919/9904662/282919.pdf

J. Thompson, V. Prigent, and O. Poch, LEON: multiple aLignment Evaluation Of Neighbours, Nucleic Acids Research, vol.32, issue.4, pp.1298-1307, 2004.
DOI : 10.1093/nar/gkh294

URL : https://academic.oup.com/nar/article-pdf/32/4/1298/6258592/gkh294.pdf

J. Thompson, A. Muller, A. Waterhouse, J. Procter, and G. Barton, MACSIMS: multiple alignment of complete sequences information management system, BMC Bioinformatics, vol.7, issue.1, p.318, 2006.
DOI : 10.1186/1471-2105-7-318

URL : https://hal.archives-ouvertes.fr/hal-00188166

A. Waterhouse, J. Procter, D. Martin, M. Clamp, and G. Barton, Jalview Version 2--a multiple sequence alignment editor and analysis workbench, Bioinformatics, vol.25, issue.9, pp.1189-1191, 2009.
DOI : 10.1093/bioinformatics/btp033

URL : https://academic.oup.com/bioinformatics/article-pdf/25/9/1189/526576/btp033.pdf

J. Thompson, J. Thierry, and O. Poch, RASCAL: rapid scanning and correction of multiple sequence alignments, Bioinformatics, vol.19, issue.9, pp.1155-1161, 2003.
DOI : 10.1093/bioinformatics/btg133

URL : https://academic.oup.com/bioinformatics/article-pdf/19/9/1155/801480/btg133.pdf

N. Wicker, G. Perrin, J. Thierry, and O. Poch, Secator: A Program for Inferring Protein Subfamilies from Phylogenetic Trees, Molecular Biology and Evolution, vol.18, issue.8, pp.1435-1441, 2001.
DOI : 10.1093/oxfordjournals.molbev.a003929

URL : https://academic.oup.com/mbe/article-pdf/18/8/1435/3158237/mbev_18_08_1435.pdf

M. Gribskov, A. Mclachlan, and D. Eisenberg, Profile analysis: detection of distantly related proteins., Proceedings of the National Academy of Sciences, vol.84, issue.13, pp.4355-4358, 1987.
DOI : 10.1073/pnas.84.13.4355

URL : http://www.pnas.org/content/84/13/4355.full.pdf

J. Thompson, T. Gibson, F. Plewniak, F. Jeanmougin, and D. Higgins, The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools, Nucleic Acids Research, vol.25, issue.24, pp.4876-4882, 1997.
DOI : 10.1093/nar/25.24.4876

URL : https://academic.oup.com/nar/article-pdf/25/24/4876/4897770/25-24-4876.pdf

M. Vingron and P. Sibbald, Weighting in sequence space: a comparison of methods in terms of generalized sequences., Proceedings of the National Academy of Sciences, vol.90, issue.19, pp.8777-8781, 1993.
DOI : 10.1073/pnas.90.19.8777

Z. Dosztányi, V. Csizmok, P. Tompa, and I. Simon, IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content, Bioinformatics, vol.21, issue.16, pp.3433-3434, 2005.
DOI : 10.1093/bioinformatics/bti541

J. Thompson, S. Holbrook, K. Katoh, P. Koehl, and D. Moras, MAO: a Multiple Alignment Ontology for nucleic acid and protein sequences, Nucleic Acids Research, vol.33, issue.13, pp.4164-4171, 2005.
DOI : 10.1093/nar/gki735

URL : https://hal.archives-ouvertes.fr/hal-00187784

M. Larkin, G. Blackshields, N. Brown, R. Chenna, and P. Mcgettigan, Clustal W and Clustal X version 2.0, Bioinformatics, vol.23, issue.21, pp.2947-2948, 2007.
DOI : 10.1093/bioinformatics/btm404

URL : https://hal.archives-ouvertes.fr/hal-00206210

A. Subramanian, M. Kaufmann, and B. Morgenstern, DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment, Algorithms for Molecular Biology, vol.3, issue.1, p.6, 2008.
DOI : 10.1186/1748-7188-3-6

URL : https://almob.biomedcentral.com/track/pdf/10.1186/1748-7188-3-6?site=almob.biomedcentral.com

T. Lassmann, O. Frings, and E. Sonnhammer, Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features, Nucleic Acids Research, vol.37, issue.3, pp.858-865, 2009.
DOI : 10.1093/nar/gkn1006

URL : https://academic.oup.com/nar/article-pdf/37/3/858/17058766/gkn1006.pdf

R. Edgar, MUSCLE: a multiple sequence alignment method with reduced time and space complexity, BMC Bioinformatics, vol.5, issue.1, p.113, 2004.
DOI : 10.1186/1471-2105-5-113

C. Notredame, D. Higgins, and J. Heringa, T-Coffee: a novel method for fast and accurate multiple sequence alignment, Journal of Molecular Biology, vol.302, issue.1, pp.205-217, 2000.
DOI : 10.1006/jmbi.2000.4042
