G. Apic, J. Gough, and S. Teichmann, Domain combinations in archaeal, eubacterial and eukaryotic proteomes, Journal of Molecular Biology, vol.310, issue.2, pp.311-325, 2001.
DOI : 10.1006/jmbi.2001.4776

A. Bahl, PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data, Nucleic Acids Research, vol.31, issue.1, pp.212-215, 2003.
DOI : 10.1093/nar/gkg081

O. Bastien, S. Roy, and E. Marechal, Construction of non-symmetric substitution matrices derived from proteomes with biased amino acid distributions, Comptes Rendus Biologies, vol.328, issue.5, pp.445-453, 2005.
DOI : 10.1016/j.crvi.2005.02.002

F. Beaussart, J. Weiner 3rd, and E. Bornberg-Bauer, Automated Improvement of Domain ANnotations using context analysis of domain arrangements (AIDAN), Bioinformatics, vol.23, issue.14, pp.1834-1836, 2007.
DOI : 10.1093/bioinformatics/btm240

Y. Benjamini and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society, Series B, vol.57, issue.1, pp.289-300, 1995.

I. Callebaut, K. Prat, E. Meurice, J. Mornon, and S. Tomavo, Prediction of the general transcription factors associated with RNA polymerase II in Plasmodium falciparum: conserved features and differences relative to other eukaryotes, BMC Genomics, vol.6, issue.1, p.100, 2005.
DOI : 10.1186/1471-2164-6-100

I. Cohen-Gihon, R. Nussinov, and R. Sharan, Comprehensive analysis of co-occurring domain sets in yeast proteins, BMC Genomics, vol.8, p.161, 2007.

L. Coin, A. Bateman, and R. Durbin, Enhanced protein domain discovery by using language modeling techniques from speech recognition, Proceedings of the National Academy of Sciences, vol.100, issue.8, pp.4516-4520, 2003.
DOI : 10.1073/pnas.0737502100

R. Coulson, N. Hall, and C. Ouzounis, Comparative Genomics of Transcriptional Control in the Human Malaria Parasite Plasmodium falciparum, Genome Research, vol.14, issue.8, pp.1548-1554, 2004.
DOI : 10.1101/gr.2218604

R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological sequence analysis: Probabilistic models of proteins and nucleic acids, 1998.
DOI : 10.1017/CBO9780511790492

S. Eddy, Profile hidden Markov models, Bioinformatics, vol.14, issue.9, pp.755-763, 1998.
DOI : 10.1093/bioinformatics/14.9.755

R. Finn, The Pfam protein families database, Nucleic Acids Research, vol.36, issue.Database, pp.281-288, 2008.
DOI : 10.1093/nar/gkm960

URL : https://hal.archives-ouvertes.fr/hal-01294685

K. Forslund and E. Sonnhammer, Predicting protein function from domain content, Bioinformatics, vol.24, issue.15, pp.1681-1687, 2008.
DOI : 10.1093/bioinformatics/btn312

URL : http://bioinformatics.oxfordjournals.org/cgi/content/short/25/9/1214

L. Geer, M. Domrachev, D. Lipman, and S. Bryant, CDART: Protein Homology by Domain Architecture, Genome Research, vol.12, issue.10, pp.1619-1623, 2002.
DOI : 10.1101/gr.278202

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC187533

The Gene Ontology Consortium, Gene ontology: tool for the unification of biology, Nature Genetics, vol.25, issue.1, pp.25-29, 2000.

M. Gerstein and H. Hegyi, Annotation transfer for genomics: measuring functional divergence in multi-domain proteins, Genome Research, vol.11, issue.10, pp.1632-1640, 2001.

S. Kohler, A Plastid of Probable Green Algal Origin in Apicomplexan Parasites, Science, vol.275, issue.5305, pp.1485-1489, 1997.
DOI : 10.1126/science.275.5305.1485

S. Kummerfeld and S. Teichmann, Protein domain organisation: adding order, BMC Bioinformatics, vol.10, issue.1, p.39, 2009.
DOI : 10.1186/1471-2105-10-39

URL : http://doi.org/10.1186/1471-2105-10-39

W. McLaughlin, K. Chen, T. Hou, and W. Wang, On the detection of functionally coherent groups of protein domains with an extension to protein annotation, BMC Bioinformatics, vol.8, issue.1, p.390, 2007.
DOI : 10.1186/1471-2105-8-390

N. Mulder, R. Apweiler, T. Attwood, A. Bairoch, A. Bateman et al., New developments in the InterPro database, Nucleic Acids Research, vol.35, issue.Database, pp.224-228, 2007.
DOI : 10.1093/nar/gkl841

URL : https://hal.archives-ouvertes.fr/hal-00434830

A. Murzin, S. Brenner, T. Hubbard, and C. Chothia, SCOP: A structural classification of proteins database for the investigation of sequences and structures, Journal of Molecular Biology, vol.247, issue.4, pp.536-540, 1995.
DOI : 10.1016/S0022-2836(05)80134-2

E. Pizzi and C. Frontali, Low-Complexity Regions in Plasmodium falciparum Proteins, Genome Research, vol.11, issue.2, pp.218-229, 2001.
DOI : 10.1101/gr.GR-1522R

A. Rambaut and N. Grassly, Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees, Bioinformatics, vol.13, issue.3, pp.235-238, 1997.
DOI : 10.1093/bioinformatics/13.3.235

J. Richardson, The Anatomy and Taxonomy of Protein Structure, Adv. Protein Chem, vol.34, pp.167-339, 1981.
DOI : 10.1016/S0065-3233(08)60520-3

M. Scott, D. Thomas, and M. Hallett, Predicting Subcellular Localization via Protein Motif Co-Occurrence, Genome Research, vol.14, issue.10a, pp.1957-1966, 2004.
DOI : 10.1101/gr.2650004

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC524420

B. Sorić, Statistical "Discoveries" and Effect-Size Estimation, Journal of the American Statistical Association, vol.84, issue.406, pp.608-610, 1989.
DOI : 10.2307/2289950

J. Weiner 3rd, F. Beaussart, and E. Bornberg-Bauer, Domain deletions and substitutions in the modular protein evolution, FEBS Journal, vol.273, issue.9, pp.2037-2047, 2006.
DOI : 10.1111/j.1742-4658.2006.05220.x