DOMO: a new database of aligned protein domains - HAL open archive
Journal article, Trends in Biochemical Sciences, Year: 1998

DOMO: a new database of aligned protein domains

Jérôme Gracy
Patrick Argos
  • Role: Author

Abstract

Domains are autonomously folding units that are combined into modular proteins [1]. At the sequence level, accurately delineating the boundaries of homologous protein domains is essential for multiple sequence alignment. Tertiary structural data that could guide visual determination of such domain boundaries are not available for most proteins. Consequently, although many motif [2], block [3], and full-sequence-alignment [4] databases exist, as yet there are only two domain-alignment databases that have been constructed by a fully automated process utilizing only sequence information [5,6]. Here, we describe DOMO, a new database containing 8877 multiple sequence alignments, including 99 058 protein domains as well as repeating-sequence regions extracted from 83 054 non-redundant amino acid sequences from the SWISS-PROT [7] and PIR [8] databases. The domain boundaries and alignments were generated by a fully automated analysis process that involves the detection and clustering of amino acid sequence similarities and, subsequently, delineation of the domain boundaries and multiple sequence alignment of related protein segments [9,10]. The domain boundaries were not inferred from three-dimensional data. Instead, the relative positions of homologous segment pairs within the same protein (for repeats) or within homologous proteins with regard to each protein's N- or C-terminus were used to define the domain boundaries. The completeness and accuracy of the protein classifications, the correctness of the domain boundaries, and the quality of the multiple sequence alignments are greatly improved in DOMO, in comparison to other databases [9,10].
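The idea of deriving boundaries from the positions of homologous segments, rather than from structure, can be illustrated with a toy sketch. This is not the DOMO procedure, which is described in the cited references 9 and 10. It only assumes that local similarity hits on one protein are given as 1-based inclusive `(start, end)` intervals, and that endpoints lying close together point to the same boundary. The function names, the tolerance `tol`, and the use of the median are all hypothetical choices made for illustration.

```python
from statistics import median


def candidate_boundaries(segments, seq_len, tol=10):
    """Toy illustration only, not the published DOMO algorithm.

    segments: (start, end) intervals, 1-based inclusive, marking regions of
    one protein that are similar to other sequences or to itself (repeats).
    Returns positions at which a new domain is assumed to start.
    """
    # A segment start, or the residue after a segment end, is a candidate
    # boundary. Points within `tol` residues of either terminus are ignored.
    points = sorted(
        p
        for start, end in segments
        for p in (start, end + 1)
        if tol < p <= seq_len - tol
    )

    # Chain nearby points together: a point joins the current cluster when
    # it lies within `tol` residues of the last point added to that cluster.
    clusters = []
    for p in points:
        if clusters and p - clusters[-1][-1] <= tol:
            clusters[-1].append(p)
        else:
            clusters.append([p])

    # Summarise each cluster by its median position.
    return [int(median(c)) for c in clusters]


def split_into_domains(boundaries, seq_len):
    """Turn boundary positions into consecutive (start, end) domain intervals."""
    starts = [1] + list(boundaries)
    ends = [b - 1 for b in boundaries] + [seq_len]
    return list(zip(starts, ends))


if __name__ == "__main__":
    # Hypothetical hits on a 250-residue protein.
    hits = [(1, 100), (3, 98), (101, 250), (105, 250)]
    bounds = candidate_boundaries(hits, seq_len=250)
    print(bounds)
    print(split_into_domains(bounds, seq_len=250))
```

In this made-up input the hit endpoints agree on a single split near residue 101, so the protein is divided into two candidate domains. A real pipeline would also have to handle conflicting hits, repeats, and alignment quality, none of which this sketch attempts.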
Main file
tibs4.pdf (180.67 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02358836, version 1 (12-11-2019)

Cite

Jérôme Gracy, Patrick Argos. DOMO: a new database of aligned protein domains. Trends in Biochemical Sciences, 1998, 23 (12), pp.495-497. ⟨10.1016/S0968-0004(98)01294-8⟩. ⟨hal-02358836⟩