DOMO: a new database of aligned protein domains

Jérôme Gracy; Patrick Argos

doi:10.1016/S0968-0004(98)01294-8

Journal article. Trends in Biochemical Sciences. Year: 1998

Jérôme Gracy

Role: Author
PersonId: 21751
IdHAL: jerome-gracy
ORCID: 0000-0002-9778-0327

Centre de biochimie structurale

Patrick Argos

Role: Author

Abstract

Domains are autonomously folding units that are combined into modular proteins [1]. At the sequence level, accurately delineating the boundaries of homologous protein domains is essential for multiple sequence alignment. Tertiary structural data that could guide visual determination of such domain boundaries are not available for most proteins. Consequently, although many motif [2], block [3], and full-sequence-alignment [4] databases exist, as yet only two domain-alignment databases have been constructed by a fully automated process that uses sequence information alone [5,6]. Here, we describe DOMO, a new database containing 8877 multiple sequence alignments, including 99 058 protein domains as well as repeating-sequence regions extracted from 83 054 non-redundant amino acid sequences from the SWISS-PROT [7] and PIR [8] databases. The domain boundaries and alignments were generated by a fully automated analysis process that involves the detection and clustering of amino acid sequence similarities and, subsequently, delineation of the domain boundaries and multiple sequence alignment of related protein segments [9,10]. The domain boundaries were not inferred from three-dimensional data. Instead, the positions of homologous segment pairs, either within the same protein (for repeats) or within homologous proteins, relative to each protein's N- or C-terminus were used to define the domain boundaries. The completeness and accuracy of the protein classifications, the correctness of the domain boundaries, and the quality of the multiple sequence alignments are greatly improved in DOMO in comparison with other databases [9,10].
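
The abstract only outlines how boundaries are derived from the positions of homologous segments. As an illustration of the general idea, and not of the DOMO procedure itself, the toy sketch below proposes candidate boundaries wherever several homologous segments start or end close together inside a protein. The function name `candidate_boundaries` and the parameters `tolerance` and `min_support` are hypothetical and do not come from the article.

```python
from typing import List, Tuple


def candidate_boundaries(
    segments: List[Tuple[int, int]],
    seq_len: int,
    tolerance: int = 10,    # assumed: max spread (residues) of one endpoint cluster
    min_support: int = 2,   # assumed: endpoints needed to accept a boundary
) -> List[int]:
    """Propose boundary positions from homologous-segment endpoints.

    `segments` holds (start, end) residue positions, 1-based and inclusive,
    of regions of one protein that are similar to other proteins or to
    other regions of the same protein (repeats).
    """
    # Keep endpoints that lie away from the N- and C-termini; endpoints at
    # the termini carry no information about internal boundaries.
    endpoints = sorted(
        p
        for start, end in segments
        for p in (start, end)
        if tolerance < p <= seq_len - tolerance
    )

    boundaries: List[int] = []
    cluster: List[int] = []
    for p in endpoints:
        # Close the current cluster once the next endpoint is too far
        # from the cluster's first member.
        if cluster and p - cluster[0] > tolerance:
            if len(cluster) >= min_support:
                boundaries.append(round(sum(cluster) / len(cluster)))
            cluster = []
        cluster.append(p)
    if len(cluster) >= min_support:
        boundaries.append(round(sum(cluster) / len(cluster)))
    return boundaries


# Hypothetical example: two groups of homologous segments meet near residue 120.
hits = [(1, 120), (5, 118), (125, 300), (122, 298), (1, 300)]
print(candidate_boundaries(hits, seq_len=300))
```

A real pipeline would also have to weight segments by similarity score and reconcile boundaries across all members of a protein family. The abstract attributes these steps to the methods cited as [9,10].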

Domaines

Bioinformatics, Systems Biology [q-bio.QM]; Structural Biology [q-bio.BM]

Main file

tibs4.pdf (180.67 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02358836

Submitted on: Tuesday, 12 November 2019, 11:01:36

Last modified on: Friday, 24 March 2023, 14:53:13

Long-term archiving on: Thursday, 13 February 2020, 16:33:49

Dates and versions

hal-02358836, version 1 (12-11-2019)

Identifiers

HAL Id: hal-02358836, version 1
DOI: 10.1016/S0968-0004(98)01294-8

Cite

Jérôme Gracy, Patrick Argos. DOMO: a new database of aligned protein domains. Trends in Biochemical Sciences, 1998, 23 (12), pp.495-497. ⟨10.1016/S0968-0004(98)01294-8⟩. ⟨hal-02358836⟩

Collections

CNRS UNIV-MONTPELLIER

44 views

169 downloads
