Towards a reproducible interactome: semantic-based detection of redundancies to unify protein-protein interaction databases
Journal article, Bioinformatics, Year: 2022

Abstract

Motivation: Information on protein-protein interactions is collected in numerous primary databases, each with its own curation process. Several meta-databases aggregate primary databases to provide more exhaustive datasets. In addition to exhaustiveness, aggregation contributes to reliability by providing an overview of the various studies and detection methods supporting an interaction. However, interactions listed in different primary databases are partly redundant, because some publications reporting protein-protein interactions have been curated by multiple primary databases. Mere aggregation can thus introduce a bias if these redundancies are not identified and eliminated. To overcome this bias, meta-databases rely on the Molecular Interaction ontology, which describes interaction detection methods, but they do not take full advantage of the ontology’s rich semantics, which leads to systematically overestimating interaction reproducibility.

Results: We propose a precise definition of explicit and implicit redundancy, and show that both can be easily detected using Semantic Web technologies. We apply this process to a dataset from the APID meta-database and show that, while explicit redundancies were detected by the APID aggregation process, about 15% of APID entries are implicitly redundant and should not be taken into account when presenting confidence-related metrics. More than 90% of implicit redundancies result from the aggregation of distinct primary databases, while the remainder occur between entries of a single database. Finally, we build a "reproducible interactome" with interactions that have been reproduced by multiple methods or publications. The size of the reproducible interactome is drastically reduced by removing redundancies for both yeast (-59%) and human (-56%), and we show that this is largely due to implicit redundancies.
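The abstract does not spell out the two definitions, so the following is only a minimal sketch under stated assumptions: two entries are treated as explicitly redundant when they report the same protein pair from the same publication with the same detection method, and as implicitly redundant when the pair and publication match but one method is a more general ancestor of the other in the method hierarchy. The authors use Semantic Web technologies; this sketch swaps in plain Python dictionaries instead. The hierarchy fragment, protein names, publication identifiers, and function names are all hypothetical.

```python
from collections import defaultdict

# Toy child -> parent fragment of a detection-method hierarchy.
# Assumption: simplified for illustration, not the real ontology structure.
PARENT = {
    "two hybrid array": "two hybrid",
    "two hybrid": "protein complementation assay",
    "tandem affinity purification": "affinity chromatography technology",
}


def ancestors(method):
    """Return every strictly more general method above `method`."""
    found = set()
    while method in PARENT:
        method = PARENT[method]
        found.add(method)
    return found


def classify(a, b):
    """Return 'explicit', 'implicit', or None for a pair of entries."""
    if a["pair"] != b["pair"] or a["pub"] != b["pub"]:
        return None
    if a["method"] == b["method"]:
        return "explicit"
    if a["method"] in ancestors(b["method"]) or b["method"] in ancestors(a["method"]):
        return "implicit"
    return None


def deduplicate(entries):
    """Keep only the most specific entry per pair, publication and method lineage."""
    kept = []
    for e in entries:
        def same_source(k):
            return k["pair"] == e["pair"] and k["pub"] == e["pub"]
        # Skip e if a kept entry is identical or more specific.
        if any(same_source(k) and (k["method"] == e["method"]
                                   or e["method"] in ancestors(k["method"]))
               for k in kept):
            continue
        # Otherwise e refines any kept entry that used a more general method.
        kept = [k for k in kept
                if not (same_source(k) and k["method"] in ancestors(e["method"]))]
        kept.append(e)
    return kept


def reproducible(entries):
    """Pairs still supported by at least two entries after deduplication."""
    counts = defaultdict(int)
    for e in deduplicate(entries):
        counts[e["pair"]] += 1
    return {pair for pair, n in counts.items() if n >= 2}


# Hypothetical entries: proteins A-D, publications pub1-pub3.
AB, CD = frozenset({"A", "B"}), frozenset({"C", "D"})
entries = [
    {"pair": AB, "pub": "pub1", "method": "two hybrid"},
    {"pair": AB, "pub": "pub1", "method": "two hybrid array"},
    {"pair": AB, "pub": "pub1", "method": "two hybrid array"},
    {"pair": CD, "pub": "pub2", "method": "two hybrid"},
    {"pair": CD, "pub": "pub3", "method": "tandem affinity purification"},
]
```

In this toy dataset the A-B interaction has three entries, which a naive count would read as reproduced three times, yet all three trace back to one experiment in one publication. Only C-D, supported by two publications with unrelated methods, remains in `reproducible(entries)`.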
Main file
mmelkonian_final (1).pdf (522.34 KB)
mmelkonian_revision_supplementary_no_highlights.pdf (466.36 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03522989, version 1 (12-01-2022)

Cite

Marc Melkonian, Camille Juigné, Olivier Dameron, Gwenaël Rabut, Emmanuelle Becker. Towards a reproducible interactome: semantic-based detection of redundancies to unify protein-protein interaction databases. Bioinformatics, 2022, Proceedings, 38 (6), pp.1-7. ⟨10.1093/bioinformatics/btac013⟩. ⟨hal-03522989⟩