
Inferring interaction partners from protein sequences

Abstract : Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multi-protein complexes, and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners, causing their sequences to be correlated. Here we exploit these correlations to accurately identify which proteins are specific interaction partners from sequence data alone. Our general approach, which employs a pairwise maximum entropy model to infer couplings between residues, has been successfully used to predict the three-dimensional structures of proteins from sequences. Thus inspired, we introduce an iterative algorithm to predict specific interaction partners from two protein families whose members are known to interact. We first assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. We obtain a striking 0.93 true positive fraction on our complete dataset without any a priori knowledge of interaction partners, and we uncover the origin of this success. We then apply the algorithm to proteins from ATP-binding cassette (ABC) transporter complexes, and obtain accurate predictions in these systems as well. Finally, we present two metrics that accurately distinguish interacting protein families from non-interacting ones, using only sequence data.

SIGNIFICANCE Specific protein-protein interactions play crucial roles in the stability of multi-protein complexes and in signal transduction. Thus, mapping these interactions is key to a systems-level understanding of cells. Systematic experimental identification of protein interaction partners is still challenging. However, a large and rapidly growing amount of sequence data is now available. Is it possible to identify which proteins interact just from their sequences? We propose an approach based on sequence covariation, building on methods used with success to predict the three-dimensional structures of proteins from sequences alone. Our method identifies specific interaction partners with high accuracy among the members of several ubiquitous prokaryotic protein families, and provides a way to predict protein-protein interactions directly from sequence data.
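
The idea of predicting partners and scoring the result by a true positive fraction can be illustrated with a minimal Python sketch. It assumes that a matrix of pairwise interaction scores (called `energies` here, a hypothetical name, with lower values meaning more favorable) has already been computed, for example from inferred couplings, for the candidate proteins of two families within one species. The greedy one-to-one matching and the definition of the true positive fraction as the share of predicted pairs that are correct are illustrative assumptions, not a description of the authors' implementation.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def greedy_pairing(energies: Sequence[Sequence[float]]) -> List[Pair]:
    """Match rows (family A) to columns (family B) one-to-one.

    Hypothetical helper: repeatedly accepts the lowest-energy pair whose
    two members are both still unmatched.
    """
    # Enumerate every candidate pair, sorted from most to least favorable.
    candidates = sorted(
        (energies[i][j], i, j)
        for i in range(len(energies))
        for j in range(len(energies[i]))
    )
    used_a, used_b = set(), set()
    pairs: List[Pair] = []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            pairs.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return pairs


def true_positive_fraction(predicted: Sequence[Pair], known: Sequence[Pair]) -> float:
    """Fraction of predicted pairs that appear among the known pairs (assumed definition)."""
    if not predicted:
        return 0.0
    known_set = set(known)
    return sum(1 for pair in predicted if pair in known_set) / len(predicted)


if __name__ == "__main__":
    # Toy scores for three proteins per family in a single species (made-up numbers).
    energies = [
        [-3.0, -1.0, 0.5],
        [-2.5, -2.0, 0.0],
        [0.2, -0.1, -1.5],
    ]
    predicted = greedy_pairing(energies)
    print(predicted)  # [(0, 0), (1, 1), (2, 2)]
    print(true_positive_fraction(predicted, [(0, 0), (1, 1), (2, 2)]))  # 1.0
```

In this toy example the second protein of family A also scores well against the first protein of family B, but that partner is already taken by a more favorable pair, so the matching stays one-to-one.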
Document type :
Journal articles

Contributor : Anne-Florence Bitbol
Submitted on : Thursday, January 11, 2018 - 4:05:16 PM
Last modification on : Monday, December 14, 2020 - 9:43:09 AM
Long-term archiving on : Wednesday, May 23, 2018 - 5:23:37 PM


Publication funded by an institution



Anne-Florence Bitbol, Robert S. Dwyer, Lucy J. Colwell, Ned S. Wingreen. Inferring interaction partners from protein sequences. Proceedings of the National Academy of Sciences of the United States of America, National Academy of Sciences, 2016, 113 (43), pp.12180-12185. ⟨10.1073/pnas.1606762113⟩. ⟨hal-01636994⟩


