Extracting protein-protein interactions with language modeling

Ali-Reza Ebadat 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : In this paper, we model the corpus-based relation extraction task, namely protein-protein interaction, as a classification problem. In that framework, we first show that standard machine learning systems exploiting representations simply based on shallow linguistic information can rival state-of-the-art systems that rely on deep linguistic analysis. We also show that it is possible to obtain even more effective systems, still using these easy and reliable pieces of information, if the specifics of the extraction task and the data are taken into account. Our original method combining lazy learning and language modelling outperforms the existing systems when evaluated on the LLL2005 protein-protein interaction extraction task data.
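The abstract does not spell out how the classifier is built, so the following is only a minimal sketch of the general idea of using language models for classification: one add-one smoothed bigram model is trained per class on shallow token sequences, and a new sequence receives the label of the model that scores it highest. It is not the author's method (in particular, it leaves out the lazy-learning component), and the class names, helper functions, and toy training sequences are all invented for illustration.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one smoothed bigram language model over token sequences."""

    def __init__(self, sequences, vocab):
        # +2 accounts for the <s> and </s> boundary markers.
        self.vocab_size = len(vocab) + 2
        self.bigrams = Counter()
        self.contexts = Counter()
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, seq):
        """Log-probability of a token sequence under this model."""
        padded = ["<s>"] + list(seq) + ["</s>"]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.contexts[prev] + self.vocab_size
            total += math.log(num / den)
        return total


def train(examples):
    """Build one language model per label from (tokens, label) pairs."""
    by_label = {}
    vocab = set()
    for tokens, label in examples:
        by_label.setdefault(label, []).append(tokens)
        vocab.update(tokens)
    return {label: BigramLM(seqs, vocab) for label, seqs in by_label.items()}


def classify(models, tokens):
    """Return the label whose model assigns the highest log-probability."""
    return max(models, key=lambda label: models[label].log_prob(tokens))


# Toy data (invented): lemmas found between two protein mentions.
examples = [
    (["activate", "transcription", "of"], "interaction"),
    (["repress", "expression", "of"], "interaction"),
    (["and"], "none"),
    (["be", "express", "with"], "none"),
]
models = train(examples)
print(classify(models, ["activate", "expression", "of"]))
```

Comparing per-class sequence likelihoods keeps the representation shallow (tokens or lemmas only), which is in the spirit of the abstract's claim that no deep linguistic analysis is required.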
Document type :
Conference papers

Contributor : Vincent Claveau
Submitted on : Sunday, August 24, 2014 - 11:40:17 PM
Last modification on : Friday, November 16, 2018 - 1:25:34 AM
Long-term archiving on : Thursday, November 27, 2014 - 2:01:03 PM


  • HAL Id : hal-01057652, version 1


Ali-Reza Ebadat. Extracting protein-protein interactions with language modeling. Student research workshop in conjunction with RANLP 2011, Sep 2011, Bulgaria. ⟨hal-01057652⟩


