Using shallow linguistic features for relation extraction in bio-medical texts

Ali-Reza Ebadat 1 Vincent Claveau 1 Pascale Sébillot 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : In this paper, we model the corpus-based relation extraction task as a classification problem. We show that, in this framework, standard machine learning systems exploiting representations simply based on shallow linguistic information can rival state-of-the-art systems that rely on deep linguistic analysis. Even more effective systems can be obtained, still using these easy and reliable pieces of information, if the specifics of the extraction task and the data are taken into account. Our original method combining lazy learning and language modeling outperforms the existing systems when evaluated on the LLL2005 protein-protein interaction extraction task data.
Document type :
Conference papers
Contributor : Vincent Claveau
Submitted on : Wednesday, November 23, 2011 - 3:10:09 PM
Last modification on : Friday, November 16, 2018 - 1:24:23 AM


  • HAL Id : hal-00644070, version 1


Ali-Reza Ebadat, Vincent Claveau, Pascale Sébillot. Using shallow linguistic features for relation extraction in bio-medical texts. Traitement Automatique des Langues Naturelles, TALN, 2011, Montpellier, France. 125-130, short paper. ⟨hal-00644070⟩


