IRISA participation to BioNLP-ST 2013: lazy-learning and information retrieval for information extraction tasks

Vincent Claveau (1)
(1) TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This paper describes the information extraction techniques developed for the participation of IRISA-TexMex in the following BioNLP-ST13 tasks: Bacterial Biotope subtasks 1 and 2, and Gene Regulation Network. The approaches are general-purpose: they rely on neither specialized preprocessing nor specialized external data, and they are expected to work independently of the domain of the texts processed. They are based on standard machine learning techniques, but we emphasize the use of similarity measures inherited from information retrieval (Okapi-BM25 (Robertson et al., 1998), language modeling (Hiemstra, 1998)). The good results obtained on these tasks show that these simple settings are competitive, provided that the representation and similarity measure chosen are well suited to the task.
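As a rough illustration of the idea summarized in the abstract (lazy learning driven by an information retrieval similarity), the sketch below labels a new text by a k-nearest-neighbour vote in which neighbours are ranked by an Okapi-BM25 score. It is not the paper's implementation: the function names, the toy training examples, the labels, the parameter values k1 = 1.2 and b = 0.75, and the particular IDF variant are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score every document against the query with a common BM25 variant.

    The IDF used here, log((N - df + 0.5) / (df + 0.5) + 1), is one widespread
    non-negative formulation; it is an assumption, not the paper's exact choice.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        # Length normalisation term of BM25.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for term in set(query_tokens):
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


def knn_label(query_tokens, train, k=3):
    """Lazy learning: no model is built; the k most similar training
    examples vote for their label, weighted by their BM25 score."""
    docs = [tokens for tokens, _ in train]
    labels = [label for _, label in train]
    ranked = sorted(zip(bm25_scores(query_tokens, docs), labels),
                    key=lambda pair: pair[0], reverse=True)
    votes = Counter()
    for score, label in ranked[:k]:
        votes[label] += score
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: (tokens, label) pairs invented for this sketch.
train = [
    ("bacteria isolated from soil sample".split(), "habitat"),
    ("gene sigB activates transcription of yvyD".split(), "regulation"),
    ("strain found in marine sediment".split(), "habitat"),
    ("protein represses expression of the operon".split(), "regulation"),
]

print(knn_label("isolated from marine soil".split(), train))
```

Because nothing is trained in advance, the choice of text representation (here, plain whitespace tokens) and of similarity measure carries most of the weight, which is the point the abstract makes about these simple settings.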

https://hal.archives-ouvertes.fr/hal-00912308
Contributor : Vincent Claveau
Submitted on : Sunday, December 1, 2013 - 10:27:31 PM
Last modification on : Friday, November 16, 2018 - 1:23:34 AM
Long-term archiving on : Monday, March 3, 2014 - 8:46:56 PM

File

Claveau_bioNLP13.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00912308, version 1

Citation

Vincent Claveau. IRISA participation to BioNLP-ST 2013: lazy-learning and information retrieval for information extraction tasks. BioNLP Workshop, colocated with ACL 2013, Aug 2013, Bulgaria. pp.188-196. ⟨hal-00912308⟩
