Conference paper, Year: 2019

Query selection methods for automated corpora construction with a use case in food-drug interactions

Abstract

In this paper, we address the problem of automatically constructing a relevant corpus of scientific articles about food-drug interactions. A growing number of scientific publications describe food-drug interactions, but building a high-coverage corpus that can be used for information extraction is currently not trivial. We investigate several methods for automating the query selection process using an expert-curated corpus of food-drug interactions. Our experiments show that index term features combined with a decision tree classifier are the best approach for this task, and that feature selection approaches, gain ratio in particular, outperform frequency-based methods for query selection.
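
To make the gain ratio criterion mentioned in the abstract concrete, here is a minimal sketch of how index terms could be scored as candidate queries. It is not the authors' implementation: the function names (`entropy`, `gain_ratio`, `rank_terms`) and the four toy documents are hypothetical, and each document is assumed to be reduced to a set of index terms with a binary relevance label.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain_ratio(docs, labels, term):
    """Gain ratio of the binary feature 'term is an index term of the document'.

    docs   -- list of sets of index terms (assumed representation)
    labels -- parallel list of 0/1 relevance labels
    """
    n = len(labels)
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    # Information gain: label entropy minus entropy after splitting on the term.
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the term's presence/absence itself.
    split_info = entropy([term in d for d in docs])
    return info_gain / split_info if split_info > 0 else 0.0

def rank_terms(docs, labels):
    """Rank all index terms by decreasing gain ratio (ties broken alphabetically)."""
    vocabulary = set().union(*docs)
    return sorted(vocabulary, key=lambda t: (-gain_ratio(docs, labels, t), t))

# Made-up toy data for illustration only: 1 = relevant, 0 = not relevant.
docs = [
    {"Food-Drug Interactions", "Humans"},
    {"Food-Drug Interactions", "Citrus paradisi"},
    {"Humans", "Neoplasms"},
    {"Humans"},
]
labels = [1, 1, 0, 0]

ranking = rank_terms(docs, labels)
print(ranking[0])
```

Gain ratio is symmetric: a term that perfectly predicts irrelevance scores as high as one that perfectly predicts relevance, so a query selection step built on this sketch would also need to check that a top-ranked term occurs mostly in relevant documents.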
Main file
BordeaG-2019-Query-Article_de_colloque.pdf (358.34 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02371207 , version 1 (31-03-2021)

License

Attribution

Identifiers

Cite

Georgeta Bordea, Tsanta Randriatsitohaina, Natalia Grabar, Fleur Mougin, Thierry Hamon. Query selection methods for automated corpora construction with a use case in food-drug interactions. ACL Workshop on Biomedical Natural Language Processing, Aug 2019, Florence, Italy. pp.115-124, ⟨10.18653/v1/W19-5013⟩. ⟨hal-02371207⟩
105 views
55 downloads