Extracting food-drug interactions from scientific literature: relation clustering to address lack of data
Abstract
A Food-Drug Interaction (FDI) occurs when a food and a drug are taken
simultaneously and cause an unexpected effect. This paper tackles the
problem of mining scientific literature in order to extract these
interactions. We consider this problem as a relation extraction task
that can be solved with a classification method. Since Food-Drug
Interactions need a fine-grained description with many
relation types, we face data sparseness and a lack of examples
per relation type. To address this issue, we propose an effective
approach that groups relations sharing similar representations into
clusters, thereby mitigating the lack of examples. Cluster labels are then
used as labels of the dataset given to classifiers for the FDI type
identification. Our approach, relying on the extraction of relevant
features before, between, and after the entities associated by the
relation, significantly improves the performance of FDI
classification. Finally, we contrast an intuitive grouping method
based on the definition of the relation types with an unsupervised
clustering based on the instances of each relation type.
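
As an illustration only, the idea of grouping relation types by the similarity of their instances can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method evaluated in the paper: the bag-of-words features, the cosine similarity, the greedy grouping, the 0.6 threshold, the function names, and the toy sentences with their relation labels are all hypothetical choices made for the example.

```python
from collections import Counter
from math import sqrt

def context_features(tokens, e1, e2):
    """Count words before, between, and after the two entity positions."""
    feats = Counter()
    for i, tok in enumerate(tokens):
        if i in (e1, e2):
            continue  # skip the entity mentions themselves
        zone = "before" if i < e1 else "between" if i < e2 else "after"
        feats[f"{zone}={tok.lower()}"] += 1
    return feats

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_relation_types(examples, threshold):
    """Greedily group relation types whose summed feature vectors are similar.

    Assumption: a simple stand-in for a real clustering algorithm.
    Returns a mapping from each relation type to a cluster label.
    """
    per_type = {}
    for tokens, e1, e2, label in examples:
        per_type.setdefault(label, Counter()).update(
            context_features(tokens, e1, e2))
    clusters = []  # each entry: (summed feature vector, member relation types)
    for label in sorted(per_type):
        vec = per_type[label]
        best = max(clusters, key=lambda c: cosine(c[0], vec), default=None)
        if best is not None and cosine(best[0], vec) >= threshold:
            best[0].update(vec)
            best[1].append(label)
        else:
            clusters.append((Counter(vec), [label]))
    return {label: f"cluster_{i}"
            for i, (_, members) in enumerate(clusters)
            for label in members}

# Toy, invented instances: (tokens, food position, drug position, relation type)
examples = [
    ("FOOD increases the absorption of DRUG".split(), 0, 5, "increase_absorption"),
    ("FOOD increases the bioavailability of DRUG".split(), 0, 5, "increase_bioavailability"),
    ("FOOD reduces the effect of DRUG".split(), 0, 5, "decrease_effect"),
]
mapping = cluster_relation_types(examples, threshold=0.6)
# Relabel the dataset with cluster labels before training a classifier.
relabelled = [(tokens, e1, e2, mapping[label]) for tokens, e1, e2, label in examples]
```

In this toy run the two "increase" types end up in the same cluster, so a classifier trained on `relabelled` would see two examples for that label instead of one per original type.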