
Active Learning for Interactive Relation Extraction in a French Newspaper's Articles

Abstract : Relation extraction is a subtask of natural language processing that has seen many improvements in recent years, with the advent of complex pre-trained architectures. Many of these state-of-the-art approaches are tested against benchmarks with labelled sentences containing tagged entities, and require substantial pretraining and fine-tuning on task-specific data. However, in a real use-case scenario, such as a newspaper company mostly dedicated to local information, relations are of varied, highly specific types, with virtually no annotated data for such relations, and many entities co-occur in a sentence without being related. We question the use of supervised state-of-the-art models in such a context, where resources such as time, computing power and human annotators are limited. To adapt to these constraints, we experiment with an active-learning-based relation extraction pipeline, consisting of a lightweight binary LSTM-based model for detecting the relations that do exist, and a state-of-the-art model for relation classification. We compare several choices for classification models in this scenario, from basic word embedding averaging to graph neural networks and BERT-based ones, as well as several active learning acquisition strategies, in order to find the most cost-efficient yet accurate approach for the use case of the largest French daily newspaper company.
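
The abstract does not say which acquisition strategies were compared. As an illustration only, the sketch below shows one common family, uncertainty sampling, where the unlabelled examples the model is least sure about are sent to the annotator first. All names here (`entropy`, `least_confidence`, `select_for_annotation`, `pool_probs`) are hypothetical and the probabilities are toy values, not results from the paper.

```python
import math
from typing import Callable, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def least_confidence(probs: Sequence[float]) -> float:
    """One minus the probability of the most likely class."""
    return 1.0 - max(probs)


def select_for_annotation(
    pool_probs: Sequence[Sequence[float]],
    k: int,
    score: Callable[[Sequence[float]], float] = entropy,
) -> List[int]:
    """Return the indices of the k pool items with the highest uncertainty score."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy pool (assumed values): each row is (P(no relation), P(relation)) as it
# might be predicted by a binary relation detector for one entity pair.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.10, 0.90],
    [0.50, 0.50],
]

# Indices of the two entity pairs to hand to the human annotator next.
picked = select_for_annotation(pool_probs, k=2)
```

In an active learning loop, the selected items would be labelled, added to the training set, and the model retrained before the next selection round. Swapping `score=least_confidence` changes the acquisition strategy without touching the rest of the loop.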
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-03371917
Contributor : Pascale Sébillot
Submitted on : Saturday, October 9, 2021 - 9:54:09 AM
Last modification on : Friday, August 5, 2022 - 2:54:52 PM
Long-term archiving on : Monday, January 10, 2022 - 6:07:56 PM

File

ranlp2021.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03371917, version 1

Citation

Cyrielle Mallart, Michel Le Nouy, Guillaume Gravier, Pascale Sébillot. Active Learning for Interactive Relation Extraction in a French Newspaper's Articles. RANLP 2021 - Recent Advances in Natural Language Processing, Sep 2021, Online, Bulgaria. pp.886-894. ⟨hal-03371917⟩

