
Large-Scale Evaluation of Keyphrase Extraction Models

Abstract: Keyphrase extraction models are usually evaluated under different, not directly comparable, experimental setups. As a result, it remains unclear how well proposed models actually perform, and how they compare to each other. In this work, we address this issue by presenting a systematic large-scale analysis of state-of-the-art keyphrase extraction models involving multiple benchmark datasets from various sources and domains. Our main results reveal that state-of-the-art models are in fact still challenged by simple baselines on some datasets. We also present new insights about the impact of using author- or reader-assigned keyphrases as a proxy for gold standard, and give recommendations for strong baselines and reliable benchmark datasets.
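To make the "simple baselines" claim concrete, the sketch below shows one such baseline: ranking a document's words by TF-IDF against a background corpus. This is an illustrative toy implementation, not the code evaluated in the paper (the paper's exact baselines and preprocessing are not reproduced here).

```python
import math
import re
from collections import Counter

def tfidf_keyphrases(doc, corpus, top_n=5):
    """Toy TF-IDF baseline: rank the words of `doc` by tf * idf,
    where idf is computed over the background `corpus`."""
    tokenize = lambda text: re.findall(r"[a-z]+", text.lower())
    tf = Counter(tokenize(doc))
    n_docs = len(corpus)
    # document frequency of each candidate word over the background corpus
    df = {w: sum(1 for d in corpus if w in tokenize(d)) for w in tf}
    # smoothed idf to avoid division by zero for unseen words
    scores = {w: tf[w] * math.log((1 + n_docs) / (1 + df[w])) for w in tf}
    return [w for w, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]]
```

Real baselines of this family typically extract noun-phrase candidates rather than single words, but the scoring idea is the same.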


https://hal.archives-ouvertes.fr/hal-02878953
Contributor: Florian Boudin
Submitted on: Tuesday, June 23, 2020 - 1:36:20 PM
Last modification on: Wednesday, August 5, 2020 - 3:44:57 AM
Long-term archiving on: Thursday, September 24, 2020 - 5:06:01 PM

File: large_scale_exp(1).pdf (produced by the author(s))


Citation

Ygor Gallina, Florian Boudin, Béatrice Daille. Large-Scale Evaluation of Keyphrase Extraction Models. ACM/IEEE Joint Conference on Digital Libraries (JCDL), Aug 2020, Wuhan, China. ⟨10.1145/1122445.1122456⟩. ⟨hal-02878953⟩
