Analyse d'une tâche de substitution lexicale : quelles sont les sources de difficulté ? [Analysis of a lexical substitution task: what are the sources of difficulty?]

Abstract : This paper analyzes the results of the SemDis 2014 evaluation campaign, which was dedicated to a lexical substitution task in French. The gold standard is a dataset of 300 sentences, each associated with a list of substitutes that annotators proposed for a given target word. Our aim is to identify the main characteristics of this dataset that affect human annotation and the performance of the systems that participated in the campaign. Our evaluation is based on inter-annotator agreement scores and on the recall of the systems. We show that while several characteristics affect both aspects (rarity of the target word sense, frequency of the word), others are specific to the systems (degree of polysemy of the target word and characteristics of the sentence context).

https://hal.archives-ouvertes.fr/hal-01362232
Contributor : Ludovic Tanguy
Submitted on : Thursday, September 8, 2016 - 1:49:12 PM
Last modification on : Tuesday, July 9, 2019 - 10:12:57 AM
Long-term archiving on : Friday, December 9, 2016 - 1:25:42 PM

File

final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01362232, version 1

Citation

Ludovic Tanguy, Cécile Fabre, Camille Mercier. Analyse d'une tâche de substitution lexicale : quelles sont les sources de difficulté ? TALN, Jul 2016, Paris, France. ⟨hal-01362232⟩
