INEX Tweet Contextualization Task: Evaluation, Results and Lesson Learned

Microblogging platforms such as Twitter are increasingly used for online client and market analysis. This motivated the proposal of a new Tweet Contextualization track at the CLEF INEX lab. The objective of the task was to help a user understand a tweet by providing a short explanatory summary (500 words). The summary had to be built automatically from resources such as Wikipedia, by extracting relevant passages and aggregating them into a coherent text. After four years of the task, results show that the best systems combine NLP techniques with more traditional methods. More precisely, the best-performing systems combine passage retrieval, sentence segmentation and scoring, named entity recognition, part-of-speech (POS) analysis, anaphora detection, a content diversity measure, and sentence reordering. This paper provides a full summary report on the four-year task. While the yearly overviews focused on system results, this paper gives a detailed report on the approaches proposed by the participants, which can be considered the state of the art for this task. As an important outcome of the four-year competition, we also describe the open-access resources that have been built and collected. The evaluation measures for automatic summarization designed in DUC or MUC were not appropriate for evaluating tweet contextualization; we explain why and describe in detail the LogSim measure used to evaluate the informativeness of the produced contexts or summaries. Finally, we discuss the lessons we learned, which are worth considering when designing such a task.

Keywords

Short text contextualization; tweet contextualization; tweet understanding; automatic summarization; contextual information retrieval; question answering; focus information retrieval; natural language processing; Wikipedia; text readability; text informativeness; textual references; Kullback-Leibler divergence

Domains

Computer Science [cs]

Main file

bellot_16939.pdf (259.26 KB)

Origin: Files produced by the author(s)

Contributor: William Domingues Vinhas

https://amu.hal.science/hal-01479297

Submitted on: Wednesday, January 22, 2020, 10:15:56

Last modified on: Tuesday, December 5, 2023, 18:08:07

Long-term archiving on: Thursday, April 23, 2020, 14:00:18

Dates and versions

hal-01479297, version 1 (22-01-2020)

License

Attribution - NonCommercial - NoDerivatives

Identifiers

HAL Id: hal-01479297, version 1
DOI : 10.1016/j.ipm.2016.03.002

Cite

Patrice Bellot, Véronique Moriceau, Josiane Mothe, Eric San Juan, Xavier Tannier. INEX Tweet Contextualization Task: Evaluation, Results and Lesson Learned. Information Processing and Management, 2016, 52 (5), pp.801-819. ⟨10.1016/j.ipm.2016.03.002⟩. ⟨hal-01479297⟩

Collections

UNIV-AVIGNON UNIV-TLSE2 UNIV-TLN CNRS UNIV-AMU LIMSI SMS LSIS LSIS-DIMAG UT1-CAPITOLE UNIV-PARIS-SACLAY LIS-LAB SORBONNE-UNIVERSITE IRIT IRIT-SIG LISN GS-ENGINEERING IRIT-GD GS-COMPUTER-SCIENCE IRIT-UT2J IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP HESAM IRENAV LAMPA LCPI LABOMAP LISPEN MSMP
