Investigating the Image of Entities in Social Media: Dataset Design and First Results

Abstract : This paper describes the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands, etc. Our main contribution is to build and provide an original annotated French dataset. It consists of 11,527 manually annotated tweets expressing opinions on specific facets (e.g., ethics, communication, economic project) that describe two French politicians over time. We believe that other researchers might benefit from this experience, since designing and implementing such a dataset has proven to be quite a challenge. The design comprises several processes, such as data selection and the formal definition and instantiation of an image. We have also set up a full open-source annotation platform. In addition to the dataset design, we present the first results obtained by applying clustering methods to the annotated dataset in order to extract the entity images.

https://hal.archives-ouvertes.fr/hal-02052420
Contributor : Julien Velcin
Submitted on : Thursday, February 28, 2019 - 2:25:54 PM
Last modification on : Wednesday, April 3, 2019 - 1:13:01 AM
Long-term archiving on : Wednesday, May 29, 2019 - 5:27:58 PM

File

LREC14_FINAL_VELCIN.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02052420, version 1

Citation

Julien Velcin, Caroline Brun, Jean-Yves Dormagen, Young-Min Kim, Claude Roux, et al. Investigating the Image of Entities in Social Media: Dataset Design and First Results. 9th International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland. ⟨hal-02052420⟩
