Automatic Classification of Tweets for Analyzing Communication Behavior of Museums

Abstract : In this paper, we present a study on tweet classification that aims to characterize the communication behavior of the 103 French museums that took part in the 2014 Twitter operation MuseumWeek. The tweets were automatically classified into four communication categories: sharing experience, promoting participation, interacting with the community, and promoting-informing about the institution. Our classification is multi-class. It combines Support Vector Machine and Naive Bayes methods and relies on a selection of eighteen subtypes of features of four different kinds: metadata information, punctuation marks, tweet-specific features, and lexical features. It was tested on a corpus of 1,095 tweets manually annotated by two experts in Natural Language Processing and Information Communication and by twelve Community Managers of French museums. We obtained a state-of-the-art F1-score of 72% with 10-fold cross-validation. This result is very encouraging, since it is even better than some state-of-the-art results reported in the tweet classification literature.
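
The abstract names four kinds of features but does not list the eighteen subtypes. As a rough illustration only, the Python sketch below computes a few hypothetical features of each kind for a single tweet. The `extract_features` helper, its metadata parameters, and every feature name are assumptions made for this example and are not taken from the paper.

```python
import re

URL_PATTERN = r"https?://\S+"


def extract_features(text, retweet_count=0, is_reply=False):
    """Return a flat dict of illustrative (hypothetical) features for one tweet."""
    # Drop URLs before tokenizing so their fragments are not counted as words.
    text_without_urls = re.sub(URL_PATTERN, " ", text)
    tokens = re.findall(r"[#@]?\w+", text_without_urls.lower())
    words = [t for t in tokens if not t.startswith(("#", "@"))]

    return {
        # Metadata information (assumed fields, supplied by the caller).
        "meta_retweet_count": retweet_count,
        "meta_is_reply": int(is_reply),
        # Punctuation marks.
        "punct_exclamations": text.count("!"),
        "punct_questions": text.count("?"),
        # Tweet-specific features.
        "tw_hashtags": len(re.findall(r"#\w+", text_without_urls)),
        "tw_mentions": len(re.findall(r"@\w+", text_without_urls)),
        "tw_urls": len(re.findall(URL_PATTERN, text)),
        # Lexical features.
        "lex_word_count": len(words),
        "lex_has_join": int("join" in words),
    }


example = (
    "Join us @ExampleMuseum for #MuseumWeek! "
    "What will you share? https://example.org/mw"
)
features = extract_features(example, retweet_count=3)
```

A dictionary of this shape could then be vectorized and passed to a classifier such as a Support Vector Machine or Naive Bayes model, evaluated with 10-fold cross-validation as the abstract describes.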

https://hal.archives-ouvertes.fr/hal-01758645
Contributor : Antoine Courtin
Submitted on : Wednesday, April 4, 2018 - 4:25:01 PM
Last modification on : Monday, March 18, 2019 - 4:21:46 PM

File

LREC2016_FoucaultCourtin_v-fin...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01758645, version 1

Citation

Nicolas Foucault, Antoine Courtin. Automatic Classification of Tweets for Analyzing Communication Behavior of Museums. Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 2016, Portorož, Slovenia. ⟨hal-01758645⟩
