Conference papers

Automatic Classification of Tweets for Analyzing Communication Behavior of Museums

Abstract: In this paper, we present a study on tweet classification that aims to characterize the communication behavior of the 103 French museums that took part in the 2014 Twitter operation MuseumWeek. The tweets were automatically classified into four communication categories: sharing experience, promoting participation, interacting with the community, and promoting-informing about the institution. Our classification is multi-class. It combines Support Vector Machines and Naive Bayes methods and relies on a selection of eighteen subtypes of features of four different kinds: metadata information, punctuation marks, tweet-specific features, and lexical features. It was tested against a corpus of 1,095 tweets manually annotated by two experts in Natural Language Processing and Information Communication and by twelve Community Managers of French museums. We obtained a state-of-the-art F1-score of 72% by 10-fold cross-validation. This result is very encouraging, since it is even better than some state-of-the-art results found in the tweet classification literature.
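As a rough illustration of two ingredients named in the abstract, the sketch below extracts a few punctuation and tweet-specific features from a tweet and computes an F1-score averaged over classes. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not list the eighteen feature subtypes or say how F1 was averaged, so the feature names, the helper functions `extract_features` and `macro_f1`, and the choice of macro-averaging are all hypothetical.

```python
import re

# The four communication categories named in the abstract.
CATEGORIES = [
    "sharing experience",
    "promoting participation",
    "interacting with the community",
    "promoting-informing about the institution",
]


def extract_features(text):
    """Return illustrative punctuation and tweet-specific features.

    These six features are assumptions chosen for illustration; they are
    not the eighteen subtypes used in the paper.
    """
    return {
        "n_exclamation": text.count("!"),
        "n_question": text.count("?"),
        "n_hashtags": len(re.findall(r"#\w+", text)),
        "n_mentions": len(re.findall(r"@\w+", text)),
        "has_url": bool(re.search(r"https?://\S+", text)),
        "is_retweet": text.startswith("RT "),
    }


def macro_f1(gold, pred, labels):
    """Compute the unweighted mean of per-class F1 scores (assumed averaging)."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    tweet = "RT @museum: Venez nombreux ! #MuseumWeek http://t.co/x"
    print(extract_features(tweet))
    gold = [CATEGORIES[0], CATEGORIES[0], CATEGORIES[1], CATEGORIES[1]]
    pred = [CATEGORIES[0], CATEGORIES[1], CATEGORIES[1], CATEGORIES[1]]
    print(round(macro_f1(gold, pred, CATEGORIES[:2]), 3))
```

In a 10-fold cross-validation setting, a score like this would be computed on each held-out fold and then averaged over the ten folds.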

Contributor: Antoine Courtin
Submitted on: Wednesday, April 4, 2018 - 4:25:01 PM
Last modification on: Thursday, December 10, 2020 - 12:32:05 PM


Files produced by the author(s)


  • HAL Id: hal-01758645, version 1



Nicolas Foucault, Antoine Courtin. Automatic Classification of Tweets for Analyzing Communication Behavior of Museums. Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 2016, Portorož, Slovenia. ⟨hal-01758645⟩


