Conference paper, year: 2020

Classifying Micro-text Document Datasets: Application to Query Expansion of Crisis-Related Tweets

Raj Ratn Pranesh
  • Role: Author
Javier A Espinosa-Oviedo

Abstract

Twitter is an active communication channel for spreading information during crises (e.g., earthquakes). To exploit this information, civilians need to explore the tweets produced during a crisis period, for instance to get information about crisis-related events (e.g., landslides, building collapses) and their associated relief actions (e.g., gathering of food supplies, search for victims). However, such Twitter usage demands significant effort, and answers must be accurate to support the coordination of actions in response to crisis events (e.g., avoiding a massive concentration of efforts in only one place). This requirement calls for efficient information classification so that people can perform agile and useful relief actions. This paper introduces an approach based on classification and query expansion techniques in the context of micro-text (i.e., tweet) search. In our approach, a user's query is rewritten using a classified vocabulary derived from the top-k results, to better reflect the user's search intent. For classification purposes, we study and compare different models to find the one that can best provide answers to a user query. Our experimental results show that the use of Multi-Task Deep Neural Network (MT-DNN) models further improves micro-text classification. The results also demonstrate that our query expansion method is effective and reduces noise in the expanded query terms when searching for crisis tweets in Twitter datasets.
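
The query-rewriting idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the names `tokenize`, `toy_classify`, and `expand_query` are hypothetical, the ranking is a plain term-overlap count, and the keyword rule in `toy_classify` only stands in for the trained classifiers (such as MT-DNN) that the paper compares.

```python
import re
from collections import Counter
from typing import Callable, List

# Small illustrative stop-word list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "in", "of", "and", "to", "is", "for", "on", "at"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def toy_classify(tweet: str) -> str:
    """Hypothetical stand-in for a trained tweet classifier."""
    rescue_words = {"rescue", "victims", "trapped"}
    return "rescue" if rescue_words & set(tokenize(tweet)) else "other"


def expand_query(
    query: str,
    tweets: List[str],
    classify: Callable[[str], str],
    k: int = 3,
    n_terms: int = 2,
) -> List[str]:
    """Expand a query with frequent terms from the dominant class of the top-k tweets."""
    q_terms = tokenize(query)
    # Rank tweets by how many query terms they share (simple overlap score).
    ranked = sorted(tweets, key=lambda t: -len(set(q_terms) & set(tokenize(t))))
    top_k = ranked[:k]
    # Classify the top-k results and keep the majority class.
    labels = [classify(t) for t in top_k]
    target = Counter(labels).most_common(1)[0][0]
    # Count candidate terms only in tweets of the majority class; tweets from
    # other classes are the "noise" this filtering step tries to exclude.
    counts: Counter = Counter()
    for tweet, label in zip(top_k, labels):
        if label == target:
            counts.update(w for w in tokenize(tweet) if w not in q_terms)
    return q_terms + [w for w, _ in counts.most_common(n_terms)]


if __name__ == "__main__":
    sample_tweets = [
        "Building collapse downtown after earthquake, rescue teams searching for victims",
        "Earthquake rescue teams need volunteers, victims trapped",
        "Earthquake memes are the best lol",
        "Food supply gathering at the stadium",
    ]
    print(expand_query("earthquake victims", sample_tweets, toy_classify))
```

Restricting the expansion vocabulary to the majority class is one simple way to keep off-topic terms from the top-k results out of the rewritten query; a real system would replace both the overlap ranking and the keyword rule with proper retrieval and classification models.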
Main file
STRAPS_2020_DataExploration.pdf (300.33 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03183462, version 1 (27-03-2021)

Identifiers

  • HAL Id: hal-03183462, version 1

Cite

Mehrdad Farokhnejad, Raj Ratn Pranesh, Javier A Espinosa-Oviedo. Classifying Micro-text Document Datasets: Application to Query Expansion of Crisis-Related Tweets. Proceedings of the Workshops of the ICSOC 2020 Joint Conference, Dec 2020, Dubai, United Arab Emirates. ⟨hal-03183462⟩
