Thematic Representation of Short Text Messages with Latent Topics: Application in the Twitter context

Abstract : The amount of information exchanged over the Internet is continuously growing, increasingly in the form of short text messages on microblogging platforms such as Twitter. Because these messages are limited in size, understanding them may require knowledge of the context in which they occur. In this paper, we propose a higher-level representation of short text messages based on a thematic model obtained with Latent Dirichlet Allocation (LDA). We evaluate the effectiveness of this representation by using it in the experimental setup of the INEX 2012 tweet contextualization task. The topic-based representation makes it possible to extend the message vocabulary by retrieving a set of thematically related words. The results demonstrate the value of this topic-space approach for the tweet contextualization task.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01319779
Contributor : Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on : Monday, May 23, 2016 - 8:56:41 AM
Last modification on : Saturday, March 23, 2019 - 1:22:13 AM

Identifiers

  • HAL Id : hal-01319779, version 1

Citation

Mohamed Morchid, Richard Dufour, Georges Linarès. Thematic Representation of Short Text Messages with Latent Topics: Application in the Twitter context. PACLING 2013, Sep 2013, Tokyo, Japan. ⟨hal-01319779⟩
