Conference paper, Year: 2013

HashGraph : an expressive and scalable Twitter users profile for recommendation

Julien Subercaze
Christophe Gravier
Frédérique Laforest

Abstract

Microblogging websites such as Twitter produce tremendous amounts of data each second. Identifying people to follow is a heavy task that users cannot complete on their own. Consequently, real-time recommendation systems require very efficient algorithms to quickly process this massive amount of data, so as to recommend users with similar interests. In this paper we present a tractable algorithm to build user profiles out of their tweets. We propose a scalable and extensible way of building content-based user profiles in real time. Scalability refers to the relative complexity of the algorithms involved in building the user profiles with respect to state-of-the-art solutions. Extensibility means avoiding recomputation of the model for newcomers. Our model is a graph of term co-occurrences, driven by the fact that users sharing similar interests will share similar terms. We show how this model can be encoded as a binary footprint, which speeds up profile comparison. We provide an empirical study to measure how the distance between users in the hash space differs from the distance between users computed with standard Information Retrieval techniques. This experiment is based on a Twitter dataset we crawled, which contains 25K users and 1 million tweets. Our approach is driven by real-time analysis requirements and is thus oriented toward a trade-off between expressivity and efficiency. Experimental results show that our approach outperforms the vector space model by three orders of magnitude, with a precision of 58%.
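
To make the pipeline described in the abstract more concrete, here is a minimal sketch of the general idea: build a term co-occurrence graph from a user's tweets, encode it as a fixed-width binary footprint, and compare two users by Hamming distance. The abstract does not say how the footprint is computed, so this sketch substitutes a SimHash-style encoding as an assumption; it is not the authors' method. The function names, the 64-bit width, and the whitespace tokenizer are all hypothetical choices made for illustration.

```python
import hashlib
from collections import Counter
from itertools import combinations

BITS = 64  # footprint width; an arbitrary choice for this sketch


def cooccurrence_edges(tweets):
    """Count unordered term pairs that appear together in the same tweet."""
    edges = Counter()
    for tweet in tweets:
        # Naive whitespace tokenizer (assumption); sorting makes pairs canonical.
        terms = sorted(set(tweet.lower().split()))
        edges.update(combinations(terms, 2))
    return edges


def edge_hash(edge):
    """Stable 64-bit hash of an edge (the built-in hash() is salted per process)."""
    key = "\x00".join(edge).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")


def footprint(edges):
    """SimHash-style footprint: each edge votes on every bit, weighted by its count."""
    votes = [0] * BITS
    for edge, weight in edges.items():
        h = edge_hash(edge)
        for bit in range(BITS):
            votes[bit] += weight if (h >> bit) & 1 else -weight
    return sum(1 << bit for bit in range(BITS) if votes[bit] > 0)


def hamming(a, b):
    """Number of bit positions in which two footprints differ."""
    return bin(a ^ b).count("1")


# Two users with the same tweets in a different order yield the same graph,
# hence the same footprint.
alice = footprint(cooccurrence_edges(["python graph hashing", "graph hashing twitter"]))
bob = footprint(cooccurrence_edges(["graph hashing twitter", "python graph hashing"]))
print(hamming(alice, bob))  # 0
```

Comparing two footprints costs one XOR and a bit count regardless of vocabulary size, which illustrates why a binary encoding can make profile comparison cheap relative to comparing sparse term vectors. A new user's footprint is computed from that user's tweets alone, so nothing has to be recomputed for existing users.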

Domains

Web

Main file

wi2013_4_.pdf (189.65 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00973728, version 1 (04-04-2014)

Identifiers

  • HAL Id: hal-00973728, version 1

Cite

Julien Subercaze, Christophe Gravier, Frédérique Laforest. HashGraph : an expressive and scalable Twitter users profile for recommendation. 2013 IEEE/WIC/ACM International Conference on Web Intelligence (WI'13), Nov 2013, Atlanta, United States. pp.101-108. ⟨hal-00973728⟩