Real-Time, Scalable, Content-based Twitter users recommendation - HAL open archive
Journal article, Web Intelligence and Agent Systems, Year: 2016

Real-Time, Scalable, Content-based Twitter users recommendation

Julien Subercaze
Christophe Gravier
Frederique Laforest

Abstract

Real-time recommendation of Twitter users based on the content of their profiles is a very challenging task. Traditional IR methods such as TF-IDF fail to handle large datasets efficiently. In this paper we present a scalable approach that allows real-time recommendation of users based on their tweets. Our model builds a graph of terms, driven by the fact that users sharing similar interests will share similar terms. We show how this model can be encoded as a compact binary footprint that allows very fast comparison and ranking, taking full advantage of modern CPU architectures. We validate our approach through an empirical evaluation against Apache Lucene's implementation of TF-IDF. We show that our approach is on average two hundred times faster than a standard optimized implementation of TF-IDF, with a precision of 58%.
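To make the idea of a compact binary footprint concrete, here is a minimal sketch in Python. It is not the authors' encoding: the abstract does not describe how the term graph is turned into bits, so this sketch assumes a simple hashed bit vector of 64 bits (the names `footprint`, `hamming`, `rank` and the constant `FOOTPRINT_BITS` are hypothetical). It only illustrates why fixed-width footprints are cheap to compare: similarity reduces to an XOR followed by a bit count, operations that map to single instructions on modern CPUs.

```python
import hashlib

# Assumed footprint width; the paper's actual width is not given in the abstract.
FOOTPRINT_BITS = 64


def footprint(terms, bits=FOOTPRINT_BITS):
    """Build an illustrative bit-vector footprint from a collection of terms.

    Each term sets one bit, chosen by a stable hash. This is a stand-in for
    the paper's graph-based encoding, which is not specified here.
    """
    fp = 0
    for term in terms:
        digest = hashlib.sha1(term.lower().encode("utf-8")).digest()
        position = int.from_bytes(digest[:8], "big") % bits
        fp |= 1 << position
    return fp


def hamming(a, b):
    """Number of differing bits between two non-negative footprints."""
    return bin(a ^ b).count("1")


def rank(query_fp, candidates):
    """Return candidate names ordered from closest to farthest footprint.

    `candidates` maps a user name to that user's footprint.
    """
    return sorted(candidates, key=lambda name: hamming(query_fp, candidates[name]))
```

A possible usage would be to compute `footprint(...)` once per user from the terms in their tweets, store the resulting integers, and call `rank(footprint(query_terms), stored)` at query time. With a fixed width, each comparison costs the same regardless of how many terms a user has, which is the property the abstract contrasts with TF-IDF.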
Main file
document.pdf (1009.49 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01170244, version 1 (26-08-2015)

Identifiers

  • HAL Id: hal-01170244, version 1

Cite

Julien Subercaze, Christophe Gravier, Frederique Laforest. Real-Time, Scalable, Content-based Twitter users recommendation. Web Intelligence and Agent Systems, 2016, pp.17-29. ⟨hal-01170244⟩
231 views
986 downloads
