WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media

Abstract: Because of the huge amount of data produced on large social media, capturing useful content usually means focusing on subsets of data that match a pre-specified need. Given the usual API restrictions of these media, we formulate this task of focused capture as a dynamic data source selection problem. We then propose a machine learning methodology, named WhichStreams, which is based on an extension of a recently proposed combinatorial bandit algorithm. The evaluation of our approach on various Twitter datasets, in both offline and online settings, demonstrates the relevance of the proposal for leveraging the real-time data streaming APIs offered by most of the main social media.
Document type:
Conference paper
Ninth International Conference on Web and Social Media, ICWSM 2015, May 2015, Oxford, United Kingdom. pp.130-139, 2015

https://hal.archives-ouvertes.fr/hal-01355397
Contributor: Thibault Gisselbrecht
Submitted on: Tuesday, 23 August 2016 - 13:43:40
Last modified on: Thursday, 22 November 2018 - 14:31:19

Identifiers

  • HAL Id: hal-01355397, version 1

Citation

Thibault Gisselbrecht, Patrick Gallinari, Sylvain Lamprier, Ludovic Denoyer. WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media. Ninth International Conference on Web and Social Media, ICWSM 2015, May 2015, Oxford, United Kingdom. pp.130-139, 2015. 〈hal-01355397〉
