WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media

Abstract : Because of the huge amount of data produced on large social media, capturing useful content usually requires focusing on subsets of data that fit a pre-specified need. Given the usual API restrictions of these media, we formulate this focused capture task as a dynamic data source selection problem. We then propose a machine learning methodology, named WhichStreams, which is based on an extension of a recently proposed combinatorial bandit algorithm. The evaluation of our approach on various Twitter datasets, in both offline and online settings, demonstrates the relevance of the proposal for leveraging the real-time data streaming APIs offered by most of the main social media.
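
To make the "dynamic data source selection" framing concrete, the following is a minimal sketch of a generic UCB-style combinatorial bandit that picks k sources to listen to at each round. It is not the WhichStreams algorithm itself, whose extension is not described in this record. The function names (`select_sources`, `run_capture`), the reward function, and the exploration term are all assumptions made for illustration.

```python
import math


def select_sources(counts, means, t, k):
    """Return the k source indices with the highest UCB index at round t."""
    def index(i):
        if counts[i] == 0:
            return float("inf")  # listen to every source at least once
        # Empirical mean reward plus an exploration bonus (assumed UCB1 form).
        return means[i] + math.sqrt(2.0 * math.log(t) / counts[i])
    return sorted(range(len(counts)), key=index, reverse=True)[:k]


def run_capture(reward_fn, n_sources, k, n_rounds):
    """Simulate n_rounds of capture, listening to k sources per round.

    reward_fn(i) is a hypothetical stand-in for the relevance score of the
    content collected from source i during one round.
    """
    counts = [0] * n_sources   # how often each source was listened to
    means = [0.0] * n_sources  # running mean reward per source
    for t in range(1, n_rounds + 1):
        for i in select_sources(counts, means, t, k):
            r = reward_fn(i)
            counts[i] += 1
            means[i] += (r - means[i]) / counts[i]  # incremental mean update
    return counts, means


if __name__ == "__main__":
    # Toy setting: only sources 0 and 1 ever produce relevant content.
    counts, means = run_capture(lambda i: 1.0 if i < 2 else 0.0,
                                n_sources=6, k=2, n_rounds=200)
    print(counts)
```

In this toy setting the selection concentrates on the two relevant sources while still revisiting the others occasionally, which is the exploration and exploitation trade-off that a limit on simultaneously followed streams imposes.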

https://hal.archives-ouvertes.fr/hal-01355397
Contributor : Thibault Gisselbrecht
Submitted on : Tuesday, August 23, 2016 - 1:43:40 PM
Last modification on : Thursday, March 21, 2019 - 2:18:08 PM

Identifiers

  • HAL Id : hal-01355397, version 1

Citation

Thibault Gisselbrecht, Patrick Gallinari, Sylvain Lamprier, Ludovic Denoyer. WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media. Ninth International Conference on Web and Social Media, ICWSM 2015, May 2015, Oxford, United Kingdom. pp.130-139. ⟨hal-01355397⟩
