Contextual Bandit for Active Learning: Active Thompson Sampling - HAL open archive
Preprint, Working Paper. Year: 2014

Contextual Bandit for Active Learning: Active Thompson Sampling

Abstract

Labelling training examples is a costly task in supervised classification. Active learning strategies address this problem by selecting the most useful unlabelled examples to train a predictive model. The choice of examples to label can be seen as a dilemma between exploration and exploitation over the data space representation. In this paper, a novel active learning strategy manages this trade-off by modelling the active learning problem as a contextual bandit problem. We propose a sequential algorithm named Active Thompson Sampling (ATS), which, in each round, assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for the label of the sampled point. Experimental comparison to previously proposed active learning algorithms shows superior performance on a real application dataset.
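The round structure described in the abstract (assign a distribution over the pool, sample one point, query the oracle) can be sketched as below. This is a minimal illustration and not the authors' implementation: the abstract does not say how ATS computes its sampling distribution, so the sketch takes it as a pluggable function. The names `run_active_sampling`, `make_distribution`, and `uniform_distribution` are hypothetical, and the uniform distribution is only a placeholder for whatever bandit-based distribution the paper defines.

```python
import numpy as np


def uniform_distribution(pool, unlabeled, labeled):
    """Placeholder (assumption): equal probability for every unlabelled point.

    The distribution used by ATS is not given in the abstract; any function
    with this signature can be swapped in.
    """
    return np.full(len(unlabeled), 1.0 / len(unlabeled))


def run_active_sampling(pool, oracle, n_rounds,
                        make_distribution=uniform_distribution, seed=0):
    """Pool-based loop: build a distribution, sample one point, query its label."""
    rng = np.random.default_rng(seed)
    unlabeled = list(range(len(pool)))   # indices still without a label
    labeled = {}                         # pool index -> label from the oracle

    for _ in range(n_rounds):
        if not unlabeled:
            break  # nothing left to query

        # 1. Assign a sampling distribution over the unlabelled pool.
        probs = np.asarray(make_distribution(pool, unlabeled, labeled), dtype=float)
        probs = probs / probs.sum()      # normalise defensively

        # 2. Sample one point from that distribution.
        position = int(rng.choice(len(unlabeled), p=probs))
        index = unlabeled.pop(position)

        # 3. Query the oracle for the label of the sampled point.
        labeled[index] = oracle(pool[index])

    return labeled


# Toy usage with a hypothetical oracle that labels points by a threshold.
pool = np.array([[0.0], [1.0], [2.0], [3.0]])
labels = run_active_sampling(pool, oracle=lambda x: int(x[0] >= 2), n_rounds=3)
```

Keeping the distribution behind a function argument leaves the loop unchanged when a Thompson-sampling-based distribution replaces the uniform placeholder.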
Main file
Contextual_Bandit_for_Active_Learning.pdf (368.86 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01069802, version 1 (29-09-2014)

Identifiers

  • HAL Id: hal-01069802, version 1

Cite

Djallel Bouneffouf, Romain Laroche, Tanguy Urvoy, Raphael Féraud, Robin Allesiardo. Contextual Bandit for Active Learning: Active Thompson Sampling. 2014. ⟨hal-01069802⟩