Contextual Bandit for Active Learning: Active Thompson Sampling

Djallel Bouneffouf; Romain Laroche; Tanguy Urvoy; Raphael Féraud; Robin Allesiardo

Preprint, Working Paper. Year: 2014

Djallel Bouneffouf

Role: Author
PersonId : 931776

Computer Science Department

Services répartis, Architectures, MOdélisation, Validation, Administration des Réseaux

Centre National de la Recherche Scientifique

Romain Laroche

Role: Author

Orange Labs [Lannion]

Tanguy Urvoy

Role: Author

Orange Labs [Lannion]

Raphael Féraud

Role: Author

Orange Labs [Lannion]

Robin Allesiardo

Role: Author
PersonId : 4381
IdHAL : robin-allesiardo
IdRef : 197869483

Machine Learning and Optimisation

Orange Labs [Lannion]

Abstract

Labelling training examples is a costly task in supervised classification. Active learning strategies address this problem by selecting the most useful unlabelled examples to train a predictive model. The choice of examples to label can be seen as a dilemma between exploration and exploitation over the data space representation. In this paper, a novel active learning strategy manages this trade-off by modelling the active learning problem as a contextual bandit problem. We propose a sequential algorithm named Active Thompson Sampling (ATS), which, in each round, assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for the label of that point. Experimental comparison with previously proposed active learning algorithms shows superior performance on a real application dataset.
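The per-round procedure described in the abstract (assign a distribution over the pool, sample one point, query the oracle) can be illustrated with a minimal Python sketch. This is not the ATS algorithm itself: the abstract does not state how the sampling distribution is computed, so `weight_fn`, `distance_weight`, and the toy one-dimensional pool below are hypothetical placeholders chosen only to make the loop runnable.

```python
import random


def active_sampling_loop(pool, oracle, weight_fn, budget, seed=0):
    """Generic pool-based loop: weight the pool, sample one point, query its label.

    weight_fn(x, labelled) is a hypothetical placeholder returning a
    non-negative weight; it is NOT the ATS update rule from the paper.
    """
    rng = random.Random(seed)
    unlabelled = list(pool)
    labelled = []  # list of (point, label) pairs
    for _ in range(min(budget, len(unlabelled))):
        # 1. Assign a sampling distribution over the remaining pool.
        weights = [weight_fn(x, labelled) for x in unlabelled]
        if sum(weights) <= 0:
            weights = [1.0] * len(unlabelled)  # fall back to uniform
        # 2. Sample one point from that distribution.
        idx = rng.choices(range(len(unlabelled)), weights=weights, k=1)[0]
        x = unlabelled.pop(idx)
        # 3. Query the oracle for the label of the sampled point.
        labelled.append((x, oracle(x)))
    return labelled


def distance_weight(x, labelled):
    """Toy weight (assumption): favour points far from anything already labelled."""
    if not labelled:
        return 1.0
    return min(abs(x - seen) for seen, _ in labelled)


# Toy usage: a 1-D pool and a threshold oracle, both invented for illustration.
pool = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
queried = active_sampling_loop(
    pool, oracle=lambda x: int(x > 0.5), weight_fn=distance_weight, budget=4
)
```

Each queried point is removed from the pool, so the loop never requests the same label twice. A bandit-based strategy such as the one the paper proposes would replace `distance_weight` with a distribution learned from the feedback gathered in earlier rounds.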

Keywords

Contextual bandits, Active learning, Thompson sampling

Domains

Artificial Intelligence [cs.AI]

Main file

Contextual_Bandit_for_Active_Learning.pdf (368.86 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01069802

Submitted on: Monday, 29 September 2014, 19:58:39

Last modified on: Monday, 12 February 2024, 09:48:04

Long-term archiving on: Tuesday, 30 December 2014, 11:50:29

Dates and versions

hal-01069802, version 1 (29-09-2014)

Identifiers

HAL Id: hal-01069802, version 1

Cite

Djallel Bouneffouf, Romain Laroche, Tanguy Urvoy, Raphael Féraud, Robin Allesiardo. Contextual Bandit for Active Learning: Active Thompson Sampling. 2014. ⟨hal-01069802⟩

Collections

INSTITUT-TELECOM EC-PARIS CNRS INRIA TELECOM-SUDPARIS UMR8623 INRIA2 LRI-AO UNIV-PARIS-SACLAY

2790 Views

6813 Downloads