On Bayesian index policies for sequential resource allocation

Emilie Kaufmann

Preprint, Working Paper. Year: 2016

Emilie Kaufmann (1, 2)

Role: Author
PersonId: 10422
IdHAL: emilie-kaufmann
ORCID: 0000-0002-5496-824X
IdRef: 197040810

1 Sequential Learning
2 Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Abstract

This paper studies index policies for minimizing (frequentist) regret in a stochastic multi-armed bandit model, inspired by a Bayesian view of the problem. Our main contribution is to prove the asymptotic optimality of Bayes-UCB, an algorithm based on quantiles of posterior distributions, when the reward distributions belong to a one-dimensional exponential family, for a large class of prior distributions. We also show that the Bayesian literature gives new insight into what kind of exploration rates could be used in frequentist, UCB-type algorithms. Indeed, approximations of the Bayesian optimal solution or the Finite Horizon Gittins indices suggest the introduction of two algorithms, KL-UCB+ and KL-UCB-H+, whose asymptotic optimality is also established.
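To make the idea of an index built from posterior quantiles concrete, here is a minimal Python sketch of a Bayes-UCB-style policy. It is an illustration under stated assumptions, not the paper's exact algorithm: the Gaussian reward model with known variance, the flat prior on each arm's mean, the quantile level 1 - 1/t, and the names bayes_ucb_index and run_bayes_ucb are all choices made here, since the abstract does not specify them.

```python
import math
import random
from statistics import NormalDist


def bayes_ucb_index(mean, pulls, t, sigma=1.0):
    """Posterior quantile of order 1 - 1/t for one Gaussian arm.

    Assumption: flat prior on the arm mean and known variance sigma**2, so
    the posterior after `pulls` observations is N(mean, sigma**2 / pulls).
    """
    level = 1.0 - 1.0 / t  # assumed quantile level; requires t >= 2
    posterior = NormalDist(mu=mean, sigma=sigma / math.sqrt(pulls))
    return posterior.inv_cdf(level)


def run_bayes_ucb(true_means, horizon, sigma=1.0, seed=0):
    """Simulate the index policy and return the number of pulls per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: pull each arm once so every posterior is proper.
            arm = t - 1
        else:
            # Play the arm whose posterior quantile is largest.
            indices = [
                bayes_ucb_index(sums[a] / pulls[a], pulls[a], t, sigma)
                for a in range(k)
            ]
            arm = max(range(k), key=indices.__getitem__)
        reward = rng.gauss(true_means[arm], sigma)
        pulls[arm] += 1
        sums[arm] += reward
    return pulls


if __name__ == "__main__":
    # Two hypothetical arms with means 0.0 and 1.0.
    print(run_bayes_ucb([0.0, 1.0], horizon=2000))
```

The quantile level rises toward 1 as t grows, so the index of a rarely pulled arm keeps increasing until that arm is tried again; this is the sense in which the quantile acts as an exploration rate.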

Keywords

Bayesian algorithms; multi-armed bandits

Domains

Machine Learning [stat.ML]

Main file

Kaufmann16.pdf (483.86 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01251606

Submitted on: Wednesday, January 6, 2016, 14:41:38

Last modified on: Wednesday, January 24, 2024, 09:54:23

Long-term archiving on: Thursday, April 7, 2016, 16:02:36

Dates and versions

hal-01251606, version 1 (06-01-2016)

hal-01251606, version 2 (12-09-2016)

hal-01251606, version 3 (06-11-2017)

Identifiers

HAL Id: hal-01251606, version 1
arXiv: 1601.01190

Cite

Emilie Kaufmann. On Bayesian index policies for sequential resource allocation. 2016. ⟨hal-01251606v1⟩


397 views

257 downloads
