Stationary Mixing Bandits - HAL open archive
Preprint, Working Paper. Year: 2014

Stationary Mixing Bandits

Abstract

We study the bandit problem in which arms are associated with stationary phi-mixing processes, so that rewards are dependent. The question arising in this setting is how to recover some independence by ignoring the values of some rewards. As we shall see, this bandit problem requires us to address an exploration/exploitation/independence trade-off. To do so, we provide a UCB strategy together with a general regret analysis for the case where the size of the independence blocks (the ignored rewards) is fixed, and we go a step further by providing an algorithm that computes the size of the independence blocks from the data. Finally, we analyze the problem in the restless case, i.e., the situation where the time counters of all mixing processes evolve simultaneously.
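To make the fixed-block idea concrete, here is a minimal sketch of one plausible reading of it; it is not the authors' algorithm. It assumes that each time an arm is selected, its first reward is recorded and the next `block_size` rewards from that same arm are pulled but ignored, so that recorded rewards of an arm are separated by a gap. The function name `blocked_ucb`, the callback `pull`, and the use of the standard UCB1 bonus (in place of whatever confidence term the paper derives for phi-mixing processes) are all assumptions made for illustration.

```python
import math


def blocked_ucb(pull, n_arms, horizon, block_size):
    """Illustrative UCB variant with ignored reward blocks (hypothetical sketch).

    pull(arm) returns the next reward of `arm`. After each recorded reward,
    the same arm is pulled `block_size` more times and those rewards are
    discarded from the estimates.
    """
    counts = [0] * n_arms   # number of recorded (kept) rewards per arm
    sums = [0.0] * n_arms   # sum of recorded rewards per arm
    pulls = []              # arm played at every time step, ignored pulls included
    t = 0
    while t < horizon:
        if 0 in counts:
            # Record at least one reward for every arm first.
            arm = counts.index(0)
        else:
            total = sum(counts)
            # Standard UCB1 index computed on recorded rewards only (assumption).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(total) / counts[a]),
            )
        # Recorded pull: this reward enters the empirical mean.
        reward = pull(arm)
        t += 1
        counts[arm] += 1
        sums[arm] += reward
        pulls.append(arm)
        # Ignored block: advance the arm's process without using the rewards.
        for _ in range(block_size):
            if t >= horizon:
                break
            pull(arm)
            t += 1
            pulls.append(arm)
    return pulls, counts
```

A larger `block_size` makes the recorded rewards closer to independent but spends more pulls on rewards that never inform the estimates, which is the exploration/exploitation/independence trade-off the abstract refers to.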
Main file
mixingbandit-ARXIV.pdf (188.79 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01011112 , version 1 (23-06-2014)

Cite

Julien Audiffren, Liva Ralaivola. Stationary Mixing Bandits. 2014. ⟨hal-01011112⟩