Regret lower bounds and extended Upper Confidence Bounds policies in stochastic multi-armed bandit problem

Antoine Salomon (1, *), Jean-Yves Audibert (1, 2, 3), Issam El Alaoui (1)
* Corresponding author
1 IMAGINE [Marne-la-Vallée]
LIGM - Laboratoire d'Informatique Gaspard-Monge, CSTB - Centre Scientifique et Technique du Bâtiment, ENPC - École des Ponts ParisTech
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: This paper is devoted to regret lower bounds in the classical model of the stochastic multi-armed bandit. A well-known result of Lai and Robbins, later extended by Burnetas and Katehakis, established a logarithmic lower bound on the regret of all consistent policies. We relax the notion of consistency and exhibit a generalisation of this logarithmic bound. We also show that no logarithmic bound exists in the general case of Hannan consistency. To obtain these results, we study variants of the popular Upper Confidence Bounds (UCB) policies. As a by-product, we prove that it is impossible to design an adaptive policy that would select the best of two algorithms by taking advantage of the properties of the environment.
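For readers unfamiliar with the UCB family mentioned in the abstract, the following is a minimal sketch of a textbook UCB1-style index policy on Bernoulli arms. It is not the specific variant studied in the paper. The names ucb_pull_counts and pseudo_regret, the parameter rho (exploration weight, with rho = 2 giving the usual UCB1 index), the Bernoulli reward model, and the fixed seed are all assumptions made for illustration.

```python
import math
import random


def ucb_pull_counts(arm_means, horizon, rho=2.0, seed=0):
    """Simulate a UCB1-style policy on Bernoulli arms (illustrative sketch).

    arm_means: hypothetical success probabilities, one per arm.
    rho: assumed exploration weight; rho = 2 gives the classic UCB1 index.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(rho * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


def pseudo_regret(arm_means, counts):
    """Sum over arms of (gap to the best mean) * (number of pulls)."""
    best = max(arm_means)
    return sum((best - m) * c for m, c in zip(arm_means, counts))


if __name__ == "__main__":
    means = [0.9, 0.1]
    pulls = ucb_pull_counts(means, horizon=2000)
    print(pulls, pseudo_regret(means, pulls))
```

In this sketch the regret is driven entirely by the pulls of the suboptimal arm, which is the quantity that logarithmic lower bounds of the Lai and Robbins type constrain.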
Document type:
Preprint, working paper
2011

https://hal.archives-ouvertes.fr/hal-00652865
Contributor: Antoine Salomon
Submitted on: Friday, December 16, 2011 - 14:48:03
Last modified on: Wednesday, January 30, 2019 - 11:08:31
Document(s) archived on: Saturday, March 17, 2012 - 02:38:02

Files

consistence.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00652865, version 1
  • arXiv: 1112.3827

Citation

Antoine Salomon, Jean-Yves Audibert, Issam El Alaoui. Regret lower bounds and extended Upper Confidence Bounds policies in stochastic multi-armed bandit problem. 2011. 〈hal-00652865〉
