Pure Exploration for Multi-Armed Bandit Problems - HAL open archive
Preprint, Working Paper Year: 2008

Pure Exploration for Multi-Armed Bandit Problems

Abstract

We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that explore the arms sequentially. The strategies are assessed not in terms of their cumulative regrets, as is usually the case, but through quantities referred to as simple regrets. The latter are related to the (expected) gains of the decisions that the strategies would recommend for a new one-shot instance of the same multi-armed bandit problem. Here, exploration is constrained only by the number of available rounds (not necessarily known in advance), in contrast to the case when cumulative regrets are considered and exploitation needs to be performed at the same time. We start by indicating the links between simple and cumulative regrets. A small cumulative regret entails a small simple regret, but too small a cumulative regret prevents the simple regret from decreasing exponentially towards zero, its optimal distribution-dependent rate. We therefore introduce specific strategies, for which we prove both distribution-dependent and distribution-free bounds. A concluding experimental study puts these theoretical bounds in perspective and shows the value of non-uniform exploration of the arms.
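
The distinction between the two regrets can be illustrated with a minimal simulation sketch. It is not taken from the paper: it assumes Bernoulli arms, a plain round-robin (uniform) allocation, and a recommendation of the arm with the highest empirical mean, and all function names are hypothetical. The cumulative quantity is computed from the true means (a pseudo-regret), and the simple regret is the gap between the best mean and the mean of the recommended arm.

```python
import random


def pull(mean, rng):
    """Draw a Bernoulli reward with the given mean (assumed reward model)."""
    return 1.0 if rng.random() < mean else 0.0


def uniform_exploration(means, n_rounds, rng):
    """Pull arms round-robin, then recommend the empirically best arm.

    Returns (simple_regret, cumulative_regret) for one run. Both are
    computed from the true means, which only a simulator has access to.
    """
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    cumulative_regret = 0.0
    for t in range(n_rounds):
        arm = t % k  # uniform allocation: cycle through the arms
        sums[arm] += pull(means[arm], rng)
        counts[arm] += 1
        cumulative_regret += best - means[arm]
    # Recommendation for a new one-shot instance: highest empirical mean.
    empirical = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(k)]
    recommended = max(range(k), key=lambda i: empirical[i])
    simple_regret = best - means[recommended]
    return simple_regret, cumulative_regret


def average_simple_regret(means, n_rounds, n_runs=2000, seed=0):
    """Monte Carlo estimate of the expected simple regret."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        sr, _ = uniform_exploration(means, n_rounds, rng)
        total += sr
    return total / n_runs


if __name__ == "__main__":
    means = [0.5, 0.4, 0.3]  # illustrative arm means, not from the paper
    for n in (30, 300):
        print(n, average_simple_regret(means, n))
```

Under uniform allocation the cumulative regret grows linearly with the number of rounds, while the estimated simple regret shrinks as the budget increases, which is the trade-off the abstract describes. The strategies actually analysed in the paper may differ from this baseline.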
Main file
PureExplo-HAL.pdf (139.83 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00257454, version 1 (19-02-2008)
hal-00257454, version 2 (12-06-2008)
hal-00257454, version 3 (16-06-2008)
hal-00257454, version 4 (19-02-2009)
hal-00257454, version 5 (26-01-2010)
hal-00257454, version 6 (08-06-2010)

Cite

Sébastien Bubeck, Rémi Munos, Gilles Stoltz. Pure Exploration for Multi-Armed Bandit Problems. 2008. ⟨hal-00257454v3⟩
