
Infinitely many-armed bandits

Yizao Wang (1), Jean-Yves Audibert (2, 3, 4), Rémi Munos (5)

2. IMAGINE (Marne-la-Vallée): CSTB - Centre Scientifique et Technique du Bâtiment; ENPC - École des Ponts ParisTech; LIGM - Laboratoire d'Informatique Gaspard-Monge
4. SIERRA - Statistical Machine Learning and Parsimony: DI-ENS - Département d'informatique de l'École normale supérieure; ENS Paris - École normale supérieure - Paris; Inria Paris-Rocquencourt; CNRS - Centre National de la Recherche Scientifique: UMR 8548
5. SEQUEL - Sequential Learning: LIFL - Laboratoire d'Informatique Fondamentale de Lille; LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal; Inria Lille - Nord Europe
Abstract: We consider multi-armed bandit problems in which the number of arms is larger than the possible number of experiments. We make a stochastic assumption on the mean reward of each newly selected arm, which characterizes its probability of being a near-optimal arm; this assumption is weaker than in previous works. We describe algorithms based on upper confidence bounds applied to a restricted set of randomly selected arms, and provide upper bounds on the resulting expected regret. We also derive a lower bound which, in some cases, matches the upper bound up to a logarithmic factor.
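The strategy described in the abstract (draw a finite subset of arms at random from the infinite pool, then run an upper-confidence-bound policy on that subset) can be illustrated with a minimal sketch. This is not the authors' exact algorithm or bounds; it assumes arm means drawn uniformly on [0, 1] and Bernoulli rewards as a stand-in for the paper's stochastic assumption, and uses the classic UCB1 index:

```python
import math
import random

def ucb_on_random_subset(n, K, rng):
    """Run UCB1 on K arms drawn at random from an infinite reservoir.

    Illustrative sketch only: each drawn arm's mean reward is uniform on
    [0, 1] and rewards are Bernoulli. Returns the expected regret measured
    against the best possible mean reward, 1.
    """
    means = [rng.random() for _ in range(K)]   # hidden mean of each drawn arm
    counts = [0] * K                           # number of pulls per arm
    sums = [0.0] * K                           # cumulative reward per arm
    collected_mean = 0.0                       # sum of means of pulled arms

    for t in range(1, n + 1):
        if t <= K:
            arm = t - 1                        # initialize: pull each arm once
        else:
            # UCB1 index: empirical mean + exploration bonus
            arm = max(range(K),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        collected_mean += means[arm]

    return n - collected_mean                  # regret w.r.t. optimal mean 1

rng = random.Random(0)
regret = ucb_on_random_subset(n=10_000, K=30, rng=rng)
```

In the paper's regime the choice of K (how many arms to sample) trades off against the horizon n: too few arms and the best sampled arm may be far from optimal, too many and exploration cost dominates.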
Contributor: Rémi Munos
Submitted on: Tuesday, June 4, 2013 - 3:19:36 PM
Last modification on: Wednesday, February 26, 2020 - 7:06:12 PM
Document(s) archived on: Thursday, September 5, 2013 - 4:23:06 AM
HAL Id: hal-00830178, version 1


Yizao Wang, Jean-Yves Audibert, Rémi Munos. Infinitely many-armed bandits. Advances in Neural Information Processing Systems, 2008, Canada. ⟨hal-00830178⟩


