Multi-Bandit Best Arm Identification

Victor Gabillon 1 Mohammad Ghavamzadeh 1 Alessandro Lazaric 1 Sébastien Bubeck 2 
1 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
Abstract : We study the problem of identifying the best arm in each of the bandits in a multi-bandit multi-armed setting. We first propose an algorithm called Gap-based Exploration (GapE) that focuses on the arms whose mean is close to the mean of the best arm in the same bandit (i.e., arms with a small gap). We then introduce an algorithm, called GapE-V, which takes into account the variance of the arms in addition to their gap. We prove an upper bound on the probability of error for both algorithms. Since GapE and GapE-V require tuning an exploration parameter that depends on the complexity of the problem, which is often unknown in advance, we also introduce variations of these algorithms that estimate this complexity online. Finally, we evaluate the performance of these algorithms and compare them to other allocation strategies on a number of synthetic problems.
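
To make the gap-based idea in the abstract concrete, here is a minimal Python sketch of an allocation strategy that favours arms with a small empirical gap. It is an illustration under stated assumptions, not the authors' implementation: the index form (negative empirical gap plus a bonus `sqrt(a / pulls)`), the function names `empirical_gaps` and `gap_based_exploration`, and the `pull(m, k)` callback are all hypothetical choices made for this example, and the abstract itself does not specify them. The sketch also assumes every bandit has at least two arms and that the budget covers one initial pull per arm.

```python
import math


def empirical_gaps(means):
    """Return each arm's gap: distance to the best *other* arm's mean."""
    gaps = []
    for k, mu in enumerate(means):
        best_other = max(v for j, v in enumerate(means) if j != k)
        gaps.append(abs(best_other - mu))
    return gaps


def gap_based_exploration(pull, arms_per_bandit, budget, a):
    """Spend `budget` pulls across all bandits, then return the
    empirically best arm of each bandit.

    pull(m, k)      -- hypothetical callback returning a reward for arm k of bandit m
    arms_per_bandit -- list with the number of arms of each bandit (each >= 2)
    a               -- exploration parameter (assumed non-negative)
    """
    sums = [[0.0] * n for n in arms_per_bandit]
    counts = [[0] * n for n in arms_per_bandit]

    def means_of(m):
        return [s / c for s, c in zip(sums[m], counts[m])]

    def record(m, k):
        sums[m][k] += pull(m, k)
        counts[m][k] += 1

    # Initialisation: pull every arm once so all empirical means exist.
    for m, n in enumerate(arms_per_bandit):
        for k in range(n):
            record(m, k)
    used = sum(arms_per_bandit)

    # Main loop: pull the arm with the largest index over all bandits.
    # Assumed index: small gap and few pulls both raise the priority.
    for _ in range(budget - used):
        best_index, choice = -math.inf, None
        for m, n in enumerate(arms_per_bandit):
            gaps = empirical_gaps(means_of(m))
            for k in range(n):
                index = -gaps[k] + math.sqrt(a / counts[m][k])
                if index > best_index:
                    best_index, choice = index, (m, k)
        record(*choice)

    # Recommendation: the arm with the highest empirical mean in each bandit.
    result = []
    for m in range(len(arms_per_bandit)):
        means = means_of(m)
        result.append(max(range(len(means)), key=means.__getitem__))
    return result
```

In this sketch a single budget is shared by all bandits, so pulls naturally concentrate on the bandits whose leading arms are hardest to separate. The parameter `a` plays the role of the exploration parameter mentioned in the abstract; how to set it, and how the variance-aware and adaptive variants differ, is left to the paper.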

Cited literature : 13 references
Contributor : Victor Gabillon
Submitted on : Saturday, November 19, 2011 - 3:11:40 PM
Last modification on : Thursday, January 20, 2022 - 4:16:24 PM
Long-term archiving on : Monday, February 20, 2012 - 2:20:57 AM


  • HAL Id : hal-00632523, version 3



Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, Sébastien Bubeck. Multi-Bandit Best Arm Identification. 2011. ⟨hal-00632523v3⟩


