Memory Bandits: a Bayesian approach for the Switching Bandit Problem

Réda Alami 1, Odalric Maillard 2, Raphael Féraud 1
2 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal, Inria Lille - Nord Europe
Abstract : Thompson Sampling exhibits excellent results in practice and has been shown to be asymptotically optimal. The extension of the Thompson Sampling algorithm to the Switching Multi-Armed Bandit problem proposed in [13] is a Thompson Sampling equipped with a Bayesian online change point detector [1]. In this paper, we propose another extension of this approach based on a Bayesian aggregation framework. Experiments provide some evidence that, in practice, the proposed algorithm compares favorably with the previous version of Thompson Sampling for the Switching Multi-Armed Bandit problem, while clearly outperforming other state-of-the-art algorithms.
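For readers unfamiliar with the base algorithm named in the abstract, here is a minimal sketch of standard Thompson Sampling on a stationary Bernoulli bandit. It is not the Memory Bandits algorithm of the paper and contains no change point detection or aggregation; the function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the example arm means are assumptions made for illustration only.

```python
import random


def thompson_sampling(arm_means, horizon, seed=0):
    """Beta-Bernoulli Thompson Sampling on a stationary bandit (illustrative sketch).

    arm_means: hypothetical true success probabilities, unknown to the learner.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    successes = [0] * n_arms  # rewards equal to 1 observed per arm
    failures = [0] * n_arms   # rewards equal to 0 observed per arm
    pulls = [0] * n_arms

    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta(1 + s, 1 + f) posterior
        # (assumed uniform prior on each arm's mean).
        samples = [
            rng.betavariate(1 + successes[k], 1 + failures[k])
            for k in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda k: samples[k])
        # Simulate a Bernoulli reward from the environment.
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.1, 0.9], horizon=2000))
```

In a switching environment the arm means change at unknown times, so the posterior counts accumulated above become stale after a change; this is the difficulty that the change point detector of [13] and the aggregation framework proposed here are meant to address.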
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01811697
Contributor : Reda Alami
Submitted on : Wednesday, June 13, 2018 - 1:02:06 PM
Last modification on : Thursday, February 21, 2019 - 10:52:49 AM
Long-term archiving on : Friday, September 14, 2018 - 7:55:31 PM

File

MemoryBandits_FinalVersion.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01811697, version 1

Citation

Réda Alami, Odalric Maillard, Raphael Féraud. Memory Bandits: a Bayesian approach for the Switching Bandit Problem. NIPS 2017 - 31st Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States. ⟨hal-01811697⟩
