Memory Bandits: a Bayesian approach for the Switching Bandit Problem

Réda Alami; Odalric Maillard; Raphael Féraud

Conference paper, Year: 2017


Réda Alami

Role: Author

Orange Labs [Lannion]

Odalric Maillard

Role: Author

Sequential Learning

Raphael Féraud

Role: Author

Orange Labs [Lannion]

Abstract

Thompson Sampling exhibits excellent results in practice and has been shown to be asymptotically optimal. The extension of the Thompson Sampling algorithm to the Switching Multi-Armed Bandit problem proposed in [13] is Thompson Sampling equipped with a Bayesian online change point detector [1]. In this paper, we propose another extension of this approach, based on a Bayesian aggregation framework. Experiments provide some evidence that, in practice, the proposed algorithm compares favorably with the previous version of Thompson Sampling for the Switching Multi-Armed Bandit problem, while clearly outperforming other state-of-the-art algorithms.
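For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of standard Beta-Bernoulli Thompson Sampling on a stationary bandit. It is not the Memory Bandits algorithm and not the method of [13]. The function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the Bernoulli reward model are assumptions made for illustration only.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson Sampling on a stationary bandit (illustrative sketch)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # number of rewards equal to 1 observed per arm
    failures = [0] * n_arms   # number of rewards equal to 0 observed per arm
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior
        # (assumed uniform Beta(1, 1) prior on every arm).
        samples = [
            rng.betavariate(successes[k] + 1, failures[k] + 1)
            for k in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda k: samples[k])
        # Simulated Bernoulli reward from the (hidden) true mean of that arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward


pulls, total_reward = thompson_sampling([0.1, 0.9], horizon=2000)
```

In the switching setting, the arm means change over time, so posteriors accumulated this way can become stale; handling that is the role of the change point detection and aggregation mechanisms the abstract refers to, which this sketch does not attempt to reproduce.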

Domains

Statistics [stat], Machine Learning [stat.ML]

Main file

MemoryBandits_FinalVersion.pdf (935.43 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01811697

Submitted on: Wednesday, June 13, 2018, 13:02:06

Last modified on: Friday, March 24, 2023, 14:53:07

Long-term archiving on: Friday, September 14, 2018, 19:55:31

Dates and versions

hal-01811697, version 1 (13-06-2018)

Identifiers

HAL Id: hal-01811697, version 1

Cite

Réda Alami, Odalric Maillard, Raphael Féraud. Memory Bandits: a Bayesian approach for the Switching Bandit Problem. NIPS 2017 - 31st Conference on Neural Information Processing Systems, Dec 2017, Long Beach, United States. ⟨hal-01811697⟩

Collections

UNIV-LILLE3 CNRS INRIA LAGIS INRIA2

541 views

670 downloads
