A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players

Etienne Boursier; Emilie Kaufmann; Abbas Mehrabian; Vianney Perchet

Conference paper, Year: 2020

Etienne Boursier

Role: Author
PersonId: 175818
IdHAL: etienne-boursier
ORCID: 0000-0002-7575-8575

Ecole Normale Supérieure Paris-Saclay

Emilie Kaufmann

Role: Author
PersonId: 10422
IdHAL: emilie-kaufmann
ORCID: 0000-0002-5496-824X
IdRef: 197040810

Centre National de la Recherche Scientifique

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Sequential Learning

Scool

Abbas Mehrabian

Role: Author

McGill University = Université McGill [Montréal, Canada]

Vianney Perchet

Role: Author
PersonId: 871881

Criteo [Paris]

École Nationale de la Statistique et de l'Administration Économique

Abstract

We study a multiplayer stochastic multi-armed bandit problem in which players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider the challenging heterogeneous setting, in which an arm's mean may differ from player to player, and propose a new and efficient algorithm that combines two ideas: leveraging forced collisions for implicit communication and performing matching eliminations. We present a finite-time analysis of our algorithm, giving the first sublinear minimax regret bound for this problem, and prove that if the optimal assignment of players to arms is unique, our algorithm attains the optimal O(ln(T)) regret, solving an open question raised at NeurIPS 2018 by Bistritz and Leshem (2018).
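The setting described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only models the collision rule and computes, by brute force, the optimal assignment of players to arms against which regret is measured. The mean matrix `MU`, the Bernoulli reward model, and the helper names `optimal_matching` and `play_round` are assumptions made for illustration.

```python
import itertools
import random


def optimal_matching(mu):
    """Brute-force the assignment of players to distinct arms
    that maximizes the sum of the players' mean rewards."""
    n_players, n_arms = len(mu), len(mu[0])
    best = max(
        itertools.permutations(range(n_arms), n_players),
        key=lambda arms: sum(mu[m][k] for m, k in enumerate(arms)),
    )
    value = sum(mu[m][k] for m, k in enumerate(best))
    return best, value


def play_round(mu, pulls, rng):
    """One round of play: pulls[m] is the arm chosen by player m.
    A player that collides gets 0; otherwise the reward is drawn
    from Bernoulli(mu[m][k]) (the Bernoulli model is an assumption)."""
    counts = {}
    for k in pulls:
        counts[k] = counts.get(k, 0) + 1
    return [
        0 if counts[k] > 1 else int(rng.random() < mu[m][k])
        for m, k in enumerate(pulls)
    ]


# Hypothetical heterogeneous means: rows are players, columns are arms.
MU = [
    [0.9, 0.8, 0.1],
    [0.9, 0.2, 0.3],
]

best_arms, best_value = optimal_matching(MU)
print(best_arms, round(best_value, 2))

# Both players pulling arm 0 collide and receive zero reward.
print(play_round(MU, [0, 0], random.Random(0)))
```

In this toy instance both players prefer arm 0, yet the best assignment sends player 0 to arm 1 and player 1 to arm 0. This is why the heterogeneous setting calls for reasoning over matchings rather than over individual arms. Brute force is only viable for tiny instances; a polynomial-time assignment solver would be the usual choice at scale.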

Keywords

Distributed learning; Multi-armed bandit; Combinatorial semi-bandits

Domains

Machine Learning [stat.ML]

Main file

aistats20.pdf (577.98 KB)

Origin: Files produced by the author(s)

Contributor: Etienne Boursier

https://hal.science/hal-02006069

Submitted on: Tuesday, March 3, 2020, 10:55:21

Last modified on: Wednesday, January 24, 2024, 09:54:23

Long-term archiving on: Thursday, June 4, 2020, 14:02:21

Dates and versions

hal-02006069, version 1 (2019-02-04)

hal-02006069, version 2 (2019-06-05)

hal-02006069, version 3 (2020-03-03)

Identifiers

HAL Id: hal-02006069, version 3
arXiv: 1902.01239

Cite

Etienne Boursier, Emilie Kaufmann, Abbas Mehrabian, Vianney Perchet. A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players. AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo, Italy. ⟨hal-02006069v3⟩


Collections

GENES CNRS INRIA ENS-CACHAN ENSAE CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE IP_PARIS CRISTAL-SCOOL ANR ENS-PARIS-SACLAY

384 views

374 downloads
