Open Problem: Risk of Ruin in Multiarmed Bandits

We formalize a class of problems called survival multiarmed bandits (S-MAB), a modified version of budgeted multiarmed bandits (B-MAB) in which a true risk of ruin must be considered, bringing it closer to risk-averse multiarmed bandits (RA-MAB). In an S-MAB, pulling an arm can yield either a positive or a negative reward. The agent starts with an initial budget that evolves over time with the rewards received. The goal is to find a good exploration-exploitation-safety trade-off, maximizing rewards while minimizing the probability of ruin (i.e., of the budget becoming negative). This simple and until now neglected modification of the MAB statement changes how the problem must be approached, calling for adapted algorithms and specific analytical tools, and also brings it closer to some important real-world applications. We are interested in the following open problems, which stem from this new MAB definition: (a) how can regret be meaningfully and formally defined for an S-MAB, given its multiobjective optimization nature? (b) can an S-MAB be reduced to an RA-MAB or a B-MAB, transferring their theoretical guarantees? (c) what kind of method or strategy must an agent follow to solve an S-MAB optimally?
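To make the setting concrete, the following is a minimal simulation sketch of an S-MAB episode, not a method proposed by the authors. It assumes rewards of +1 or -1 drawn with a fixed success probability per arm, and it uses a plain epsilon-greedy policy purely as a placeholder; the names `run_episode`, `estimate_ruin`, `success_probs`, and `epsilon` are illustrative assumptions. Ruin is taken to occur as soon as the budget becomes negative, as in the description above.

```python
import random


def run_episode(success_probs, initial_budget, horizon, epsilon, rng):
    """Simulate one S-MAB episode; return (total_reward, ruined)."""
    n_arms = len(success_probs)
    counts = [0] * n_arms    # number of pulls per arm
    means = [0.0] * n_arms   # running mean reward per arm
    budget = initial_budget
    total = 0
    for _ in range(horizon):
        # Placeholder policy (assumption): epsilon-greedy on empirical means.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: means[a])
        # Assumed reward model: +1 with probability p, otherwise -1.
        reward = 1 if rng.random() < success_probs[arm] else -1
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        budget += reward
        total += reward
        if budget < 0:       # ruin: the episode stops immediately
            return total, True
    return total, False


def estimate_ruin(success_probs, initial_budget, horizon,
                  epsilon=0.1, episodes=2000, seed=0):
    """Monte Carlo estimate of (ruin frequency, mean total reward)."""
    rng = random.Random(seed)
    results = [run_episode(success_probs, initial_budget, horizon, epsilon, rng)
               for _ in range(episodes)]
    ruin_rate = sum(1 for _, ruined in results if ruined) / episodes
    mean_reward = sum(total for total, _ in results) / episodes
    return ruin_rate, mean_reward
```

Calling, for example, `estimate_ruin([0.45, 0.55], initial_budget=5, horizon=200)` returns an empirical ruin frequency together with the mean cumulative reward, the two quantities that the multiobjective regret question in (a) would have to combine.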

Keywords

Budgeted Multiarmed Bandits; Ruin Theory; Risk-Averse Decision Making

Domains

Artificial Intelligence [cs.AI]

Main file

2019___COLT__CR____Risk_of_Ruin_in_Multiarmed_Bandits.pdf (146.34 KB)

Origin: Files produced by the author(s)

Contributor: Laurent Vercouter

https://hal.science/hal-02363609

Submitted on: Wednesday, March 17, 2021, 14:12:22

Last modified on: Wednesday, March 20, 2024, 16:26:03

Dates and versions

hal-02363609, version 1 (17-03-2021)

Identifiers

HAL Id: hal-02363609, version 1

Cite

Filipo Studzinski Perotto, Mathieu Bourgais, Laurent Vercouter, Bruno Castro da Silva. Open Problem: Risk of Ruin in Multiarmed Bandits. Conference on Learning Theory (COLT), Jun 2019, Phoenix, United States. ⟨hal-02363609⟩

Collections

CNRS INSA-ROUEN LITIS COMUE-NORMANDIE IDEES UNIROUEN UNILEHAVRE UNICAEN IRIHS INSA-GROUPE
