Preprint, Working Paper. Year: 2022

Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm

Debabrota Basu
Odalric-Ambrym Maillard
Timothée Mathieu

Abstract

In this paper, we study the stochastic bandit problem with k unknown reward distributions (arms) that are heavy-tailed and corrupted, where the corruption distributions are time-invariant. At each iteration, the player chooses an arm. Given the arm, the environment returns an uncorrupted reward with probability 1−ε and an arbitrarily corrupted reward with probability ε. In our setting, the uncorrupted reward might be heavy-tailed and the corrupted reward might be unbounded. We prove a lower bound on the regret indicating that corrupted and heavy-tailed bandits are strictly harder than uncorrupted or light-tailed bandits. We observe that the environments can be categorised into hardness regimes depending on the suboptimality gap ∆, variance σ, and corruption proportion ε. Following this, we design a UCB-type algorithm, namely HuberUCB, that leverages Huber's estimator for robust mean estimation. HuberUCB leads to tight upper bounds on regret in the proposed corrupted and heavy-tailed setting. To derive the upper bound, we prove a novel concentration inequality for Huber's estimator, which might be of independent interest.
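The reward model and the robust estimation step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HuberUCB implementation: the function names, the Gaussian clean rewards, the constant outlier value, and the fixed threshold `beta` are all assumptions chosen for illustration, and the sketch covers only the location estimate, not the confidence bonus or arm selection.

```python
import random
import statistics


def sample_corrupted_reward(rng, eps, clean_sampler, corrupt_sampler):
    """Draw one reward: corrupted with probability eps, clean otherwise."""
    if rng.random() < eps:
        return corrupt_sampler(rng)
    return clean_sampler(rng)


def huber_mean(xs, beta, iters=200, tol=1e-10):
    """Huber M-estimate of location with a fixed threshold beta (assumed).

    Solves sum_i psi(x_i - mu) = 0, where psi clips residuals to
    [-beta, beta], by a fixed-point iteration started at the median.
    """
    mu = statistics.median(xs)
    for _ in range(iters):
        # Average of the clipped residuals: the update direction.
        step = sum(max(-beta, min(beta, x - mu)) for x in xs) / len(xs)
        mu += step
        if abs(step) < tol:
            break
    return mu


# Hypothetical arm: clean rewards are N(1, 1); corrupted rewards are a
# huge constant outlier. Both choices are illustrative assumptions.
rng = random.Random(0)
eps = 0.05
rewards = [
    sample_corrupted_reward(
        rng, eps,
        clean_sampler=lambda r: r.gauss(1.0, 1.0),
        corrupt_sampler=lambda r: 1e6,
    )
    for _ in range(2000)
]
print("empirical mean:", statistics.fmean(rewards))
print("Huber estimate:", huber_mean(rewards, beta=2.0))
```

In this sketch the empirical mean is dominated by the outliers, while the Huber estimate stays close to the clean mean because each corrupted sample contributes at most `beta` to the update.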
Main file
main.pdf (3.18 MB)
article.zip (2.85 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03611816, version 1 (17-03-2022)

Identifiers

  • HAL Id: hal-03611816, version 1

Cite

Debabrota Basu, Odalric-Ambrym Maillard, Timothée Mathieu. Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm. 2022. ⟨hal-03611816⟩
