Journal article, Journal of Machine Learning Research, Year: 2023

Adaptation to the Range in $K$-Armed Bandits

Abstract

We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on the range $[m,M]$. We do not assume that the range $[m,M]$ is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which prevents the typical $\ln T$ and $\sqrt{T}$ bounds from being achieved simultaneously. For instance, a $\sqrt{T}$ distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order $\sqrt{T}$. We exhibit a strategy achieving the rates for regret indicated by the new trade-off.
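
For readers less familiar with the terminology, the two kinds of regret bounds contrasted in the abstract can be written out as follows. This is only a sketch in standard bandit notation and is not taken from the record itself: the symbols $R_T$, $\nu$, $\mu_a$, $\mu^\star$, $A_t$, $C(\nu)$ and $c_K$ are assumptions introduced here for illustration, and the constants are left unspecified.

```latex
% Assumed notation: \nu = (\nu_1, ..., \nu_K) is the bandit problem, each \nu_a
% supported on [m, M] with mean \mu_a; A_t is the arm pulled at round t.
\[
  R_T(\nu) \;=\; T \mu^\star - \mathbb{E}\Bigl[\sum_{t=1}^{T} \mu_{A_t}\Bigr],
  \qquad \mu^\star = \max_{1 \le a \le K} \mu_a .
\]
% Distribution-dependent bound: the constant may depend on the problem \nu.
\[
  R_T(\nu) \;\le\; C(\nu) \ln T \quad \text{for all sufficiently large } T .
\]
% Distribution-free bound: uniform over all problems supported on [m, M];
% c_K is an unspecified factor depending on K only.
\[
  \sup_{\nu} R_T(\nu) \;\le\; c_K \,(M - m)\, \sqrt{T} .
\]
```

In this notation, the abstract's claim is that a strategy which does not know $[m,M]$ cannot satisfy both the second and the third display at once.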
Main file
Hadiji-Stoltz--Range-2022-JMLR.pdf (1.31 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02794382, version 1 (05-06-2020)
hal-02794382, version 2 (10-11-2020)
hal-02794382, version 3 (09-06-2022)

License

Attribution

Cite

Hédi Hadiji, Gilles Stoltz. Adaptation to the Range in $K$-Armed Bandits. Journal of Machine Learning Research, 2023, 24 (13), pp.1-33. ⟨hal-02794382v3⟩
277 views
164 downloads
