Adaptation to the Range in $K$-Armed Bandits

Hédi Hadiji 1, 2 Gilles Stoltz 1, 2
2 CELESTE - Statistique mathématique et apprentissage
Inria Saclay - Ile de France, LMO - Laboratoire de Mathématiques d'Orsay
Abstract : We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on the range $[m,M]$. We do not assume that the range $[m,M]$ is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which prevents simultaneously achieving the typical $\ln T$ and $\sqrt{T}$ bounds. For instance, a $\sqrt{T}$ distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order $\sqrt{T}$. We exhibit a strategy achieving the rates for regret indicated by the new trade-off.
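For reference, the regret mentioned in the abstract can be written as below. This is the standard definition from the bandit literature rather than a formula quoted from the paper, and the notation ($\mu_a$ for the mean of arm $a$, $\mu^\star$ for the largest mean, $A_t$ for the arm pulled at round $t$, $R_T$ for the regret after $T$ rounds) is assumed here for illustration.

$$
R_T \;=\; T\,\mu^\star \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} \mu_{A_t}\right],
\qquad
\mu^\star \;=\; \max_{a \in \{1,\dots,K\}} \mu_a .
$$

Under this reading, a distribution-dependent bound controls $R_T$ for each fixed bandit problem as $T$ grows (the typical $\ln T$ rate), while a distribution-free bound controls the supremum of $R_T$ over all bandit problems with arms supported on $[m,M]$ (the typical $\sqrt{T}$ rate).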
Document type :
Preprints, Working Papers, ...

https://hal.archives-ouvertes.fr/hal-02794382
Contributor : Gilles Stoltz
Submitted on : Tuesday, November 10, 2020 - 11:17:23 AM
Last modification on : Saturday, November 14, 2020 - 3:33:35 AM

Files

Hadiji-Stoltz--Range-2020.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02794382, version 2
  • ARXIV : 2006.03378

Citation

Hédi Hadiji, Gilles Stoltz. Adaptation to the Range in $K$-Armed Bandits. 2020. ⟨hal-02794382v2⟩
