Adaptation to the Range in $K$-Armed Bandits

Hédi Hadiji; Gilles Stoltz

Preprint, Working Paper. Year: 2020


Hédi Hadiji

Role: Author
PersonId: 175390
IdHAL: hedi-hadiji
ORCID: 0000-0001-8936-5054
IdRef: 252779525

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Gilles Stoltz

Role: Author
PersonId: 738739
IdHAL: gilles-stoltz
ORCID: 0000-0003-1240-1007
IdRef: 091575419

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Abstract

We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on the range $[m,M]$. We do not assume that the range $[m,M]$ is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which prevents simultaneously achieving the typical $\ln T$ and $\sqrt{T}$ bounds. For instance, a $\sqrt{T}$ distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order $\sqrt{T}$. We exhibit a strategy achieving the rates for regret indicated by the new trade-off.

Domains

Machine Learning [stat.ML]; Statistics [math.ST]

Main file

Hadiji-Stoltz--Range-2020.pdf (2.21 MB)

Origin: Files produced by the author(s)

Contributor: Gilles Stoltz

https://hal.science/hal-02794382

Submitted on: Friday, June 5, 2020, 11:10:52

Last modified on: Tuesday, March 12, 2024, 07:10:03

Dates and versions

hal-02794382, version 1 (2020-06-05)

hal-02794382, version 2 (2020-11-10)

hal-02794382, version 3 (2022-06-09)

Identifiers

HAL Id: hal-02794382, version 1
arXiv: 2006.03378

Cite

Hédi Hadiji, Gilles Stoltz. Adaptation to the Range in $K$-Armed Bandits. 2020. ⟨hal-02794382v1⟩

