Adaptation to the Range in $K$-Armed Bandits

Hédi Hadiji; Gilles Stoltz

Journal article, Journal of Machine Learning Research, 2023


Hédi Hadiji

Function : Author
PersonId : 175390
IdHAL : hedi-hadiji
ORCID : 0000-0001-8936-5054
IdRef : 252779525

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Gilles Stoltz

Function : Author
PersonId : 738739
IdHAL : gilles-stoltz
ORCID : 0000-0003-1240-1007
IdRef : 091575419

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Abstract

We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on the range $[m,M]$. We do not assume that the range $[m,M]$ is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which prevents simultaneously achieving the typical $\ln T$ and $\sqrt{T}$ bounds. For instance, a $\sqrt{T}$ distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order $\sqrt{T}$. We exhibit a strategy achieving the rates for regret indicated by the new trade-off.
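To make the setting concrete, the following minimal Python sketch simulates a $K$-armed bandit whose rewards lie in an unknown range and measures the pseudo-regret of a player. Everything in it is an illustrative assumption: the uniform reward distributions, the function names `pseudo_regret` and `run_follow_the_leader`, and the naive follow-the-leader player. It is not the strategy exhibited in the paper and carries none of its guarantees; it only shows what "regret without knowing $[m,M]$" refers to.

```python
import random


def pseudo_regret(means, pulls):
    """Sum over rounds of (best arm mean - mean of the arm actually pulled)."""
    best = max(means)
    return sum(best - means[a] for a in pulls)


def run_follow_the_leader(arms, horizon, seed=0):
    """Hypothetical baseline player, NOT the paper's strategy.

    `arms` is a list of (low, high) pairs; arm k yields Uniform(low, high)
    rewards. The player pulls each arm once, then always pulls the arm with
    the highest empirical mean. It never uses the overall range [m, M].
    """
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k
    sums = [0.0] * k
    pulls = []
    for t in range(horizon):
        if t < k:
            a = t  # initial round-robin: one pull per arm
        else:
            a = max(range(k), key=lambda i: sums[i] / counts[i])
        low, high = arms[a]
        reward = rng.uniform(low, high)
        counts[a] += 1
        sums[a] += reward
        pulls.append(a)
    return pulls


# Illustrative instance: here m = -3 and M = 4, but the player is not told so.
arms = [(-3.0, 1.0), (0.0, 4.0), (-1.0, 2.0)]
means = [(low + high) / 2 for low, high in arms]
pulls = run_follow_the_leader(arms, horizon=1000)
regret = pseudo_regret(means, pulls)
```

Rescaling every reward by a constant factor rescales the regret by the same factor, which is why regret bounds for bounded bandits scale with $M - m$ and why not knowing that range is a genuine handicap.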

Keywords

Multiarmed bandits, Adversarial learning, Cumulative regret, Information-theoretic proof techniques

Domains

Machine Learning [stat.ML], Statistics [math.ST]

Main file

Hadiji-Stoltz--Range-2022-JMLR.pdf (1.31 MB)

Origin : Files produced by the author(s)

Contributor : Gilles Stoltz

https://hal.science/hal-02794382

Submitted on : Thursday, June 9, 2022, 7:58:01 AM

Last modification on : Friday, April 26, 2024, 1:07:15 PM

Dates and versions

hal-02794382 , version 1 (05-06-2020)

hal-02794382 , version 2 (10-11-2020)

hal-02794382 , version 3 (09-06-2022)

Licence

Attribution

Identifiers

HAL Id : hal-02794382 , version 3
ARXIV : 2006.03378

Cite

Hédi Hadiji, Gilles Stoltz. Adaptation to the Range in $K$-Armed Bandits. Journal of Machine Learning Research, 2023, 24 (13), pp.1-33. ⟨hal-02794382v3⟩


Collections

CNRS INRIA INSMI LM-ORSAY INRIA2 UNIV-PARIS-SACLAY GS-MATHEMATIQUES GS-COMPUTER-SCIENCE
