# Stochastic bandits with vector losses: Minimizing $\ell^\infty$-norm of relative losses

1. Scool, Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
2. SEQUEL - Sequential Learning, Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract: Multi-armed bandits are widely applied in scenarios such as recommender systems, where the goal is to maximize the click rate. In practice, however, additional factors must be considered, e.g., user stickiness, user growth rate, and user experience assessment. In this paper, we model this situation as a K-armed bandit problem with multiple losses. We define the relative loss vector of an arm, whose i-th entry compares the arm with the optimal arm with respect to the i-th loss. We study two goals: (a) finding the arm with the minimum $\ell^\infty$-norm of relative losses at a given confidence level (fixed-confidence best-arm identification); (b) minimizing the $\ell^\infty$-norm of the cumulative relative losses (regret minimization). For goal (a), we derive a problem-dependent sample complexity lower bound and discuss how to achieve a matching algorithm. For goal (b), we provide a regret lower bound of $\Omega(T^{2/3})$ and a matching algorithm.
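The objective in goal (a) can be sketched in a few lines. This is a hypothetical illustration only (not the paper's algorithm): it assumes the mean losses of each arm are known and simply computes each arm's relative loss vector and its $\ell^\infty$-norm; the `mean_losses` values are made up for the example.

```python
# Illustrative sketch of objective (a), assuming mean losses are known.
# mean_losses[k][i] = expected i-th loss of arm k (hypothetical values).
mean_losses = [
    [0.2, 0.5],  # arm 0
    [0.5, 0.1],  # arm 1
    [0.3, 0.2],  # arm 2
]

K = len(mean_losses)
d = len(mean_losses[0])

# Per-loss optimum: the best achievable mean for each individual loss i.
per_loss_opt = [min(row[i] for row in mean_losses) for i in range(d)]

# Relative loss vector of arm k: entry i compares arm k with the arm
# that is optimal for the i-th loss.
def relative_loss(k):
    return [mean_losses[k][i] - per_loss_opt[i] for i in range(d)]

# Goal (a): the arm minimizing the l-infinity norm of its relative losses.
linf = [max(relative_loss(k)) for k in range(K)]
best_arm = min(range(K), key=lambda k: linf[k])
```

Note that the best arm under this criterion (arm 2 above, whose relative losses are balanced) need not be optimal for any single loss; this trade-off is what distinguishes the vector-loss setting from standard best-arm identification.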
Document type: Preprints, Working Papers, ...

Cited literature [46 references]

https://hal.archives-ouvertes.fr/hal-02968536
Contributor: Xuedong Shang
Submitted on: Thursday, October 15, 2020 - 6:36:49 PM
Last modification on: Thursday, January 20, 2022 - 5:28:57 PM

### File

shang2020vector.pdf
Files produced by the author(s)

### Identifiers

• HAL Id: hal-02968536, version 1

### Citation

Xuedong Shang, Han Shao, Jian Qian. Stochastic bandits with vector losses: Minimizing $\ell^\infty$-norm of relative losses. 2020. ⟨hal-02968536⟩
