Covariance-adapting algorithm for semi-bandits with application to sparse outcomes

Pierre Perrault; Vianney Perchet; Michal Valko

Conference paper, Year: 2020

Pierre Perrault

Role: Author
PersonId : 1073476

Ecole Normale Supérieure Paris-Saclay

Sequential Learning

Scool

Vianney Perchet

Role: Author
PersonId : 871881

Ecole Nationale de la Statistique et de l'Analyse Economique

Michal Valko

Role: Author
PersonId : 284
IdHAL : michal
IdRef : 22360934X

DeepMind [Paris]

Abstract

We investigate stochastic combinatorial semi-bandits, where the entire joint distribution of outcomes impacts the complexity of the problem instance (unlike in standard bandits). The distributions typically considered depend on specific parameter values whose prior knowledge is required in theory but which are quite difficult to estimate in practice; an example is the commonly assumed sub-Gaussian family. We alleviate this issue by instead considering a new general family of sub-exponential distributions, which contains bounded and Gaussian ones. We prove a new lower bound on the regret for this family that is parameterized by the unknown covariance matrix, a tighter quantity than the sub-Gaussian matrix. We then construct an algorithm that uses covariance estimates and provide a tight asymptotic analysis of its regret. Finally, we apply and extend our results to the family of sparse outcomes, which has applications in many recommender systems.
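
As a rough illustration of why a covariance can be a tighter quantity than a sub-Gaussian parameter, the Python sketch below compares the two for sparse binary outcomes. It is not taken from the paper: the independent Bernoulli outcome model, the activation probability `p = 0.05`, and the helper names are all assumptions made for this example. The only external fact used is Hoeffding's lemma, which gives the worst-case variance proxy (b - a)^2 / 4 for an outcome bounded in [a, b].

```python
import random


def hoeffding_proxy(low: float, high: float) -> float:
    """Worst-case sub-Gaussian variance proxy for an outcome in [low, high]."""
    return (high - low) ** 2 / 4.0


def sample_sparse_outcomes(n: int, d: int, p: float, rng: random.Random):
    """Draw n outcome vectors of dimension d with independent Bernoulli(p) entries.

    Independence is an assumption of this sketch, not of the paper.
    """
    return [[1.0 if rng.random() < p else 0.0 for _ in range(d)] for _ in range(n)]


def empirical_covariance(samples):
    """Unbiased empirical covariance matrix of equal-length outcome vectors."""
    n, d = len(samples), len(samples[0])
    means = [sum(x[i] for x in samples) / n for i in range(d)]
    return [
        [
            sum((x[i] - means[i]) * (x[j] - means[j]) for x in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


rng = random.Random(0)   # fixed seed so the run is reproducible
p = 0.05                 # hypothetical activation probability (sparse outcomes)
samples = sample_sparse_outcomes(n=5000, d=3, p=p, rng=rng)

cov = empirical_covariance(samples)
proxy = hoeffding_proxy(0.0, 1.0)

print("true variance p(1-p):", p * (1 - p))
print("estimated variances :", [round(cov[i][i], 4) for i in range(3)])
print("Hoeffding proxy      :", proxy)
```

With a small `p`, the estimated variances sit close to p(1 - p) and well below the proxy of 0.25, which is the kind of gap a covariance-based analysis can exploit when the proxy would otherwise be used as a conservative stand-in.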

Keywords

combinatorial stochastic semi-bandits; covariance; confidence ellipsoid; sparsity

Domains

Mathematics [math] / Statistics [math.ST]; Computer Science [cs] / Machine Learning [cs.LG]

Main file

colt.pdf (660.04 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02876102

Submitted on: Saturday, 20 June 2020, 15:39:30

Last modified on: Friday, 5 April 2024, 09:33:21

Dates and versions

hal-02876102, version 1 (20-06-2020)

Identifiers

HAL Id: hal-02876102, version 1

Cite

Pierre Perrault, Vianney Perchet, Michal Valko. Covariance-adapting algorithm for semi-bandits with application to sparse outcomes. Conference on Learning Theory, 2020, Graz, Austria. ⟨hal-02876102⟩

Collections

CNRS INRIA ENS-CACHAN CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE CRISTAL-SCOOL ANR ENS-PARIS-SACLAY

95 Views

183 Downloads
