Improved Exploration in Factored Average-Reward MDPs - HAL open archive
Conference paper, Year: 2021

Improved Exploration in Factored Average-Reward MDPs

Abstract

We consider a regret minimization task under the average-reward criterion in an unknown Factored Markov Decision Process (FMDP). More specifically, we consider an FMDP where the state-action space $X$ and the state space $S$ admit the respective factored forms $X = \otimes_{i=1}^{n} X_i$ and $S = \otimes_{i=1}^{m} S_i$, and the transition and reward functions are factored over $X$ and $S$. Assuming a known factorization structure, we introduce a novel regret minimization strategy inspired by the popular UCRL2 strategy, called DBN-UCRL, which relies on Bernstein-type confidence sets defined for individual elements of the transition function. We show that for a generic factorization structure, DBN-UCRL achieves a regret bound whose leading term strictly improves over existing regret bounds in its dependence on the sizes of the $S_i$ and on the diameter-related terms involved. We further show that when the factorization structure corresponds to the Cartesian product of some base MDPs, the regret of DBN-UCRL is upper bounded by the sum of the regrets of the base MDPs. We demonstrate, through numerical experiments on standard environments, that DBN-UCRL achieves substantially lower empirical regret than existing algorithms that have frequentist regret guarantees.
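To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a hypothetical toy FMDP with two binary state factors and one binary action factor; the scopes, the per-factor probabilities, and the function names are made up for illustration. The confidence width is a generic empirical-Bernstein-style bound for a single transition probability, used here only as a stand-in for the Bernstein-type sets the abstract mentions; the exact sets used by DBN-UCRL may differ.

```python
import math
from itertools import product

# Hypothetical toy FMDP (an assumption for illustration only):
# x = (s1, s2, a) is a state-action pair with binary components.
# scopes[i] lists the indices of x that state factor i depends on.
scopes = [(0, 2), (1, 2)]

def factor_prob(i, next_val, x_scope):
    """Toy per-factor transition P_i(next_val | x restricted to scope i).

    The numbers are made up; in this toy both factors share the same rule.
    """
    s, a = x_scope
    p_one = 0.9 if (s ^ a) else 0.2  # probability that the factor becomes 1
    return p_one if next_val == 1 else 1.0 - p_one

def transition_prob(next_state, x):
    """Factored transition: product over factors of P_i(s'_i | x[scope_i])."""
    prob = 1.0
    for i, scope in enumerate(scopes):
        prob *= factor_prob(i, next_state[i], tuple(x[j] for j in scope))
    return prob

def bernstein_width(p_hat, n, delta):
    """Empirical-Bernstein-style half-width for one transition probability.

    p_hat is the empirical frequency from n samples, delta the failure
    probability. Generic bound for [0, 1]-valued samples, not necessarily
    the form used in the paper.
    """
    n = max(n, 1)
    log_term = math.log(3.0 / delta)
    variance = p_hat * (1.0 - p_hat)  # empirical variance of a Bernoulli sample
    return math.sqrt(2.0 * variance * log_term / n) + 3.0 * log_term / n

# Sanity check: the factored transition sums to one for every state-action pair.
for x in product([0, 1], repeat=3):
    total = sum(transition_prob(ns, x) for ns in product([0, 1], repeat=2))
    assert abs(total - 1.0) < 1e-12
```

The variance term is what distinguishes a Bernstein-type width from a Hoeffding-type one: for an element whose empirical probability is close to 0 or 1, the width shrinks at a rate closer to 1/n than 1/sqrt(n).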
Main file
talebi21a.pdf (610.83 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03780564, version 1 (19-09-2022)

Identifiers

  • HAL Id: hal-03780564, version 1

Cite

Sadegh Talebi, Anders Jonsson, Odalric-Ambrym Maillard. Improved Exploration in Factored Average-Reward MDPs. 24th International Conference on Artificial Intelligence and Statistics, 2021, San Diego (virtual), United States. ⟨hal-03780564⟩