SENTINEL: Taming Uncertainty with Ensemble-based Distributional Reinforcement Learning

Hannes Eriksson; Debabrota Basu; Mina Alibeigi; Christos Dimitrakakis

Conference paper, Year: 2022



Hannes Eriksson

Role: Author

Chalmers University of Technology [Gothenburg, Sweden]

Zenseact AB

Debabrota Basu

Role: Author
PersonId: 742129
IdHAL: debabrota-basu

Scool

Mina Alibeigi

Role: Author

Zenseact AB

Christos Dimitrakakis

Role: Author
PersonId: 6538
IdHAL: christos-dimitrakakis
ORCID: 0000-0002-5367-5189

University of Oslo

Abstract

In this paper, we consider risk-sensitive sequential decision-making in Reinforcement Learning (RL). Our contributions are two-fold. First, we introduce a novel and coherent quantification of risk, namely composite risk, which quantifies the joint effect of aleatory and epistemic risk during the learning process. Existing works consider either aleatory or epistemic risk individually, or an additive combination of the two. We prove that the additive formulation is a particular case of the composite risk when the epistemic risk measure is replaced with expectation. Thus, the composite risk is more sensitive to both aleatory and epistemic uncertainty than the individual and additive formulations. Second, we propose an algorithm, SENTINEL-K, based on ensemble bootstrapping and distributional RL for representing epistemic and aleatory uncertainty, respectively. The ensemble of K learners uses Follow The Regularised Leader (FTRL) to aggregate the return distributions and obtain the composite risk. We experimentally verify that SENTINEL-K estimates the return distribution better and, when used with composite risk estimates, demonstrates higher risk-sensitive performance than state-of-the-art risk-sensitive and distributional RL algorithms.
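To make the idea of a composite risk concrete, the sketch below nests two risk measures over an ensemble of return samples: one applied within each ensemble member (aleatory) and one applied across members (epistemic). This is only an illustration under stated assumptions, not the paper's algorithm: the abstract does not say which risk measures are used, so lower-tail CVaR is assumed for both levels, the Gaussian returns are synthetic, the names `cvar`, `composite_risk`, and `make_ensemble` are hypothetical, and the FTRL aggregation step is not modelled. Setting the epistemic level to 1.0 turns the outer measure into a plain expectation, which corresponds to the special case mentioned in the abstract.

```python
import math
import random
from statistics import fmean


def cvar(samples, alpha):
    """Lower-tail CVaR: mean of the worst ceil(alpha * n) samples (assumed risk measure)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return fmean(ordered[:k])


def composite_risk(ensemble_returns, aleatory_alpha, epistemic_alpha):
    """Apply CVaR inside each member (aleatory), then CVaR across members (epistemic)."""
    per_member = [cvar(returns, aleatory_alpha) for returns in ensemble_returns]
    return cvar(per_member, epistemic_alpha)


def make_ensemble(k=5, n=1000, seed=0):
    """Synthetic stand-in for K learned return distributions (hypothetical data)."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(k):
        mu = rng.gauss(1.0, 0.5)  # members disagree on the mean: epistemic spread
        ensemble.append([rng.gauss(mu, 1.0) for _ in range(n)])  # aleatory noise
    return ensemble


ensemble = make_ensemble()
per_member = [cvar(returns, 0.25) for returns in ensemble]

# Epistemic level 1.0: the outer measure is an expectation over members.
risk_neutral_epistemic = composite_risk(ensemble, 0.25, 1.0)

# Epistemic level 0.4: only the most pessimistic members contribute.
risk_averse_epistemic = composite_risk(ensemble, 0.25, 0.4)

print(risk_neutral_epistemic, risk_averse_epistemic)
```

In this sketch the risk-averse epistemic value can never exceed the risk-neutral one, because averaging only the worst members cannot give more than averaging all of them. This mirrors, in a toy setting, the abstract's point that the composite formulation reacts to epistemic uncertainty where the expectation-based special case does not.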

Domains

Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]; Computation [stat.CO]; Systems and Control [cs.SY]; Applications [stat.AP]

Main file

sentinel.pdf (269.33 KB)

Origin: Files produced by the author(s)

Contributor: Debabrota Basu

https://hal.science/hal-03150823

Submitted on: Wednesday, 24 February 2021, 10:36:28

Last modified on: Wednesday, 24 January 2024, 09:54:24

Long-term archiving on: Tuesday, 25 May 2021, 18:24:18

Dates and versions

hal-03150823, version 1 (24-02-2021)

hal-03150823, version 2 (06-09-2022)

Identifiers

HAL Id: hal-03150823, version 1

Cite

Hannes Eriksson, Debabrota Basu, Mina Alibeigi, Christos Dimitrakakis. SENTINEL: Taming Uncertainty with Ensemble-based Distributional Reinforcement Learning. Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, Aug 2022, Eindhoven, Netherlands. pp.631-640. ⟨hal-03150823v1⟩


