Identification of Blackwell Policies for Deterministic MDPs

Victor Boone; Bruno Gaujal

Conference paper, Year: 2022

Victor Boone

Role: Author
PersonId: 1128963

Laboratoire d'Informatique de Grenoble

Bruno Gaujal

Role: Author
PersonId: 11644
IdHAL: bruno-gaujal
ORCID: 0000-0001-9081-8401
IdRef: 074658441

Laboratoire d'Informatique de Grenoble

Abstract

We consider the problem of identifying Blackwell optimal policies for deterministic finite Markov Decision Processes (d-MDPs). Specifically, we are interested in algorithms that learn reward distributions by querying samples over time, stop almost surely, and return a Blackwell optimal policy with high probability. We characterize the class of MDPs over which such algorithms exist and provide an algorithm that identifies Blackwell optimal policies with arbitrarily high probability.

Keywords

Reinforcement Learning; Markov Decision Processes; Blackwell Optimality

Domains

Operations Research [math.OC]; Combinatorics [math.CO]

Main file

roadef.pdf (260.8 KB)

Origin: Files produced by the author(s)

Contributor: CCSD Sciencesconf.org

https://hal.science/hal-03595301

Submitted on: Thursday, March 3, 2022, 10:49:18

Last modified on: Friday, April 5, 2024, 03:14:06

Long-term archiving on: Saturday, June 4, 2022, 18:37:42

Dates and versions

hal-03595301, version 1 (03-03-2022)

Identifiers

HAL Id: hal-03595301, version 1

Cite

Victor Boone, Bruno Gaujal. Identification of Blackwell Policies for Deterministic MDPs. 23ème congrès annuel de la Société Française de Recherche Opérationnelle et d'Aide à la Décision, INSA Lyon, Feb 2022, Villeurbanne - Lyon, France. ⟨hal-03595301⟩

Collections

UGA CNRS LIG TDS-MACS LIG_SIDCH ROADEF2022

42 views

98 downloads
