Limits of Multi-Discounted Markov Decision Processes

Hugo Gimbert; Wieslaw Zielonka

doi:10.1109/LICS.2007.28

Conference paper, Year: 2007


Hugo Gimbert

Role: Author
PersonId: 6953
IdHAL: hugo-gimbert
ORCID: 0000-0003-1227-9718
IdRef: 113151918

Laboratoire Bordelais de Recherche en Informatique

Wieslaw Zielonka

Role: Author

Laboratoire d'informatique Algorithmique : Fondements et Applications

Abstract

Markov decision processes (MDPs) are controllable discrete event systems with stochastic transitions. The payoff received by the controller can be evaluated in different ways, depending on the payoff function the MDP is equipped with. For example, a \emph{mean--payoff} function evaluates average performance, whereas a \emph{discounted} payoff function gives more weight to earlier performance by means of a discount factor. Another well--known example is the \emph{parity} payoff function, which is used to encode logical specifications~\cite{dagstuhl}. Surprisingly, parity and mean--payoff MDPs share two non--trivial properties: both have pure stationary optimal strategies~\cite{CourYan:1990,neyman} and both are approximable by discounted MDPs with multiple discount factors (multi--discounted MDPs)~\cite{dealf:2003,neyman}. In this paper we unify and generalize these results. We introduce a new class of payoff functions called the priority weighted payoff functions, which generalize both parity and mean--payoff functions. We prove that priority weighted MDPs admit optimal strategies that are pure and stationary, and that the priority weighted value of an MDP is the limit of the multi--discounted value when the discount factors tend to $0$ simultaneously at various speeds.
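
As an informal illustration of the approximation mentioned in the abstract, the sketch below runs value iteration on a toy MDP with one discount factor per state. It is not taken from the paper: the normalization V(s) = max over actions of lam[s]*r + (1 - lam[s])*E[V(s')] is an assumed convention (chosen so that factors tending to 0 approach the mean payoff), and the function name multi_discounted_values and the two-state TOY_MDP are hypothetical.

```python
def multi_discounted_values(transitions, lam, tol=1e-12, max_iter=1_000_000):
    """Value iteration for the assumed Bellman equation
    V(s) = max_a [ lam[s] * r + (1 - lam[s]) * sum_p p * V(s') ].

    transitions[s] is a list of actions; each action is a pair
    (reward, [(probability, next_state), ...]).
    lam[s] is the discount factor attached to state s (assumption).
    """
    values = [0.0] * len(transitions)
    for _ in range(max_iter):
        new_values = [
            max(
                lam[s] * reward
                + (1.0 - lam[s]) * sum(p * values[t] for p, t in successors)
                for reward, successors in actions
            )
            for s, actions in enumerate(transitions)
        ]
        # Stop once successive iterates agree up to the tolerance.
        if max(abs(a - b) for a, b in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


# Hypothetical toy MDP (not from the paper).
# State 0: "stay" pays 0.4 forever, or "move" pays 0 and goes to state 1.
# State 1: single action paying 1 and returning to state 0.
TOY_MDP = [
    [(0.4, [(1.0, 0)]), (0.0, [(1.0, 1)])],
    [(1.0, [(1.0, 0)])],
]

if __name__ == "__main__":
    for factor in (0.5, 0.1, 0.01, 0.001):
        value = multi_discounted_values(TOY_MDP, [factor, factor])[0]
        print(f"lam = {factor}: value from state 0 = {value:.5f}")
```

In this toy example, cycling between the two states yields the reward sequence 0, 1, 0, 1, ... with mean payoff 1/2, while staying in state 0 yields 0.4. Under the assumed normalization the cycling strategy is worth (1 - lam)/(2 - lam) from state 0, so for small factors the computed value approaches the optimal mean payoff 1/2, whereas for lam = 0.5 staying is optimal. Passing different entries in lam gives a multi-discounted variant in the same assumed sense.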

Domains

Computer Science and Game Theory [cs.GT]

Main file

Limits_of_Multidiscounted_MDPs_Gimbert_Zielonka.pdf (312.08 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-00140148

Submitted on: Thursday, April 5, 2007, 10:36:25

Last modified on: Friday, March 24, 2023, 14:52:48

Long-term archiving on: Wednesday, April 7, 2010, 03:10:30

Dates and versions

hal-00140148 , version 1 (05-04-2007)

Identifiers

HAL Id: hal-00140148, version 1
DOI: 10.1109/LICS.2007.28

Cite

Hugo Gimbert, Wieslaw Zielonka. Limits of Multi-Discounted Markov Decision Processes. LICS 07, Jul 2007, Wroclaw, Poland. pp.89-98, ⟨10.1109/LICS.2007.28⟩. ⟨hal-00140148⟩

Collections

UNIV-PARIS7 CNRS LIAFA

83 views

130 downloads
