The expected total cost criterion for Markov decision processes under constraints: a convex analytic approach

François Dufour; Masayuki Horiguchi; Alexei Piunovskiy

Journal article, Advances in Applied Probability, Year: 2012



François Dufour

Role: Author
PersonId: 12044
IdHAL: francois-dufour
ORCID: 0000-0001-6653-2024
IdRef: 127261680

Institut de Mathématiques de Bordeaux

Quality control and dynamic reliability

Masayuki Horiguchi

Role: Author
PersonId: 933467

Department of Mathematics

Alexei Piunovskiy

Role: Author
PersonId: 838778

Department of Mathematical Sciences [Liverpool]

Abstract

This paper deals with discrete-time Markov decision processes (MDPs) under constraints, where all the objectives take the same form: an expected total cost over the infinite time horizon. The existence of an optimal control policy is studied using the convex analytic approach. We assume that the state and action spaces are general Borel spaces, that the model is non-negative and semi-continuous, and that the associated linear program has an admissible solution with finite cost. In contrast with classical results in the literature, our hypotheses do not require the MDP to be transient or absorbing. Our first result ensures the existence of an optimal solution to the linear program given by an occupation measure of the process generated by a randomized stationary policy. Moreover, this randomized stationary policy is shown to provide an optimal solution to the Markov control problem. Consequently, the set of randomized stationary policies is a sufficient set for this optimal control problem. Finally, our last main result states that all optimal solutions of the linear program coincide on a special set with an optimal occupation measure generated by a randomized stationary policy. Several examples illustrate some theoretical issues and possible applications of the results developed in the paper.
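
For orientation, the linear program referred to in the abstract can be sketched in the form that is standard for the convex analytic approach. The notation below is an assumption made for illustration and is not taken from the abstract: $X$ and $A$ are the state and action spaces, $Q$ the transition kernel, $\nu$ the initial distribution, $c_0$ the cost to minimize, and $c_1,\dots,c_q$ the constraint costs with bounds $d_1,\dots,d_q$. The paper's own formulation may differ in details, such as the handling of admissible action sets.

```latex
% Assumed notation (illustrative only): X, A, Q, \nu, c_0,...,c_q >= 0, d_1,...,d_q.
% Decision variable: a measure \mu on X x A.
\begin{align*}
\text{minimize}\quad & \int_{X\times A} c_0(x,a)\,\mu(dx,da) \\
\text{subject to}\quad & \mu(\Gamma\times A) = \nu(\Gamma)
    + \int_{X\times A} Q(\Gamma\mid x,a)\,\mu(dx,da)
    \quad \text{for all } \Gamma\in\mathcal{B}(X), \\
& \int_{X\times A} c_i(x,a)\,\mu(dx,da) \le d_i, \quad i=1,\dots,q.
\end{align*}
% Occupation measure of a policy \pi started from \nu:
% expected number of visits of the state-action process to a set \Gamma.
\[
\mu^{\pi}(\Gamma) = \sum_{t=0}^{\infty}
    \mathbb{P}^{\pi}_{\nu}\bigl((x_t,a_t)\in\Gamma\bigr),
\qquad \Gamma\in\mathcal{B}(X\times A).
\]
```

In this reading, the expected total cost of a policy $\pi$ for the cost $c_i$ equals $\int c_i\,d\mu^{\pi}$, which is why optimizing over occupation measures amounts to optimizing over policies.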

Domains

Optimization and Control [math.OC]

Contributor: François Dufour

https://hal.science/hal-00759717

Submitted on: Sunday, December 2, 2012, 14:40:47

Last modified on: Thursday, April 4, 2024, 03:07:53

Dates and versions

hal-00759717, version 1 (02-12-2012)

Identifiers

HAL Id: hal-00759717, version 1

Cite

François Dufour, Masayuki Horiguchi, Alexei Piunovskiy. The expected total cost criterion for Markov decision processes under constraints: a convex analytic approach. Advances in Applied Probability, 2012, 44 (3), pp. 774-793. ⟨hal-00759717⟩


Collections

CNRS INRIA IMB INRIA2 TDS-MACS
