Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

Nicolas Gast; Bruno Gaujal; Jean-Yves Le Boudec

Report, Year: 2010


Nicolas Gast

Role: Author
PersonId: 1247
IdHAL: nicolas-gast
ORCID: 0000-0001-6884-8698
IdRef: 233247874

Middleware efficiently scalable

Ecole Polytechnique Fédérale de Lausanne

Bruno Gaujal

Role: Author
PersonId: 11644
IdHAL: bruno-gaujal
ORCID: 0000-0001-9081-8401
IdRef: 074658441

Middleware efficiently scalable

Jean-Yves Le Boudec

Role: Author
PersonId: 868244

Ecole Polytechnique Fédérale de Lausanne

Abstract

We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal reward of such a Markov Decision Process, which satisfies a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equation. Three examples, pertaining respectively to investment strategies, population dynamics control, and scheduling in queues, are developed. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.
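
The mean field idea behind the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the report: the two-state objects, the function names `simulate_fraction` and `mean_field_fraction`, the rates, and the use of a single constant action (here `rate_up`) instead of an optimized policy. It only shows that the fraction of objects in a given state, for a large population, stays close to an Euler discretization of the ODE dm/dt = rate_up (1 - m) - rate_down m. It does not solve a Bellman or HJB equation.

```python
import random


def simulate_fraction(n_objects, rate_up, rate_down, dt, n_steps, seed=0):
    """Simulate n_objects two-state objects and return the final fraction in state 1.

    Hypothetical toy model: during a step of length dt, an object in state 0
    moves to state 1 with probability rate_up * dt, and an object in state 1
    moves back to state 0 with probability rate_down * dt.
    """
    rng = random.Random(seed)
    states = [0] * n_objects  # all objects start in state 0
    for _ in range(n_steps):
        for i in range(n_objects):
            if states[i] == 0:
                if rng.random() < rate_up * dt:
                    states[i] = 1
            elif rng.random() < rate_down * dt:
                states[i] = 0
    return sum(states) / n_objects


def mean_field_fraction(rate_up, rate_down, dt, n_steps):
    """Euler scheme for dm/dt = rate_up * (1 - m) - rate_down * m, with m(0) = 0."""
    m = 0.0
    for _ in range(n_steps):
        m += dt * (rate_up * (1.0 - m) - rate_down * m)
    return m


if __name__ == "__main__":
    limit = mean_field_fraction(1.0, 0.5, 0.05, 100)
    for n in (10, 100, 5000):
        empirical = simulate_fraction(n, 1.0, 0.5, 0.05, 100)
        print(f"N={n:5d}  empirical={empirical:.4f}  mean field={limit:.4f}")
```

In this toy model the objects do not interact, so the deterministic recursion is exactly the expected fraction and the gap is pure sampling noise that shrinks as the population grows. The setting of the report is richer, since the transitions depend on the population state and on a controller's actions.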

Keywords

Mean Field; Hamilton-Jacobi-Bellman; Optimal Control; Markov Decision Processes

Domains

Performance and reliability [cs.PF]; Optimization and control [math.OC]; Probability [math.PR]

Main file

RR_7239_MeanFieldMDP.pdf (383.24 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-00473005

Submitted on: Tuesday, May 17, 2011, 15:15:05

Last modified on: Thursday, April 4, 2024, 21:14:34

Long-term archiving on: Friday, November 9, 2012, 11:35:44

Dates and versions

hal-00473005, version 1 (14-04-2010)

hal-00473005, version 2 (02-07-2010)

hal-00473005, version 3 (17-05-2011)

Identifiers

HAL Id: hal-00473005, version 3
arXiv: 1004.2342

Cite

Nicolas Gast, Bruno Gaujal, Jean-Yves Le Boudec. Mean field for Markov Decision Processes: from Discrete to Continuous Optimization. 2010. ⟨hal-00473005v3⟩


Collections

UGA CNRS INRIA LIG INRIA2 TDS-MACS LARA LIG_SIDCH
