Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

Nicolas Gast 1, 2, Bruno Gaujal 1, Jean-Yves Le Boudec 2
1 MESCAL - Middleware efficiently scalable
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract: We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODEs). We show that the optimal reward of such a Markov Decision Process, which satisfies a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference between the rewards, and a constructive algorithm for deriving an approximate solution to the Markov Decision Process from a solution of the HJB equation. We illustrate the method on three examples, pertaining respectively to investment strategies, population dynamics control, and scheduling in queues. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.

https://hal.archives-ouvertes.fr/hal-00473005
Contributor: Nicolas Gast
Submitted on: Tuesday, 17 May 2011 - 15:15:05
Last modified on: Thursday, 11 October 2018 - 08:48:02
Document(s) archived on: Friday, 9 November 2012 - 11:35:44

File

RR_7239_MeanFieldMDP.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00473005, version 3
  • arXiv: 1004.2342

Citation

Nicolas Gast, Bruno Gaujal, Jean-Yves Le Boudec. Mean field for Markov Decision Processes: from Discrete to Continuous Optimization. 2010. 〈hal-00473005v3〉
