Quantitative propagation of chaos for mean field Markov decision process with common noise
Abstract
We investigate propagation of chaos for mean field Markov decision processes with common noise (CMKV-MDP), when the optimization is performed over randomized open-loop controls on an infinite horizon. We first state a rate of convergence of order M_N^\gamma, where M_N is the mean rate of convergence in Wasserstein distance of the empirical measure and \gamma \in (0,1] is an explicit constant, for the convergence of the value functions of the N-agent control problem with asymmetric open-loop controls towards the value function of the CMKV-MDP. Furthermore, we show how to explicitly construct O(\epsilon + M_N^\gamma)-optimal policies for the N-agent model from \epsilon-optimal policies for the CMKV-MDP. Our approach relies on a sharp comparison between the Bellman operators of the N-agent problem and of the CMKV-MDP, and on a fine coupling of empirical measures.
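To give a feel for the quantity M_N, the following minimal sketch estimates by Monte Carlo the mean Wasserstein-1 distance between the empirical measure of N i.i.d. samples and their law. It is an illustration only: the choice of Uniform(0, 1) as reference law, the function names `w1_to_uniform` and `mean_rate`, and the value `gamma = 0.5` are assumptions made here, not taken from the paper, whose exact definition of M_N may differ (for instance in the state space or in how the reference measure is chosen).

```python
import numpy as np


def w1_to_uniform(samples):
    """Exact W_1 distance between the empirical measure of `samples`
    and Uniform(0, 1), via the one-dimensional quantile formula
    W_1 = integral over u in (0, 1) of |F_N^{-1}(u) - u|."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    a = np.arange(n) / n           # left endpoints (i - 1) / n
    b = np.arange(1, n + 1) / n    # right endpoints i / n

    # Antiderivative of |t| is t * |t| / 2, so the integral of |x_i - u|
    # over [a_i, b_i] equals G(b_i - x_i) - G(a_i - x_i).
    def G(t):
        return t * np.abs(t) / 2.0

    return float(np.sum(G(b - x) - G(a - x)))


def mean_rate(n, trials=200, seed=0):
    """Monte Carlo estimate of E[W_1(empirical measure of n samples, Uniform(0, 1))]."""
    rng = np.random.default_rng(seed)
    return float(np.mean([w1_to_uniform(rng.random(n)) for _ in range(trials)]))


if __name__ == "__main__":
    gamma = 0.5  # hypothetical exponent in (0, 1], for illustration only
    for n in (10, 100, 1000):
        m_n = mean_rate(n)
        print(n, m_n, m_n ** gamma)
```

In this one-dimensional setting the estimate shrinks roughly like N^{-1/2}, so a bound of order M_N^\gamma decays more slowly than M_N itself whenever \gamma < 1.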
Origin: Files produced by the author(s)