Preprint, Working Paper. Year: 2024

Multi-agent reinforcement learning for partially observable cooperative systems with acyclic dependence structure

Abstract

Single-agent reinforcement learning algorithms can be applied directly to multi-agent systems in an independent learning approach, but they then lose their convergence guarantees because of non-stationarity. We prove that in transition-independent Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs), non-stationarity can be mitigated by a multi-scale approach when the interdependence of the agents' dynamics can be represented by a directed acyclic graph (DAG). We propose a multi-scale Q-learning algorithm (MQL) in which agents update local Q-learning iterates at different timescales without communication and still converge. To this end, we first show that the loss of information on the global state can be modeled as state-dependent Markovian noise. Then, we show that results from stochastic approximation theory can be used to prove the convergence of MQL under partial state observability. Next, we give practical solutions that exploit knowledge about agent interactions to assign learning rates that ensure convergence, and propose a NetworkMQL algorithm that can achieve convergence in Network-Distributed POMDPs (ND-POMDPs). Finally, we validate both MQL and NetworkMQL on a wind farm control problem from the energy industry.
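To make the idea of per-agent timescales concrete, the following is a minimal illustrative sketch and not the authors' MQL or NetworkMQL implementation. It assumes tabular Q-values indexed by local observations, polynomial step sizes of the form 1/(n+1)^p, and exponents assigned from each agent's depth in the dependence DAG. The function names (dag_depths, step_size, q_update), the exponent range [0.6, 1.0], and the choice that deeper agents learn on the faster timescale are all hypothetical; the paper's actual learning-rate assignment may differ.

```python
from collections import deque


def dag_depths(num_agents, edges):
    """Longest-path depth of each agent in a DAG, via Kahn's topological sort.

    `edges` is a list of (parent, child) pairs meaning that the child's
    dynamics depend on the parent (an assumed encoding).
    """
    children = {i: [] for i in range(num_agents)}
    indegree = [0] * num_agents
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    depth = [0] * num_agents
    queue = deque(i for i in range(num_agents) if indegree[i] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            depth[child] = max(depth[child], depth[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if visited != num_agents:
        raise ValueError("dependence graph contains a cycle")
    return depth


def step_size(n, depth, max_depth):
    """Step size 1/(n+1)^p with p in [0.6, 1.0] (hypothetical schedule).

    Every exponent lies in (0.5, 1], so each schedule is non-summable but
    square-summable. Agents at different depths get different exponents,
    so the ratio of their step sizes tends to zero as n grows, which is
    the usual separation-of-timescales condition.
    """
    if max_depth == 0:
        exponent = 1.0
    else:
        exponent = 1.0 - 0.4 * depth / max_depth
    return 1.0 / (n + 1) ** exponent


def q_update(q, obs, action, reward, next_obs, alpha, gamma=0.9):
    """One local tabular Q-learning update using only the agent's own data."""
    best_next = max(q[next_obs])
    q[obs][action] += alpha * (reward + gamma * best_next - q[obs][action])
```

In this sketch, each agent would call step_size with its own depth and its own update counter n, then pass the result as alpha to q_update, so no communication between agents is needed at learning time. Agents sharing the same depth end up on the same timescale here, which is a simplification of this illustration.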
Main file
hal_version-core.pdf (662.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04560319, version 1 (26-04-2024)

Identifiers

  • HAL Id: hal-04560319, version 1

Cite

Claire Bizon Monroc, Ana Bušić, Donatien Dubuc, Jiamin Zhu. Multi-agent reinforcement learning for partially observable cooperative systems with acyclic dependence structure. 2024. ⟨hal-04560319⟩
