Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays

Abstract: During sleep and awake rest, the hippocampus replays sequences of place cells that were activated during prior experiences. These replays have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna family of reinforcement learning algorithms uses offline replays to improve learning. Under a limited replay budget, a prioritized sweeping approach, which requires a model of the transitions to the predecessors of each state, can be used to improve performance. We investigate whether such algorithms can explain the experimentally observed replays. We propose a neural network version of prioritized sweeping Q-learning, for which we developed a growing multiple-expert algorithm able to cope with multiple predecessors. The resulting architecture improves the learning of simulated agents confronted with a navigation task. We predict that, in animals, learning of the world model should occur during rest periods, and that the corresponding replays should be shuffled.
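To make the replay mechanism the abstract refers to concrete, here is a minimal tabular sketch of prioritized sweeping Q-learning (in the Moore & Atkeson / Sutton & Barto style) on a toy deterministic chain task. It is an illustration of the prioritization-by-TD-error idea only, not the paper's neural, multiple-predecessor architecture; all names (`run`, `step`, the chain environment, the hyperparameters) are assumptions made here for the example.

```python
import heapq
import random
from collections import defaultdict

# Toy deterministic chain: states 0..N-1, goal at the right end.
N = 6                 # number of states; state N-1 is the rewarded goal
ACTIONS = (-1, +1)    # move left / move right (clipped at the edges)

def step(s, a):
    """One environment transition; reward 1 only on reaching the goal."""
    s2 = min(max(s + a, 0), N - 1)
    r = 1.0 if s2 == N - 1 else 0.0
    return r, s2

def run(episodes=30, alpha=0.5, gamma=0.95, theta=1e-5, budget=10, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)       # tabular action values Q[(s, a)]
    model = {}                   # learned world model: (s, a) -> (r, s')
    preds = defaultdict(set)     # predecessor model: s' -> {(s, a)}
    pq = []                      # priority queue (negated priority for max-heap)
    for _ in range(episodes):
        s = 0
        while s != N - 1:
            a = rng.choice(ACTIONS)          # random exploration policy
            r, s2 = step(s, a)
            model[(s, a)] = (r, s2)          # update world model
            preds[s2].add((s, a))            # record predecessor relation
            # priority = absolute TD error of the experienced transition
            p = abs(r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
            if p > theta:
                heapq.heappush(pq, (-p, (s, a)))
            # replay phase: under a limited budget, update the highest-priority
            # state-action pair, then propagate priorities to its predecessors
            for _ in range(budget):
                if not pq:
                    break
                _, (ps, pa) = heapq.heappop(pq)
                pr, ps2 = model[(ps, pa)]
                Q[(ps, pa)] += alpha * (
                    pr + gamma * max(Q[(ps2, b)] for b in ACTIONS) - Q[(ps, pa)])
                for (qs, qa) in preds[ps]:
                    qr, _ = model[(qs, qa)]
                    pp = abs(qr + gamma * max(Q[(ps, b)] for b in ACTIONS)
                             - Q[(qs, qa)])
                    if pp > theta:
                        heapq.heappush(pq, (-pp, (qs, qa)))
            s = s2
    return Q

Q = run()
# Greedy policy after learning: move right (+1) in every non-goal state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)]
```

The predecessor model (`preds`) is the key ingredient the abstract highlights: when a state's value changes during replay, the transitions leading *into* it are re-prioritized, which is what requires handling multiple predecessors in the general case.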
Document type: Preprint / working paper
Contributor: Benoît Girard
Submitted on: Wednesday, February 14, 2018 - 17:56:05
Last modified on: Friday, August 31, 2018 - 09:13:02
Document(s) archived on: Monday, May 7, 2018 - 13:10:46


Files produced by the author(s)


  • HAL Id : hal-01709275, version 1
  • ARXIV : 1802.05594



Lise Aubin, Mehdi Khamassi, Benoît Girard. Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays. 2018. 〈hal-01709275〉


