Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays

Abstract: During sleep and awake rest, the hippocampus replays sequences of place cells that were activated during prior experiences. These replays have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna family of reinforcement learning algorithms uses off-line replays to improve learning. Under a limited replay budget, a prioritized sweeping approach, which requires a model of the transitions to the predecessors, can be used to improve performance. We investigate whether such algorithms can explain the experimentally observed replays. We propose a neural network version of prioritized sweeping Q-learning, for which we developed a growing multiple-expert algorithm able to cope with multiple predecessors. The resulting architecture improves the learning of simulated agents confronted with a navigation task. We predict that, in animals, learning the world model should occur during rest periods, and that the corresponding replays should be shuffled.
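The prioritized sweeping scheme summarized in the abstract can be illustrated in its classic tabular form (a minimal sketch, not the paper's neural-network version: the chain task, hyperparameters, and helper names below are assumptions). On each real step the agent updates a world model and a predecessor model, pushes the state-action pair onto a priority queue keyed by TD error, and then spends a limited replay budget sweeping the highest-priority pairs and their predecessors:

```python
import heapq
import random
from collections import defaultdict

# Toy deterministic chain: states 0..N-1, actions 0 (left) / 1 (right),
# reward 1 on reaching the goal state N-1; episodes start at state 0.
N = 8
GOAL = N - 1
ACTIONS = (0, 1)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

def greedy(Q, s, rng):
    best = max(Q[(s, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(s, a)] == best])

def train(episodes=50, alpha=0.5, gamma=0.95, eps=0.2,
          theta=1e-4, budget=20, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)       # tabular Q-values, Q[(state, action)]
    model = {}                   # learned world model: (s, a) -> (s', r)
    preds = defaultdict(set)     # predecessor model: s' -> {(s, a)}
    pq = []                      # max-heap of TD-error priorities (negated)
    for _ in range(episodes):
        s = 0
        for _ in range(500):     # step cap per episode
            a = rng.choice(ACTIONS) if rng.random() < eps else greedy(Q, s, rng)
            s2, r = step(s, a)
            model[(s, a)] = (s2, r)
            preds[s2].add((s, a))
            p = abs(r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
            if p > theta:
                heapq.heappush(pq, (-p, (s, a)))
            # off-line replays: sweep the highest-priority pairs
            for _ in range(budget):
                if not pq:
                    break
                _, (ps, pa) = heapq.heappop(pq)
                ns, nr = model[(ps, pa)]
                target = nr + gamma * max(Q[(ns, b)] for b in ACTIONS)
                Q[(ps, pa)] += alpha * (target - Q[(ps, pa)])
                # re-prioritize every known predecessor of the swept state
                for (qs, qa) in preds[ps]:
                    q2, qr = model[(qs, qa)]
                    pp = abs(qr + gamma * max(Q[(q2, b)] for b in ACTIONS)
                             - Q[(qs, qa)])
                    if pp > theta:
                        heapq.heappush(pq, (-pp, (qs, qa)))
            s = s2
            if s == GOAL:
                break
    return Q

Q = train()
policy = [max(ACTIONS, key=lambda a, s=s: Q[(s, a)]) for s in range(GOAL)]
```

Because reward information is propagated backward through the predecessor model during replays rather than only along experienced trajectories, the greedy policy (move right everywhere) emerges after far fewer real steps than with plain Q-learning. The multiple-predecessor issue the paper addresses arises when `preds[s']` must be represented by a function approximator instead of this explicit set.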
Document type:
Preprint / working paper
2018

https://hal.archives-ouvertes.fr/hal-01709275
Contributor: Benoît Girard
Submitted on: Wednesday, February 14, 2018 - 17:56:05
Last modified: Friday, August 31, 2018 - 09:13:02
Archived on: Monday, May 7, 2018 - 13:10:46

Files

prioritized-sweeping-neural.pd...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01709275, version 1
  • ARXIV : 1802.05594

Citation

Lise Aubin, Mehdi Khamassi, Benoît Girard. Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays. 2018. 〈hal-01709275〉

Metrics

Record views: 95

File downloads: 37