Toward a data efficient neural actor-critic

Matthieu Zimmer 1, 2, Yann Boniface 2, Alain Dutech 1
1 MAIA - Autonomous intelligent machine
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
2 CORTEX - Neuromimetic intelligence
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract: A new off-policy, offline, model-free, actor-critic reinforcement learning algorithm for environments with continuous states and actions is presented. It addresses discrete-time problems where the goal is to maximize the discounted sum of rewards using stationary policies. Our algorithm allows a trade-off between data efficiency and scalability. The amount of a priori knowledge is kept low by: (1) using neural networks to learn both the critic and the actor, (2) not relying on initial trajectories provided by an expert, and (3) not depending on known goal states. Experimental results show better data efficiency than four state-of-the-art algorithms on two benchmark environments.
Document type:
Conference paper
European Workshop on Reinforcement Learning, Dec 2016, Barcelona, Spain. 2016, European Workshop on Reinforcement Learning. 〈https://ewrl.wordpress.com/〉

Cited literature: 30 references

https://hal.archives-ouvertes.fr/hal-01413885
Contributor: Matthieu Zimmer
Submitted on: Sunday, 11 December 2016 - 18:06:29
Last modified on: Wednesday, 14 December 2016 - 01:12:28
Document(s) archived on: Tuesday, 28 March 2017 - 00:38:19

File

ewrl13-2016-submission_7.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-01413885, version 1

Citation

Matthieu Zimmer, Yann Boniface, Alain Dutech. Toward a data efficient neural actor-critic. European Workshop on Reinforcement Learning, Dec 2016, Barcelona, Spain. 2016, European Workshop on Reinforcement Learning. 〈https://ewrl.wordpress.com/〉. 〈hal-01413885〉
