Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), 1998.
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.
DOI : 10.1038/nature14236
Cognitive Developmental Robotics: A Survey, IEEE Transactions on Autonomous Mental Development, vol.1, issue.1, pp.1-44, 2009.
DOI : 10.1109/TAMD.2009.2021702
Linear least-squares algorithms for temporal difference learning, 1996.
Neural Fitted Q Iteration – First Experiences with a Data Efficient Neural Reinforcement Learning Method, In Lecture Notes in Computer Science, vol.3720, pp.317-328, 2005.
DOI : 10.1007/11564096_32
Improving the Rprop learning algorithm, International Symposium on Neural Computation, pp.115-121, 2000.
RPROP - A Fast Adaptive Learning Algorithm, International Symposium on Computer and Information Science VII, 1992.
Policy Gradient Methods for Reinforcement Learning with Function Approximation, Advances in Neural Information Processing Systems 12, pp.1057-1063, 1999.
Completely Derandomized Self-Adaptation in Evolution Strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001.
DOI : 10.1162/106365601750190398
A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol.42, issue.6, pp.1291-1307, 2012.
DOI : 10.1109/TSMCC.2012.2218595
URL : https://hal.archives-ouvertes.fr/hal-00756747
Actor-Critic Algorithms, Neural Information Processing Systems, pp.1008-1014, 1999.
Reinforcement Learning in Continuous Action Spaces, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp.272-279, 2007.
DOI : 10.1109/ADPRL.2007.368199
Swing up control problem for the acrobot, IEEE Control Systems Magazine, vol.15, issue.1, pp.49-55, 1995.
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp.254-261, 2007.
DOI : 10.1109/ADPRL.2007.368196
Open dynamics engine, 2005.
Reinforcement Learning in Continuous State and Action Spaces, Reinforcement Learning, pp.207-251, 2012.
DOI : 10.1007/978-3-642-27645-3_7