L. Arnold, A. Auger, N. Hansen, and Y. Ollivier, Information-geometric optimization algorithms: A unifying picture via invariance principles, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00601503

J. Buchli, F. Stulp, E. Theodorou, and S. Schaal, Learning variable impedance control, The International Journal of Robotics Research, vol.30, issue.7, pp.820-833, 2011.
DOI : 10.1177/0278364911402527

L. Busoniu, D. Ernst, B. De Schutter, and R. Babuska, Cross-Entropy Optimization of Control Policies With Adaptive Basis Functions, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol.41, issue.1, pp.196-209, 2011.
DOI : 10.1109/TSMCB.2010.2050586

J. Fix and M. Geist, Monte-Carlo Swarm Policy Search, Symposium on Swarm Intelligence and Differential Evolution, 2012.
DOI : 10.1007/978-3-642-29353-5_9

URL : https://hal.archives-ouvertes.fr/hal-00695540

F. Gomez, J. Schmidhuber, and R. Miikkulainen, Accelerated neural evolution through cooperatively coevolved synapses, Journal of Machine Learning Research, vol.9, pp.937-965, 2008.

N. Hansen and A. Ostermeier, Completely Derandomized Self-Adaptation in Evolution Strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001.
DOI : 10.1162/106365601750190398

V. Heidrich-Meisner and C. Igel, Evolution Strategies for Direct Policy Search, Proceedings of the 10th International Conference on Parallel Problem Solving from Nature: PPSN X, pp.428-437, 2008.
DOI : 10.1007/978-3-540-87700-4_43

V. Heidrich-Meisner and C. Igel, Similarities and differences between policy gradient methods and evolution strategies, 16th European Symposium on Artificial Neural Networks Proceedings, pp.149-154, 2008.

A. J. Ijspeert, J. Nakanishi, and S. Schaal, Movement imitation with nonlinear dynamical systems in humanoid robots, Proceedings of the 2002 IEEE International Conference on Robotics and Automation (ICRA), 2002.
DOI : 10.1109/ROBOT.2002.1014739

S. Kalyanakrishnan and P. Stone, Characterizing reinforcement learning methods through parameterized learning problems, Machine Learning, pp.205-247, 2011.
DOI : 10.1007/s10994-011-5251-x

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.226.2972

H. Kappen, Path integrals and symmetry breaking for optimal control theory, Journal of Statistical Mechanics: Theory and Experiment, vol.2005, issue.11, p.P11011, 2005.
DOI : 10.1088/1742-5468/2005/11/P11011

URL : http://arxiv.org/abs/physics/0505066

J. Kober and J. Peters, Policy search for motor primitives in robotics, Machine Learning, vol.84, pp.171-203, 2011.

D. Marin and O. Sigaud, Towards fast and adaptive optimal control policies for robots: A direct policy search approach, Proceedings Robotica, pp.21-26, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00703755

D. Marin, J. Decock, L. Rigoux, and O. Sigaud, Learning cost-efficient control policies with XCSF, Proceedings of the 13th annual conference on Genetic and evolutionary computation, GECCO '11, pp.1235-1242, 2011.
DOI : 10.1145/2001576.2001743

URL : https://hal.archives-ouvertes.fr/hal-00703760

D. E. Moriarty, A. C. Schultz, and J. J. Grefenstette, Evolutionary algorithms for reinforcement learning, Journal of Artificial Intelligence Research (JAIR), vol.11, pp.241-276, 1999.

A. Y. Ng and M. I. Jordan, PEGASUS: A policy search method for large MDPs and POMDPs, Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, pp.406-415, 2000.

J. Peters and S. Schaal, Applying the episodic natural actor-critic architecture to motor primitive learning, Proceedings of the 15th European Symposium on Artificial Neural Networks, pp.1-6, 2007.

J. Peters and S. Schaal, Reinforcement learning of motor skills with policy gradients, Neural Networks, vol.21, issue.4, pp.682-697, 2008.

J. Peters and S. Schaal, Natural actor-critic, Neurocomputing, vol.71, issue.7-9, pp.1180-1190, 2008.
DOI : 10.1016/j.neucom.2007.11.026

M. Riedmiller, J. Peters, and S. Schaal, Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp.254-261, 2007.
DOI : 10.1109/ADPRL.2007.368196

R. Rubinstein and D. Kroese, The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, 2004.

T. Rückstiess, M. Felder, and J. Schmidhuber, State-dependent exploration for policy gradient methods, 19th European Conference on Machine Learning (ECML), 2008.

T. Rückstiess, F. Sehnke, T. Schaul, D. Wierstra, Y. Sun et al., Exploring parameter space in reinforcement learning, Paladyn, Journal of Behavioral Robotics, vol.1, pp.14-24, 2010.

J. Santamaría, R. Sutton, and A. Ram, Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces, Adaptive Behavior, vol.6, issue.2, pp.163-217, 1997.
DOI : 10.1177/105971239700600201

R. Stengel, Optimal Control and Estimation, 1994.

F. Stulp and O. Sigaud, Path integral policy improvement with covariance matrix adaptation, Proceedings of the 29th International Conference on Machine Learning (ICML), 2012.
URL : https://hal.archives-ouvertes.fr/hal-00789391

F. Stulp, E. Theodorou, and S. Schaal, Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation, IEEE Transactions on Robotics, vol.28, issue.6, 2012.
DOI : 10.1109/TRO.2012.2210294

URL : https://hal.archives-ouvertes.fr/hal-00766177

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.

M. Tamosiunaite, B. Nemec, A. Ude, and F. Wörgötter, Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives, Robotics and Autonomous Systems, vol.59, issue.11, pp.910-922, 2011.

E. Theodorou, J. Buchli, and S. Schaal, A generalized path integral control approach to reinforcement learning, Journal of Machine Learning Research, vol.11, pp.3137-3181, 2010.

S. Whiteson and P. Stone, Evolutionary function approximation for reinforcement learning, Journal of Machine Learning Research, vol.7, pp.877-917, 2006.

D. Wierstra, T. Schaul, J. Peters, and J. Schmidhuber, Natural Evolution Strategies, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence), 2008.
DOI : 10.1109/CEC.2008.4631255

URL : http://arxiv.org/abs/1106.4487

R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, vol.8, pp.229-256, 1992.