V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.
DOI : 10.1038/nature14236

J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, Trust region policy optimization, Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp.1889-1897, 2015.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, 2015.

D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre et al., Mastering the game of Go with deep neural networks and tree search, Nature, vol.529, issue.7587, pp.484-489, 2016.

A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang et al., Learning from simulated and unsupervised images through adversarial training, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol.3, 2017.
DOI : 10.1109/cvpr.2017.241
URL : http://arxiv.org/pdf/1612.07828

K. Bousmalis, A. Irpan, P. Wohlhart, Y. Bai, M. Kelcey et al., Using simulation and domain adaptation to improve efficiency of deep robotic grasping, Robotics and Automation (ICRA), 2018.

S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural computation, vol.9, issue.8, pp.1735-1780, 1997.

G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman et al., OpenAI Gym, 2016.

J. C. Zagal, J. Ruiz-del-Solar, and P. Vallejos, Back to reality: Crossing the reality gap in evolutionary robotics, IFAC Proceedings Volumes, vol.37, pp.834-839, 2004.

J. Bongard and H. Lipson, Once more unto the breach: Co-evolving a robot and its simulator, Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems (ALIFE9), pp.57-62, 2004.

S. Koos, J. Mouret, and S. Doncieux, Crossing the reality gap in evolutionary robotics by promoting transferable controllers, Proceedings of the 12th annual conference on Genetic and evolutionary computation, pp.119-126, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00633927

J. C. Zagal and J. Ruiz-del-Solar, Combining simulation and reality in evolutionary robotics, Journal of Intelligent and Robotic Systems, vol.50, issue.1, pp.19-39, 2007.

J. C. Higuera, D. Meger, and G. Dudek, Adapting learned robotics behaviours through policy adjustment, Robotics and Automation (ICRA), 2017 IEEE International Conference on, pp.5837-5843, 2017.

N. Jakobi, P. Husbands, and I. Harvey, Noise and the reality gap: The use of simulation in evolutionary robotics, European Conference on Artificial Life, pp.704-720, 1995.

A. Punjani and P. Abbeel, Deep learning helicopter dynamics models, Robotics and Automation (ICRA), 2015 IEEE International Conference on, pp.3223-3230, 2015.

J. Fu, S. Levine, and P. Abbeel, One-shot learning of manipulation skills with online dynamics adaptation and neural network priors, Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pp.4019-4026, 2016.

I. Mordatch, N. Mishra, C. Eppner, and P. Abbeel, Combining model-based policy search with online model learning for control of physical humanoids, Robotics and Automation (ICRA), 2016 IEEE International Conference on, pp.242-248, 2016.

A. Nagabandi, G. Kahn, R. S. Fearing, and S. Levine, Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning, 2017.

M. Deisenroth and C. E. Rasmussen, PILCO: A model-based and data-efficient approach to policy search, Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp.465-472, 2011.

C. Finn, S. Levine, and P. Abbeel, Guided cost learning: Deep inverse optimal control via policy optimization, International Conference on Machine Learning, pp.49-58, 2016.

J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, How transferable are features in deep neural networks?, Advances in neural information processing systems, pp.3320-3328, 2014.

Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle et al., Domain-adversarial training of neural networks, Journal of Machine Learning Research, vol.17, issue.59, pp.1-35, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01624607

Y. Taigman, A. Polyak, and L. Wolf, Unsupervised cross-domain image generation, 2016.

E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, and T. Darrell, Deep domain confusion: Maximizing for domain invariance, 2014.

S. James and E. Johns, 3D simulation for robot arm control with deep Q-learning, 2016.

A. A. Rusu, M. Vecerik, T. Rothörl, N. Heess, R. Pascanu et al., Sim-to-real robot learning from pixels with progressive nets, 2016.

J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba et al., Domain randomization for transferring deep neural networks from simulation to the real world, 2017.

X. B. Peng, M. Andrychowicz, W. Zaremba, and P. Abbeel, Sim-to-real transfer of robotic control with dynamics randomization, 2017.

J. Hanna and P. Stone, Grounded action transformation for robot learning in simulation, Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI), 2017.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, Proceedings of the International Conference on Learning Representations, 2015.

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, Proximal policy optimization algorithms, 2017.

E. Todorov, T. Erez, and Y. Tassa, MuJoCo: A physics engine for model-based control, Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pp.5026-5033, 2012.

M. Lapeyre, P. Rouanet, J. Grizou, S. Nguyen, F. Depraetre et al., Poppy project: open-source fabrication of 3D printed humanoid robot for science, education and art, Digital Intelligence, p.6, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01096338