V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.
DOI : 10.1038/nature14236

S. Levine, C. Finn, T. Darrell, and P. Abbeel, End-to-end Training of Deep Visuomotor Policies, J. Mach. Learn. Res, vol.17, issue.1, pp.1334-1373, 2016.

V. Kumar, E. Todorov, and S. Levine, Optimal control with learned local models: Application to dexterous manipulation, 2016 IEEE International Conference on Robotics and Automation (ICRA), pp.378-383, 2016.
DOI : 10.1109/ICRA.2016.7487156

C. Finn, X. Y. Tan, Y. Duan, T. Darrell, S. Levine et al., Deep spatial autoencoders for visuomotor learning, 2016 IEEE International Conference on Robotics and Automation (ICRA), pp.512-519, 2016.
DOI : 10.1109/ICRA.2016.7487173

URL : http://arxiv.org/pdf/1509.06113

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, CoRR, vol.abs/1509.02971, 2015.

N. Heess, G. Wayne, D. Silver, T. P. Lillicrap, T. Erez et al., Learning Continuous Control Policies by Stochastic Value Gradients, NIPS, pp.2944-2952, 2015.

S. Gu, T. P. Lillicrap, I. Sutskever, and S. Levine, Continuous Deep Q-Learning with Model-based Acceleration, ICML, ser. JMLR Workshop and Conference Proceedings, pp.2829-2838, 2016.

C. Finn and S. Levine, Deep visual foresight for planning robot motion, 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017.
DOI : 10.1109/ICRA.2017.7989324

URL : http://arxiv.org/pdf/1610.00696

S. Hawkins, H. He, G. J. Williams, and R. A. Baxter, Outlier Detection Using Replicator Neural Networks, DaWaK, ser. Lecture Notes in Computer Science, pp.170-180, 2002.
DOI : 10.1007/3-540-46145-0_17

URL : http://www.act.cmis.csiro.au/rohanb/PAPERS/dawak02.pdf

B. Schölkopf, R. C. Williamson, A. J. Smola, J. Shawe-Taylor, and J. C. Platt, Support Vector Method for Novelty Detection, NIPS, pp.582-588, 1999.

A. Moreno, J. D. Martin, E. Soria, R. Magdalena, and M. Martinez, Noisy reinforcements in reinforcement learning: Some case studies based on gridworlds, WSEAS, pp.296-300, 2006.

R. Fox, A. Pakman, and N. Tishby, Taming the Noise in Reinforcement Learning via Soft Updates, UAI, 2016.

R. E. Bellman, Adaptive Control Processes: A Guided Tour, 1961.
DOI : 10.1515/9781400874668

S. Lange and M. Riedmiller, Deep auto-encoder neural networks in reinforcement learning, The 2010 International Joint Conference on Neural Networks (IJCNN), pp.1-8, 2010.
DOI : 10.1109/IJCNN.2010.5596468

URL : http://ml.informatik.uni-freiburg.de/_media/publications/langeijcnn2010.pdf

D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra et al., Deterministic Policy Gradient Algorithms, ICML, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00938992

M. J. Hausknecht and P. Stone, Deep Reinforcement Learning in Parameterized Action Space, CoRR, vol.abs/1511.04143, 2015.

D. P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, ICLR, 2015.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional Architecture for Fast Feature Embedding, Proceedings of the ACM International Conference on Multimedia, MM '14, 2014.
DOI : 10.1145/2647868.2654889

S. Ioffe and C. Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, ICML, pp.448-456, 2015.