M. Gheshlaghi Azar, R. Munos, M. Ghavamzadeh, and H. J. Kappen, Speedy Q-learning, Advances in Neural Information Processing Systems 24, pp.2411-2419, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00830140

M. Gheshlaghi Azar, R. Munos, M. Ghavamzadeh, and H. J. Kappen, Reinforcement learning with a near optimal rate of convergence.
URL : https://hal.archives-ouvertes.fr/inria-00636615

P. L. Bartlett and A. Tewari, REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs, Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, 2009.

D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, 1996.

L. Buşoniu, R. Babuška, B. De Schutter, and D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, 2006.
DOI : 10.1017/CBO9780511546921

T. Jaksch, R. Ortner, and P. Auer, Near-optimal regret bounds for reinforcement learning, Journal of Machine Learning Research, vol.11, pp.1563-1600, 2010.

S. M. Kakade, On the Sample Complexity of Reinforcement Learning, 2004.

M. Kearns and S. Singh, Finite-sample convergence rates for Q-learning and indirect algorithms, Advances in Neural Information Processing Systems 12, pp.996-1002, 1999.

S. Mannor and J. N. Tsitsiklis, The sample complexity of exploration in the multi-armed bandit problem, Journal of Machine Learning Research, vol.5, pp.623-648, 2004.

R. Munos and A. Moore, Influence and variance of a Markov chain: application to adaptive discretization in optimal control, Proceedings of the 38th IEEE Conference on Decision and Control, 1999.
DOI : 10.1109/CDC.1999.830188

M. J. Sobel, The variance of discounted Markov decision processes, Journal of Applied Probability, vol.19, issue.4, pp.794-802, 1982.
DOI : 10.2307/1913656

A. L. Strehl, L. Li, and M. L. Littman, Reinforcement learning in finite MDPs: PAC analysis, Journal of Machine Learning Research, vol.10, pp.2413-2444, 2009.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
DOI : 10.1109/TNN.1998.712192

C. Szepesvári, Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol.4, issue.1, 2010.
DOI : 10.2200/S00268ED1V01Y201005AIM009

I. Szita and C. Szepesvári, Model-based reinforcement learning with nearly tight exploration complexity bounds, Proceedings of the 27th International Conference on Machine Learning, pp.1031-1038, 2010.