Reinforcement Learning with a Near Optimal Rate of Convergence, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00636615
Speedy q-learning, Advances in Neural Information Processing Systems 24, pp.2411-2419, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00830140
REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs Dynamic Programming and Optimal Control Neuro-Dynamic Programming, Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence Bertsekas DP Prediction, Learning, and Games, 1996. ,
Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems, Journal of Machine Learning Research, vol.7, pp.1079-1105, 2006. ,
A guided tour of chernoff bounds, Information Processing Letters, vol.33, issue.6, pp.305-308, 1990. ,
DOI : 10.1016/0020-0190(90)90214-I
Near-optimal regret bounds for reinforcement learning, Journal of Machine Learning Research, vol.11, pp.1563-1600, 2010. ,
On the sample complexity of reinforcement learning Gatsby Computational Neuroscience Unit Kearns M, Singh S (1999) Finite-sample convergence rates for Q-learning and indirect algorithms, Advances in Neural Information Processing Systems, pp.996-1002, 2004. ,
PAC Bounds for Discounted MDPs, p.3890, 2012. ,
DOI : 10.1007/978-3-642-34106-9_26
The sample complexity of exploration in the multiarmed bandit problem, Journal of Machine Learning Research, vol.5, pp.623-648, 2004. ,
Influence and variance of a Markov chain : Application to adaptive discretizations in optimal control An upper bound on the loss from approximate optimalvalue functions, Proceedings of the 38th IEEE Conference on Decision and Control Singh SP, pp.227-233, 1994. ,
The variance of discounted Markov decision processes, Journal of Applied Probability, vol.7, issue.04, pp.794-802, 1982. ,
DOI : 10.2307/1913656
Reinforcement learning in finite MDPs: PAC analysis, Journal of Machine Learning Research, vol.10, pp.2413-2444, 2009. ,
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Algorithms for Reinforcement Learning Szita I, Szepesvári C (2010) Model-based reinforcement learning with nearly tight exploration complexity bounds, Proceedings of the 27th International Conference on Machine Learning, Omnipress, pp.1031-1038, 2010. ,
Reinforcement Learning: State-of-the-Art, pp.3-39, 2012. ,
DOI : 10.1007/978-3-642-27645-3