Speedy q-learning, Advances in Neural Information Processing Systems 24, pp.2411-2419, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00830140
Reinforcement learning with a near optimal rate of convergence ,
URL : https://hal.archives-ouvertes.fr/inria-00636615
REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs, Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, 2009. ,
Neuro-Dynamic Programming, Athena Scientific, 1996. ,
Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010. ,
Prediction, Learning, and Games, 2006. ,
DOI : 10.1017/CBO9780511546921
Near-optimal regret bounds for reinforcement learning, Journal of Machine Learning Research, vol.11, pp.1563-1600, 2010. ,
On the Sample Complexity of Reinforcement Learning, 2004. ,
Finite-sample convergence rates for Q-learning and indirect algorithms, Advances in Neural Information Processing Systems 12, pp.996-1002, 1999. ,
The sample complexity of exploration in the multi-armed bandit problem, Journal of Machine Learning Research, vol.5, pp.623-648, 2004. ,
Influence and variance of a Markov chain: application to adaptive discretization in optimal control, Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304), 1999. ,
DOI : 10.1109/CDC.1999.830188
The variance of discounted Markov decision processes, Journal of Applied Probability, vol.7, issue.04, pp.794-802, 1982. ,
DOI : 10.2307/1913656
Reinforcement learning in finite MDPs: PAC analysis, Journal of Machine Learning Research, vol.10, pp.2413-2444, 2009. ,
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol.4, issue.1, 2010. ,
DOI : 10.2200/S00268ED1V01Y201005AIM009
Model-based reinforcement learning with nearly tight exploration complexity bounds, Proceedings of the 27th International Conference on Machine Learning, pp.1031-1038, 2010. ,