REGAL: A regularization based algorithm for reinforcement learning in weakly-communicating MDPs, UAI 2009, Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pp.35-42, 2009. ,
R-max -a general polynomial time algorithm for near-optimal reinforcement learning, Journal of Machine Learning Research, vol.3, pp.213-231, 2003. ,
Feature Reinforcement Learning: Part I. Unstructured MDPs, Journal of Artificial General Intelligence, vol.1, issue.1, pp.3-24, 2009. ,
DOI : 10.2478/v10229-011-0002-8
Near-optimal regret bounds for reinforcement learning, Journal of Machine Learning Research, vol.99, pp.1563-1600, 2010. ,
Near-optimal reinforcement learning in polynomial time, Machine Learning, pp.209-232, 2002. ,
Selecting the state-representation in reinforcement learning, Advances in Neural Information Processing Systems, pp.2627-2635, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00639483
Reinforcement Learning with Selective Perception and Hidden State, 1996. ,
Online regret bounds for undiscounted continuous reinforcement learning, Advances in Neural Information Processing Systems, pp.1772-1780, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00765441
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994. ,
DOI : 10.1002/9780470316887
On the possibility of learning in reactive environments with arbitrary dependence, Theoretical Computer Science, vol.405, issue.3, pp.274-284, 2008. ,
DOI : 10.1016/j.tcs.2008.06.039
URL : https://hal.archives-ouvertes.fr/hal-00639569
Predictive state representations: A new theory for modeling dynamical systems, UAI '04, Proceedings of the 20th Conference in Uncertainty in Artificial Intelligence, pp.512-518, 2004. ,
PAC model-free reinforcement learning, Proceedings of the 23rd international conference on Machine learning , ICML '06, pp.881-888, 2006. ,
DOI : 10.1145/1143844.1143955
A Monte-Carlo AIXI approximation, Journal of Artificial Intelligence Research, vol.40, issue.1, pp.95-142, 2011. ,
Probabilistic finite-state machines - part I, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.27, issue.7, pp.1013-1025, 2005. ,
DOI : 10.1109/TPAMI.2005.147
URL : https://hal.archives-ouvertes.fr/ujm-00326243