P. Abbeel and A. Ng, Apprenticeship learning via inverse reinforcement learning, Proceedings of the Twenty-First International Conference on Machine Learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015430

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.92

J. Bagnell, A. Ng, and J. Schneider, Solving uncertain Markov Decision Processes, Tech. rep., CMU, 2001.

J. Boger, J. Hoey, P. Poupart, C. Boutilier, G. Fernie et al., A Planning System Based on Markov Decision Processes to Guide People With Dementia Through Activities of Daily Living, IEEE Transactions on Information Technology in Biomedicine, vol.10, issue.2, 2006.
DOI : 10.1109/TITB.2006.864480

C. Boutilier, R. Das, J. O. Kephart, G. Tesauro, and W. E. Walsh, Cooperative Negotiation in Autonomic Systems Using Incremental Utility Elicitation, Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, pp.89-97, 2003.

G. Casella and E. I. George, Explaining the Gibbs sampler, The American Statistician, vol.46, pp.167-174, 1992.
DOI : 10.2307/2685208

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.323.3219

E. Delage and S. Mannor, Percentile optimization in uncertain Markov decision processes with application to efficient exploration, Proceedings of the 24th International Conference on Machine Learning, ICML '07, pp.225-232, 2007.
DOI : 10.1145/1273496.1273525

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.362.2796

R. Givan, S. Leach, and T. Dean, Bounded-parameter Markov decision processes, Artificial Intelligence, vol.122, issue.1-2, pp.71-109, 2000.
DOI : 10.1016/S0004-3702(00)00047-3

URL : http://doi.org/10.1016/s0004-3702(00)00047-3

B. Piot, M. Geist, and O. Pietquin, Boosted and Reward-regularized Classification for Apprenticeship Learning, 13th International Conference on Autonomous Agents and Multiagent Systems, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01107837

M. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
DOI : 10.1002/9780470316887

K. Regan and C. Boutilier, Regret-based Reward Elicitation for Markov Decision Processes, Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI '09, pp.444-451, 2009.

K. Regan and C. Boutilier, Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies, Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI '10, 2010.

K. Regan and C. Boutilier, Eliciting Additive Reward Functions for Markov Decision Processes, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Three, IJCAI '11, pp.2159-2164, 2011.

K. Regan and C. Boutilier, Robust Online Optimization of Reward-uncertain MDPs, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Three, IJCAI '11, pp.2165-2171, 2011.

S. Rosenthal and M. M. Veloso, Monte Carlo preference elicitation for learning additive reward functions, 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication, pp.886-891, 2012.
DOI : 10.1109/ROMAN.2012.6343863

A. Thomaz, G. Hoffman, and C. Breazeal, Real-Time Interactive Reinforcement Learning for Robots, AAAI Workshop on Human Comprehensible Machine Learning, pp.9-13, 2005.

P. Weng, Markov Decision Processes with Ordinal Rewards: Reference Point-Based Preferences, Proceedings of the 21st International Conference on Automated Planning and Scheduling, ICAPS 2011, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01285812

P. Weng, Ordinal Decision Models for Markov Decision Processes, ECAI 2012 - 20th European Conference on Artificial Intelligence, pp.828-833, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01273056

P. Weng and B. Zanuttini, Interactive Value Iteration for Markov Decision Processes with Unknown Rewards, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI '13, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00942290

D. J. White, Multi-objective infinite-horizon discounted Markov decision processes, Journal of Mathematical Analysis and Applications, vol.89, issue.2, 1982.
DOI : 10.1016/0022-247X(82)90122-6

URL : http://doi.org/10.1016/0022-247x(82)90122-6

H. Xu and S. Mannor, Parametric regret in uncertain Markov decision processes, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with the 2009 28th Chinese Control Conference, pp.3606-3613, 2009.
DOI : 10.1109/CDC.2009.5400796