Optimizing debt collections using constrained reinforcement learning, Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2010.
Constrained policy optimization. CoRR, abs/1705.10528, 2017.
Constrained Markov Decision Processes, 1999. URL: https://hal.archives-ouvertes.fr/inria-00074109
Dynamic programming and Lagrange multipliers, Proceedings of the National Academy of Sciences of the United States of America, 1956.
Optimal policies for controlled Markov chains with a constraint, Journal of Mathematical Analysis and Applications, vol.112, issue.1, pp.236-252, 1985.
Budget allocation using weakly coupled, constrained Markov decision processes, Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), 2016.
Risk-constrained reinforcement learning with percentile risk criteria. CoRR, abs/1512.01629, 2015.
Tree-Based Batch Mode Reinforcement Learning, Journal of Machine Learning Research, 2005.
A Comprehensive Survey on Safe Reinforcement Learning, Journal of Machine Learning Research, 2015.
Risk-sensitive reinforcement learning applied to control under constraints, Journal of Artificial Intelligence Research, vol.24, pp.81-108, 2005.
An efficient algorithm for determining the convex hull of a finite planar set, Inf. Process. Lett., 1972.
SLSQP, a nonlinear programming method with quadratic programming subproblems, DLR, 1989.
Safe Policy Improvement with Baseline Bootstrapping, 2017.
Safe policy improvement by minimizing robust baseline regret, Advances in Neural Information Processing Systems (NIPS), 2016.