Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352
Theory of the hypervolume indicator, Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, FOGA '09, pp.87-102, 2009.
DOI : 10.1145/1527125.1527138
URL : https://hal.archives-ouvertes.fr/inria-00430540
Learning all optimal policies with multiple criteria, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.41-47, 2008.
DOI : 10.1145/1390156.1390162
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.2715
Consistency Modifications for Automatically Tuned Monte-Carlo Tree Search, LION4, pp.111-124, 2010.
DOI : 10.1007/978-3-642-13800-3_9
URL : https://hal.archives-ouvertes.fr/inria-00437146
SMS-EMOA: Multiobjective selection based on dominated hypervolume, European Journal of Operational Research, vol.181, issue.3, pp.1653-1669, 2007.
DOI : 10.1016/j.ejor.2006.08.008
On the Complexity of Computing the Hypervolume Indicator, IEEE Transactions on Evolutionary Computation, vol.13, issue.5, pp.1075-1082, 2009.
DOI : 10.1109/TEVC.2009.2015575
Combining expert, offline, transient and online knowledge in Monte-Carlo exploration, 2008.
Markov Decision Processes with Multiple Long-Run Average Objectives, FSTTCS Foundations of Software Technology and Theoretical Computer Science, vol.4855, pp.473-484, 2007.
DOI : 10.1007/978-3-540-77050-3_39
URL : http://arxiv.org/abs/1104.3489
Monte-Carlo Tree Search techniques in the game of Kriegspiel, IJCAI'09, pp.474-479, 2009.
Bandit algorithms for tree search, arXiv preprint cs/0703062, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00150207
Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, Proc. Computers and Games, pp.72-83, 2006.
DOI : 10.1007/978-3-540-75538-8_7
URL : https://hal.archives-ouvertes.fr/inria-00116992
Multi-objective optimization using evolutionary algorithms, pp.55-58, 2001.
A fast elitist non-dominated sorting genetic algorithm for multiobjective optimization: NSGA-II, PPSN VI, LNCS vol.1917, pp.849-858, 2000.
Scalable multi-objective optimization test problems, Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600), pp.825-830, 2002.
DOI : 10.1109/CEC.2002.1007032
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.7531
The Measure of Pareto Optima Applications to Multi-objective Metaheuristics, EMO'03, pp.519-533, 2003.
DOI : 10.1007/3-540-36970-8_37
Multi-criteria reinforcement learning, ICML'98, pp.197-205, 1998.
Combining online and offline knowledge in UCT, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.273-280, 2007.
DOI : 10.1145/1273496.1273531
URL : https://hal.archives-ouvertes.fr/inria-00164003
The CMA evolution strategy: a comparing review, Towards a new evolutionary computation, pp.75-102, 2006.
Bandit Based Monte-Carlo Planning, pp.282-293, 2006.
DOI : 10.1007/11871842_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296
Linear fitted-Q iteration with multiple reward functions, Journal of Machine Learning Research, vol.13, pp.3253-3295, 2012.
Automatic Discovery of Ranking Formulas for Playing with Multi-armed Bandits, Recent Advances in Reinforcement Learning - 9th European Workshop, pp.5-17, 2011.
DOI : 10.1007/978-3-642-29946-9_5
A geometric approach to multi-criterion reinforcement learning, Journal of Machine Learning Research, pp.325-360, 2004.
Monte-Carlo exploration for deterministic planning, IJCAI'09, pp.1766-1771, 2009.
Dynamic preferences in multi-criteria reinforcement learning, Proceedings of the 22nd international conference on Machine learning, ICML '05, 2005.
DOI : 10.1145/1102351.1102427
On the approximability of trade-offs and optimal access of Web sources, Proceedings 41st Annual Symposium on Foundations of Computer Science, pp.86-92, 2000.
DOI : 10.1109/SFCS.2000.892068
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192
Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol.4, issue.1, 2010.
DOI : 10.2200/S00268ED1V01Y201005AIM009
Managing power consumption and performance of computing systems using reinforcement learning, NIPS'07, pp.1-8, 2007.
NP-complete scheduling problems, Journal of Computer and System Sciences, vol.10, issue.3, pp.384-393, 1975.
DOI : 10.1016/S0022-0000(75)80008-0
URL : http://doi.org/10.1016/s0022-0000(75)80008-0
Empirical evaluation methods for multiobjective reinforcement learning algorithms, Machine Learning, vol.84, issue.1-2, pp.51-80, 2011.
DOI : 10.1007/s10994-010-5232-5
Multiobjective evolutionary algorithms: classifications, analyses, and new innovations, 1999.
Multi-objective Monte-Carlo Tree Search, Asian Conference on Machine Learning, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00758379
Modifications of UCT and sequence-like simulations for Monte-Carlo Go, 2007 IEEE Symposium on Computational Intelligence and Games, pp.175-182, 2007.
DOI : 10.1109/CIG.2007.368095
Algorithms for infinitely many-armed bandits, NIPS'08, pp.1-8, 2008.
Workflow Scheduling Algorithms for Grid Computing, Studies in Computational Intelligence, vol.146, pp.173-214, 2008.
DOI : 10.1007/978-3-540-69277-5_7
Multiobjective optimization using evolutionary algorithms - A comparative case study, PPSN V, pp.292-301, 1998.
DOI : 10.1007/BFb0056872