Bandits with concave rewards and convex knapsacks, Proceedings of the Fifteenth ACM Conference on Economics and Computation, EC '14, pp. 989-1006, 2014.
DOI : 10.1145/2600057.2602844
Analysis of Thompson sampling for the multi-armed bandit problem. arXiv preprint, 2011.
Further optimal regret bounds for Thompson sampling. arXiv preprint, 2012.
DOI : 10.1145/3088510
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays, Part I: IID rewards. IEEE Transactions on Automatic Control, vol. 32, issue 11, pp. 968-976, 1987.
DOI : 10.1109/tac.1987.1104491
Minimax policies for combinatorial prediction games. arXiv preprint, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00624463
Bandits with Knapsacks, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pp. 207-216, 2013.
DOI : 10.1109/FOCS.2013.30
Optimal Adaptive Policies for Sequential Allocation Problems, Advances in Applied Mathematics, vol. 17, issue 2, pp. 122-142, 1996.
DOI : 10.1006/aama.1996.0007
Kullback-Leibler upper confidence bounds for optimal sequential allocation, The Annals of Statistics, vol. 41, issue 3, pp. 1516-1541, 2013.
DOI : 10.1214/13-AOS1119SUPP
Combinatorial bandits, Journal of Computer and System Sciences, vol. 78, issue 5, pp. 1404-1422, 2012.
DOI : 10.1016/j.jcss.2012.01.001
Combinatorial multi-armed bandit: General framework and applications, Proceedings of the 30th International Conference on Machine Learning, pp. 151-159, 2013.
Learning to Rank: Regret Lower Bounds and Efficient Algorithms, Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 231-244, 2015.
DOI : 10.1145/2745844.2745852
Combinatorial Bandits Revisited, Advances in Neural Information Processing Systems, pp. 2107-2115, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01257796
Discrete-Variable Extremum Problems, Operations Research, vol. 5, issue 2, pp. 266-288, 1957.
DOI : 10.1287/opre.5.2.266
Explore First, Exploit Next: The True Shape of Regret in Bandit Problems, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01276324
Bandit processes and dynamic allocation indices, Journal of the Royal Statistical Society, Series B, vol. 41, issue 2, pp. 148-177, 1979.
DOI : 10.1002/9780470980033
Reducibility among combinatorial problems, 1972.
DOI : 10.1007/978-3-540-68279-0_8
On Bayesian upper confidence bounds for bandit problems, International Conference on Artificial Intelligence and Statistics, pp. 592-600, 2012.
Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, Algorithmic Learning Theory, pp. 199-213, 2012.
DOI : 10.1007/978-3-642-34106-9_18
URL : https://hal.archives-ouvertes.fr/hal-00830033
Optimal regret analysis of Thompson sampling in stochastic multi-armed bandit problem with multiple plays, 2015.
Thompson sampling for 1-dimensional exponential family bandits, Advances in Neural Information Processing Systems, pp. 1448-1456, 2013.
Matroid Bandits: Fast Combinatorial Optimization with Learning, Uncertainty in Artificial Intelligence (UAI), 2014.
Cascading Bandits: Learning to Rank in the Cascade Model, Proceedings of the 32nd International Conference on Machine Learning, pp. 767-776, 2015.
Combinatorial Cascading Bandits, Advances in Neural Information Processing Systems (NIPS), 2015.
Multiple-Play Bandits in the Position-Based Model, 2016.
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol. 6, issue 1, pp. 4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8
Infinitely Many-Armed Bandits with Budget Constraints, AAAI, pp. 2182-2188, 2017.
Asymptotically Optimal Algorithms for Multiple Play Bandits with Partial Feedback, 2016.
R: A language and environment for statistical computing, 2014.
Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol. 58, issue 5, pp. 527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol. 25, issue 3-4, pp. 285-294, 1933.
DOI : 10.1093/biomet/25.3-4.285
Knapsack Based Optimal Policies for Budget-Limited Multi-Armed Bandits, AAAI, 2012.
Efficient Learning in Large-Scale Combinatorial Semi-Bandits, International Conference on Machine Learning (ICML), 2015.
Thompson Sampling for Budgeted Multi-Armed Bandits, IJCAI, pp. 3960-3966, 2015.
Budgeted bandit problems with continuous random costs, Asian Conference on Machine Learning, pp. 317-332, 2016.
Budgeted Multi-Armed Bandits with Multiple Plays, IJCAI, pp. 2210-2216, 2016.