Further optimal regret bounds for thompson sampling, Proceedings of the 16th Conference on Artificial Intelligence and Statistics, 2013. ,
Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends R in Machine Learning, vol.5, pp.1-122, 2012. ,
Kullback-leibler upper confidence bounds for optimal sequential allocation, The Annals of Statistics, vol.41, issue.3, pp.1516-1541, 2013. ,
An empirical evaluation of thompson sampling, Advances in neural information processing systems, pp.2249-2257, 2011. ,
Click models for web search, Synthesis Lectures on Information Concepts, Retrieval, and Services, vol.7, issue.3, pp.1-115, 2015. ,
Unimodal bandits: Regret lower bounds and optimal algorithms, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01092662
Minimal exploration in structured stochastic bandits, Advances in Neural Information Processing Systems, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-02395029
, Bilinear bandits with low-rank structure, 2019.
Bernoulli rank-1 bandits for click feedback, IJCAI, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-02287914
Stochastic rank-1 bandits, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017. ,
Thompson Sampling : an Asymptotically Optimal Finite-Time Analysis, Proceedings of the 23rd conference on Algorithmic Learning Theory, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00830033