Finite-time analysis of the multi-armed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352
Concentration inequalities: A nonasymptotic theory of independence, 2013.
DOI : 10.1093/acprof:oso/9780199535255.001.0001
URL : https://hal.archives-ouvertes.fr/hal-00794821
Bandits Games and Clustering Foundations, 2010.
URL : https://hal.archives-ouvertes.fr/tel-00845565
Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Foundations and Trends in Machine Learning, pp.1-122, 2012.
The KL-UCB algorithm for bounded stochastic bandits and beyond, Annual Conference on Learning Theory (COLT), 2011.
Probability Inequalities for Sums of Bounded Random Variables, Journal of the American Statistical Association, vol.58, issue.301, pp.13-30, 1963.
DOI : 10.1214/aoms/1177730491
Contribution to learning and decision making under uncertainty for Cognitive Radio.
URL : https://hal.archives-ouvertes.fr/tel-00765437
Channel selection with Rayleigh fading: A multi-armed bandit framework, 2012 IEEE 13th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp.299-303, 2012.
DOI : 10.1109/SPAWC.2012.6292914
URL : https://hal.archives-ouvertes.fr/hal-00721010
Thompson sampling for 1-dimensional exponential family bandits, Advances in Neural Information Processing Systems, pp.1448-1456, 2013.
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.