R. P. Adams and D. J. C. MacKay, Bayesian online changepoint detection, arXiv preprint, 2007.

R. Allesiardo and R. Féraud, EXP3 with drift detection for the switching bandit problem, 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp.1-7, 2015.
DOI : 10.1109/DSAA.2015.7344834

R. Allesiardo, R. Féraud, and O. Maillard, The non-stationary stochastic multi-armed bandit problem, International Journal of Data Science and Analytics, vol.25, issue.1, pp.1-17, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01575000

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2003.
DOI : 10.1137/S0097539701398375

O. Besbes, Y. Gur, and A. Zeevi, Stochastic multi-armed-bandit problem with nonstationary rewards, Proceedings of the 27th International Conference on Neural Information Processing Systems, NIPS'14, pp.199-207, 2014.

O. Cappé, A. Garivier, O.-A. Maillard, R. Munos, and G. Stoltz, Kullback-Leibler upper confidence bounds for optimal sequential allocation, The Annals of Statistics, pp.1516-1541, 2013.

O. Chapelle and L. Li, An empirical evaluation of Thompson sampling, NIPS, pp.2249-2257, 2011.

A. Garivier and E. Moulines, On Upper-Confidence Bound Policies for Switching Bandit Problems, International Conference on Algorithmic Learning Theory, pp.174-188, 2011.

C. Hartland, S. Gelly, N. Baskiotis, O. Teytaud, and M. Sébag, Multiarmed bandit, dynamic environments and meta-bandits, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00113668

E. Kaufmann, N. Korda, and R. Munos, Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, International Conference on Algorithmic Learning Theory, pp.199-213, 2012.
DOI : 10.1007/978-3-642-34106-9_18
URL : https://hal.archives-ouvertes.fr/hal-00830033

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.

J. Mellor and J. Shapiro, Thompson sampling in switching environments with Bayesian online change point detection, CoRR, abs/1302, 2013.

G. Neu, Explore no more: Improved high-probability regret bounds for non-stochastic bandits, Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS'15, pp.3168-3176, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01223501

W. R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, issue.3/4, pp.285-294, 1933.

R. Turner, Y. Saatci, and C. E. Rasmussen, Adaptive sequential Bayesian change point detection, Advances in Neural Information Processing Systems (NIPS): Temporal Segmentation Workshop, 2009.