J. Abernethy, C. Lee, A. Sinha, and A. Tewari, Online linear optimization via smoothing, Proceedings of The 27th Conference on Learning Theory (COLT), pp.807-823, 2014.

C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák, Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, Proceedings of the 17th International Conference on Algorithmic Learning Theory (ALT), pp.229-243, 2006.
DOI: 10.1007/11894841_20

J. Audibert and S. Bubeck, Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.11, pp.2635-2686, 2010.

J. Audibert, S. Bubeck, and G. Lugosi, Regret in Online Combinatorial Optimization, Mathematics of Operations Research, vol.39, issue.1, pp.31-45, 2014.
DOI: 10.1287/moor.2013.0598

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI: 10.1137/S0097539701398375

B. Awerbuch and R. D. Kleinberg, Adaptive routing with end-to-end feedback, Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pp.45-53, 2004.
DOI: 10.1145/1007352.1007367

A. Beygelzimer, J. Langford, L. Li, L. Reyzin, and R. E. Schapire, Contextual bandit algorithms with supervised learning guarantees, Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), pp.19-26, 2011.

S. Bubeck, N. Cesa-Bianchi, and S. M. Kakade, Towards minimax policies for online linear optimization with bandit feedback, Proceedings of The 25th Conference on Learning Theory (COLT), pp.1-14, 2012.

S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, 2012.
DOI: 10.1561/2200000024

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, 2006.
DOI: 10.1017/CBO9780511546921

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, pp.1404-1422, 2012.
DOI: 10.1016/j.jcss.2012.01.001

V. Dani, T. Hayes, and S. Kakade, The price of bandit information for online optimization, Advances in Neural Information Processing Systems (NIPS), pp.345-352, 2008.

L. Devroye, G. Lugosi, and G. Neu, Prediction by random-walk perturbation, Proceedings of the 26th Conference on Learning Theory (COLT), pp.460-473, 2013.

Y. Freund and R. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, pp.119-139, 1997.
DOI: 10.1006/jcss.1997.1504

A. György, T. Linder, G. Lugosi, and G. Ottucsák, The on-line shortest path problem under partial monitoring, Journal of Machine Learning Research, vol.8, pp.2369-2403, 2007.

J. Hannan, Approximation to Bayes risk in repeated play, Contributions to the Theory of Games, vol.3, pp.97-139, 1957.

A. Kalai and S. Vempala, Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005.
DOI: 10.1016/j.jcss.2004.10.016

W. Koolen, M. Warmuth, and J. Kivinen, Hedging structured concepts, Proceedings of the 23rd Conference on Learning Theory (COLT), pp.93-105, 2010.

N. Littlestone and M. Warmuth, The Weighted Majority Algorithm, Information and Computation, vol.108, issue.2, pp.212-261, 1994.
DOI: 10.1006/inco.1994.1009

H. B. McMahan and A. Blum, Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary, Proceedings of the 17th Conference on Learning Theory (COLT), pp.109-123, 2004.
DOI: 10.1007/978-3-540-27819-1_8

G. Neu and G. Bartók, An Efficient Algorithm for Learning with Semi-bandit Feedback, Proceedings of the 24th International Conference on Algorithmic Learning Theory (ALT), pp.234-248, 2013.
DOI: 10.1007/978-3-642-40935-6_17

J. Poland, FPL Analysis for Adaptive Bandits, 3rd Symposium on Stochastic Algorithms, Foundations and Applications (SAGA), pp.58-69, 2005.
DOI: 10.1007/11571155_7

A. Rakhlin, O. Shamir, and K. Sridharan, Relax and randomize: From value to algorithms, Advances in Neural Information Processing Systems (NIPS), pp.2150-2158, 2012.

D. Suehiro, K. Hatano, S. Kijima, E. Takimoto, and K. Nagano, Online Prediction under Submodular Constraints, Proceedings of the 23rd International Conference on Algorithmic Learning Theory (ALT), pp.260-274, 2012.
DOI: 10.1007/978-3-642-34106-9_22

E. Takimoto and M. Warmuth, Path Kernels and Multiplicative Updates, Journal of Machine Learning Research, vol.4, pp.773-818, 2003.

T. Uchiya, A. Nakamura, and M. Kudo, Algorithms for Adversarial Bandit Problems with Multiple Plays, Proceedings of the 21st International Conference on Algorithmic Learning Theory (ALT), pp.375-389, 2010.
DOI: 10.1007/978-3-642-16108-7_30

T. van Erven, M. Warmuth, and W. Kotłowski, Follow the leader with dropout perturbations, Proceedings of The 27th Conference on Learning Theory (COLT), pp.949-974, 2014.

V. Vovk, Aggregating strategies, Proceedings of the 3rd Annual Workshop on Computational Learning Theory (COLT), pp.371-386, 1990.
DOI: 10.1016/B978-1-55860-146-8.50032-1