N. Modi, Machine learning and statistical decision making for green radio, 2017.
URL : https://hal.archives-ouvertes.fr/tel-01668536

M. A. Marsan, L. Chiaraviglio, D. Ciullo, and M. Meo, Optimal energy savings in cellular access networks, IEEE International Conference on Communications Workshops (ICCW), pp.1-5, 2009.

G. P. Fettweis and E. Zimmermann, ICT energy consumption - trends and challenges, The 11th International Symposium on Wireless Personal Multimedia Communications (WPMC), 2008.

K. Son, H. Kim, Y. Yi, and B. Krishnamachari, Base station operation and user association mechanisms for energy-delay tradeoffs in green cellular networks, IEEE Journal on Selected Areas in Communications, vol.29, issue.8, pp.1525-1536, 2011.

C. Peng, S. Lee, S. Lu, H. Luo, and H. Li, Traffic-driven power saving in operational 3G cellular networks, The 17th Annual International Conference on Mobile Computing and Networking (MobiCom), pp.121-132, 2011.

H. Karl, An overview of energy-efficiency techniques for mobile communication systems, 2003.

E. Oh, B. Krishnamachari, X. Liu, and Z. Niu, Toward dynamic energy-efficient operation of cellular network infrastructure, IEEE Communications Magazine, vol.49, issue.6, pp.56-61, 2011.

N. Modi, P. Mary, and C. Moy, QoS driven channel selection algorithm for opportunistic spectrum access, IEEE Globecom Workshop on Advances in Software Defined Radio Access Networks and Context-aware Cognitive Networks (SDRANCAN), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01251221

C. Robert, C. Moy, and C. Wang, Reinforcement learning approaches and evaluation criteria for opportunistic spectrum access, IEEE International Conference on Communications (ICC), pp.1508-1513, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00994933

N. Modi, P. Mary, and C. Moy, QoS driven Channel Selection Algorithm for Cognitive Radio Network: Multi-User Multi-armed Bandit Approach, IEEE Transactions on Cognitive Communications and Networking, vol.3, issue.1, pp.49-66, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01492886

M. E. Taylor and P. Stone, Transfer learning for reinforcement learning domains: A survey, J. Mach. Learn. Res., vol.10, pp.1633-1685, 2009.

R. Li, Z. Zhao, X. Chen, J. Palicot, and H. Zhang, TACT: A transfer actor-critic learning framework for energy saving in cellular radio access networks, IEEE Transactions on Wireless Communications, vol.13, issue.4, pp.2000-2011, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01073320

Z. Niu, TANGO: traffic-aware network planning and green operation, IEEE Wireless Communications, vol.18, issue.5, pp.25-29, 2011.

L. Chiaraviglio, D. Ciullo, M. Meo, and M. Ajmone Marsan, Energy-aware UMTS access networks, The 11th International Symposium on Wireless Personal Multimedia Communications (WPMC), pp.8-11, 2008.

Z. Niu, Y. Wu, J. Gong, and Z. Yang, Cell zooming for cost-efficient green cellular networks, IEEE Communications Magazine, vol.48, issue.11, pp.74-79, 2010.

R. Li, Z. Zhao, Y. Wei, X. Zhou, and H. Zhang, GM-PAB: A grid-based energy saving scheme with predicted traffic load guidance for cellular networks, IEEE International Conference on Communications (ICC), pp.1160-1164, 2012.

J. Gong, S. Zhou, and Z. Niu, A Dynamic Programming Approach for Base Station Sleeping in Cellular Networks, IEICE Transactions on Communications, vol.95, pp.551-562, 2012.

M. A. Marsan and M. Meo, Energy efficient management of two cellular access networks, SIGMETRICS Perform. Eval. Rev., vol.37, issue.4, pp.69-73, 2010.

A. S. Alam, L. S. Dooley, and A. S. Poulton, Traffic-and-interference aware base station switching for green cellular networks, 2013 IEEE 18th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), pp.63-67, 2013.

E. Oh and B. Krishnamachari, Energy savings through dynamic base station switching in cellular wireless access networks, IEEE Global Telecommunications Conference (GLOBECOM), pp.1-5, 2010.

R. M. Karp, Reducibility among Combinatorial Problems, Complexity of Computer Computations, pp.85-103, 1972.

M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, 1979.

F. Han, Z. Safar, and K. J. Liu, Energy-efficient base-station cooperative operation with guaranteed QoS, IEEE Transactions on Communications, vol.61, issue.8, pp.3505-3517, 2013.

Y. S. Soh, T. Q. Quek, M. Kountouris, and H. Shin, Energy efficient heterogeneous cellular networks, IEEE Journal on Selected Areas in Communications, vol.31, issue.5, pp.840-850, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00827440

J. Kim, H. W. Lee, and S. Chong, TAES: Traffic-aware energy-saving base station sleeping and clustering in cooperative networks, 13th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), pp.259-266, 2015.

V. Konda and V. Borkar, Actor-critic-type learning algorithms for Markov decision processes, SIAM J. Contr. Optim., vol.38, issue.1, pp.94-123, 1999.

W. Wong, Y. Yu, and A. Pang, Decentralized energy-efficient base station operation for green cellular networks, IEEE Global Communications Conference (GLOBECOM), pp.5194-5200, 2012.

E. Oh, K. Son, and B. Krishnamachari, Dynamic base station switching-on/off strategies for green cellular networks, IEEE Transactions on Wireless Communications, vol.12, issue.5, pp.2126-2136, 2013.

S. Zhou, J. Gong, Z. Yang, Z. Niu, and P. Yang, Green mobile access network with dynamic base station energy saving, ACM MobiCom, vol.9, issue.262, pp.10-12, 2009.

W. Guo and T. O'Farrell, Dynamic cell expansion with self-organizing cooperation, IEEE Journal on Selected Areas in Communications, vol.31, issue.5, pp.851-860, 2013.

C. Tekin and M. Liu, Online learning of rested and restless bandits, IEEE Transactions on Information Theory, vol.58, issue.8, pp.5588-5611, 2012.

J. Oksanen, V. Koivunen, and H. V. Poor, A sensing policy based on confidence bounds and a restless multi-armed bandit model, 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp.318-323, 2012.

J. Oksanen and V. Koivunen, An order optimal policy for exploiting idle spectrum in cognitive radio networks, IEEE Transactions on Signal Processing, vol.63, issue.5, pp.1214-1227, 2015.

W. Zhang, Performance of real-time and data traffic in heterogeneous overlay wireless networks, Proceedings of the 19th International Teletraffic Congress, 2005.

M. F. Hossain, K. S. Munasinghe, and A. Jamalipour, Distributed inter-BS cooperation aided energy efficient load balancing for cellular networks, IEEE Transactions on Wireless Communications, vol.12, issue.11, pp.5929-5939, 2013.

H. Kim, G. de Veciana, X. Yang, and M. Venkatachalam, Distributed α-optimal user association and cell load balancing in wireless networks, IEEE/ACM Transactions on Networking, vol.20, issue.1, pp.177-190, 2012.

K. Son, S. Chong, and G. de Veciana, Dynamic association for load balancing and interference avoidance in multi-cell networks, IEEE Transactions on Wireless Communications, vol.8, issue.7, pp.3566-3576, 2009.

A. J. Fehske, F. Richter, and G. P. Fettweis, Energy efficiency improvements through micro sites in cellular mobile radio networks, IEEE Globecom Workshops, pp.1-5, 2009.

A. Alam and L. Dooley, A scalable multimode base station switching model for green cellular networks, IEEE Wireless Communications and Networking Conference, 2015.

C. Tekin and M. Liu, Online algorithms for the multi-armed bandit problem with Markovian rewards, The 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp.1675-1682, 2010.

C. Tekin and M. Liu, Online learning in opportunistic spectrum access: A restless bandit approach, IEEE INFOCOM, pp.2462-2470, 2011.

C. Wang, S. R. Kulkarni, and H. V. Poor, Bandit problems with side observations, IEEE Transactions on Automatic Control, vol.50, issue.3, pp.338-355, 2005.

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, 2006.

R. Li, Z. Zhao, X. Chen, and H. Zhang, Energy saving through a learning framework in greener cellular radio access networks, IEEE Global Communications Conference (GLOBECOM), pp.1556-1561, 2012.

P. Lezaud, Chernoff-type bound for finite Markov chains, Annals of Applied Probability, vol.8, pp.849-867, 1998.
URL : https://hal.archives-ouvertes.fr/hal-00940907

V. Anantharam, P. Varaiya, and J. Walrand, Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays, Part II: Markovian rewards, IEEE Transactions on Automatic Control, vol.32, issue.11, pp.977-982, 1987.