E. C. Strinati, S. Barbarossa, J. L. Gonzalez-Jimenez, D. Kténas, N. Cassiau et al., 6G: The next frontier, 2019.

T. Chen, S. Barbarossa, X. Wang, G. B. Giannakis, and Z. Zhang, Learning and management for Internet-of-Things: Accounting for adaptivity and scalability, 2018.

G. J. Foschini and Z. Miljanic, A simple distributed autonomous power control algorithm and its convergence, IEEE Trans. Veh. Technol, vol.42, issue.4, pp.641-646, 1993.

W. Yu, W. Rhee, S. Boyd, and J. M. Cioffi, Iterative water-filling for Gaussian vector multiple-access channels, IEEE Trans. Inf. Theory, vol.50, issue.1, pp.145-152, 2004.

G. Scutari, D. P. Palomar, and S. Barbarossa, The MIMO iterative waterfilling algorithm, IEEE Trans. Signal Process, vol.57, issue.5, pp.1917-1935, 2009.

C. Isheden, Z. Chong, E. Jorswieck, and G. Fettweis, Framework for link-level energy efficiency optimization with informed transmitter, IEEE Trans. Wireless Commun, vol.11, issue.8, pp.2946-2957, 2012.

P. Mertikopoulos, E. V. Belmega, R. Negrel, and L. Sanguinetti, Distributed stochastic optimization via matrix exponential learning, IEEE Trans. Signal Process, vol.65, issue.9, pp.2277-2290, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01382285

W. Li, M. Assaad, and P. Duhamel, Distributed stochastic optimization in networks with low informational exchange, 55th Annual Allerton Conf. on Commun., Control, and Computing, pp.1160-1167, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01578376

P. Mertikopoulos and E. V. Belmega, Transmit without regrets: Online optimization in MIMO-OFDM cognitive radio systems, IEEE J. Sel. Areas Commun, vol.32, issue.11, pp.1987-1999, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01073500

P. Mertikopoulos and E. V. Belmega, Learning to be green: Robust energy efficiency maximization in dynamic MIMO-OFDM systems, IEEE J. Sel. Areas Commun, vol.34, issue.4, pp.743-757, 2016.

A. Marcastel, E. V. Belmega, P. Mertikopoulos, and I. Fijalkow, Online power allocation for opportunistic radio access in dynamic OFDM networks, IEEE 84th Veh. Technol. Conf. (VTC-Fall), 2016.
URL : https://hal.archives-ouvertes.fr/hal-01387044

A. Marcastel, E. V. Belmega, P. Mertikopoulos, and I. Fijalkow, Online interference mitigation via learning in dynamic IoT environments, IEEE Globecom IoE Workshop, 2016.

S. Shalev-Shwartz, Online learning and online convex optimization, Foundations and Trends in Machine Learning, vol.4, pp.107-194, 2011.

S. Bubeck and N. Cesa-Bianchi, Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Foundations and Trends in Machine Learning, vol.5, pp.1-122, 2012.

T. Alpcan, T. Başar, R. Srikant, and E. Altman, CDMA uplink power control as a noncooperative game, Wireless Networks, vol.8, issue.6, pp.659-670, 2002.

R. Masmoudi, E. V. Belmega, I. Fijalkow, and N. Sellami, A unifying view on energy-efficiency metrics in cognitive radio channels, European Signal Process. Conf. (EUSIPCO), pp.171-175, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01104190

J. C. Spall, Multivariate stochastic approximation using a simultaneous perturbation gradient approximation, IEEE Trans. Autom. Control, vol.37, issue.3, pp.332-341, 1992.

A. D. Flaxman, A. T. Kalai, and H. B. McMahan, Online convex optimization in the bandit setting: gradient descent without a gradient, SODA'05: 16th ACM-SIAM Symp. on Discrete Algorithms, pp.385-394, 2005.

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, Intl. Conf. on Machine Learning (ICML-03), pp.928-936, 2003.

G. F. Pedersen, COST 231-Digital mobile radio towards future generation systems, EU, 1999.

G. Calcev, A wideband spatial channel model for system-wide simulations, IEEE Trans. Veh. Technol, vol.56, issue.2, pp.389-403, 2007.