Journal articles

On the Whittle index of Markov Modulated Restless Bandits

Abstract : In this paper we study a Multi-Armed Restless Bandit Problem (MARBP) subject to time fluctuations. This model has numerous practical applications, for example in cloud computing systems and in wireless communication networks. Each bandit is formed by two processes: a controllable process and an environment. The transition rates of the controllable process are determined by the state of the environment, which is an exogenous Markov process. The decision maker has full information on the state of every bandit, and the objective is to determine the optimal policy that minimises the long-run average cost. Given the complexity of the problem, we set out to characterise the Whittle index, which is obtained by solving a relaxed version of the MARBP. As reported in the literature, this heuristic performs extremely well for a wide variety of problems. Assuming that the optimal policy of the relaxed problem is of threshold type, we provide an algorithm that finds Whittle's index. We then consider a multi-class queue with linear cost and impatient customers. For this model, we show threshold optimality, prove indexability, and obtain Whittle's index in closed form. We also study the limiting regimes in which the environment evolves relatively slower or relatively faster than the controllable process. By numerical simulations, we assess the suboptimality of Whittle's index policy in a wide variety of scenarios; the general observation is that, as in the case of standard MARBP, the suboptimality gap of Whittle's index policy is small.

https://hal.archives-ouvertes.fr/hal-03579521
Contributor : Urtzi Ayesta
Submitted on : Friday, February 18, 2022 - 10:16:12 AM
Last modification on : Tuesday, July 5, 2022 - 11:22:33 AM
Long-term archiving on : Thursday, May 19, 2022 - 6:30:55 PM

File

2022-Whittle-Modulation_Abando...
Files produced by the author(s)

Citation

Santiago Guillermo Duran, Urtzi Ayesta, Ina Maria Maaike Verloop. On the Whittle index of Markov Modulated Restless Bandits. Queueing Systems, Springer Verlag, 2022, pp.1-55. ⟨10.1007/s11134-022-09737-y⟩. ⟨hal-03579521⟩
