Asymptotically optimal priority policies for indexable and non-indexable restless bandits

Ina Maria Maaike Verloop

Preprint, Working Paper. Year: 2014

Ina Maria Maaike Verloop

Role: Corresponding author
PersonId : 738383
IdHAL : maaike-verloop
IdRef : 188434208

Réseaux, Mobiles, Embarqués, Sans fil, Satellites

Centre National de la Recherche Scientifique

Abstract

We study the asymptotically optimal control of multi-class restless bandits. A restless bandit is a controllable stochastic process whose state evolution depends on whether or not the bandit is made active. Since finding the optimal control is typically intractable, we propose a class of priority policies that are proved to be asymptotically optimal under a global attractor property and a technical condition. We consider both a fixed population of bandits and a dynamic population in which bandits can arrive and depart. As an example of a dynamic population of bandits, we analyze a multi-class M/M/S+M queue, for which we show asymptotic optimality of an index policy. We combine fluid-scaling techniques with linear programming results to prove that, when bandits are indexable, Whittle's index policy is included in our class of priority policies. We thereby generalize a result of Weber and Weiss (1990) on the asymptotic optimality of Whittle's index policy to settings with (i) several classes of bandits, (ii) arrivals of new bandits, and (iii) multiple actions. Indexability of the bandits is not required for our results to hold. For non-indexable bandits, we describe how to select priority policies from the class of asymptotically optimal policies and present numerical evidence that, outside the asymptotic regime, the performance of our proposed priority policies is nearly optimal.

Keywords

arm-acquiring bandits; non-indexable bandits; restless bandits; Whittle's index policy; asymptotic optimality

Domains

Optimization and Control [math.OC]

Main file

HAL_RBP (1).pdf (388.52 KB)

Origin: Files produced by the author(s)

Contributor: Ina Maria Verloop

https://hal.science/hal-00743781

Submitted on: Monday, 29 February 2016, 14:44:09

Last modified on: Monday, 20 November 2023, 11:44:23

Long-term archiving on: Monday, 30 May 2016, 15:20:16

Dates and versions

hal-00743781, version 1 (22-10-2012)

hal-00743781, version 2 (03-09-2013)

hal-00743781, version 3 (07-07-2014)

hal-00743781, version 4 (01-04-2015)

hal-00743781, version 5 (09-09-2015)

hal-00743781, version 6 (29-02-2016)

Identifiers

HAL Id: hal-00743781, version 6

Cite

Ina Maria Maaike Verloop. Asymptotically optimal priority policies for indexable and non-indexable restless bandits. 2014. ⟨hal-00743781v6⟩

Collections

UNIV-TLSE2 CNRS INSMI SMS UT1-CAPITOLE TDS-MACS IRIT IRIT-RMESS IRIT-ASR TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

516 views

895 downloads
