Faster Rates for Policy Learning

Alexander R Luedtke; Antoine Chambaz

Preprint, Working Paper. Year: 2017

Alexander R Luedtke

Role: Author

Fred Hutchinson Cancer Research Center [Seattle]

Antoine Chambaz

Role: Author
PersonId: 867345

Division of Biostatistics

Modélisation aléatoire de Paris X

Abstract

This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Four examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value; the number of possible actions is either two or finitely many; and the sampling scheme is either independent and identically distributed or sequential, where the latter represents a contextual bandit sampling scheme.

Keywords

individualized treatment rules; personalized medicine; policy learning; precision medicine

Subject areas

Statistics [math.ST]

Main file

fasterRatesForPolicyLearning.pdf (372.32 KB)

Origin: Files produced by the author(s)

Contributor: Antoine Chambaz

https://hal.science/hal-01511409

Submitted on: Thursday, April 20, 2017, 22:20:01

Last modified on: Saturday, April 27, 2024, 03:15:12

Long-term archiving on: Friday, July 21, 2017, 14:11:59

Dates and versions

hal-01511409, version 1 (20-04-2017)

Identifiers

HAL Id: hal-01511409, version 1
arXiv: 1704.06431

Cite

Alexander R Luedtke, Antoine Chambaz. Faster Rates for Policy Learning. 2017. ⟨hal-01511409⟩

Collections

CNRS INSMI MODALX UNIV-PARIS-LUMIERES ANR UNIV-PARIS-NANTERRE

267 Views

84 Downloads
