Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals

Emilie Kaufmann (1), Wouter Koolen (2)
(1) SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract: This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model. The deviations are measured using the Kullback-Leibler divergence in a given one-dimensional exponential family, and may take into account several arms at a time. They are obtained by constructing for each arm a mixture martingale based on a hierarchical prior, and by multiplying those martingales. Our deviation inequalities allow us to analyze stopping rules based on generalized likelihood ratios for a large class of sequential identification problems, and to construct tight confidence intervals for some functions of the means of the arms.
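To make the notion of a mixture martingale concrete, here is a minimal Python sketch for a single arm with unit-variance Gaussian observations. It is an illustration under stated assumptions, not the construction used in the paper: the abstract refers to a hierarchical prior and a general one-dimensional exponential family, whereas this sketch uses a simple finite prior over tilting parameters. The function names `mixture_martingale` and `first_crossing`, and the parameters `lambdas`, `weights`, `mu0` and `delta`, are hypothetical.

```python
import math


def mixture_martingale(xs, lambdas, weights, mu0=0.0):
    """Path M_0, ..., M_n of a finite mixture martingale.

    Assumption (not from the paper): observations are Gaussian with
    unit variance and mean mu0 under the null hypothesis, and the
    prior is a finite set of tilting parameters `lambdas` with
    probabilities `weights`.
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    centered_sum = 0.0
    path = [1.0]  # M_0 = 1 before any observation
    for t, x in enumerate(xs, start=1):
        centered_sum += x - mu0
        # Each term exp(lam * S_t - t * lam^2 / 2) is a martingale
        # under the null; a convex combination of them is one too.
        m_t = sum(
            w * math.exp(lam * centered_sum - t * lam * lam / 2.0)
            for lam, w in zip(lambdas, weights)
        )
        path.append(m_t)
    return path


def first_crossing(path, delta):
    """First index t with M_t >= 1/delta, or None if never crossed.

    By Ville's inequality, under the null this crossing happens with
    probability at most delta, uniformly over all times.
    """
    threshold = 1.0 / delta
    for t, m_t in enumerate(path):
        if m_t >= threshold:
            return t
    return None
```

A time-uniform test in this simplified setting rejects the null mean `mu0` the first time the path reaches `1 / delta`. For several arms, the abstract describes multiplying the per-arm martingales; the product of independent nonnegative martingales starting at 1 can be thresholded in the same way.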
Document type: Preprints, Working Papers, ...

https://hal.archives-ouvertes.fr/hal-01886612
Contributor: Emilie Kaufmann
Submitted on: Tuesday, November 27, 2018 - 9:18:00 AM
Last modification on: Friday, April 19, 2019 - 4:55:23 PM
Long-term archiving on: Thursday, February 28, 2019 - 12:40:57 PM

Files

KK18_JMLR.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01886612, version 2
  • arXiv: 1811.11419

Citation

Emilie Kaufmann, Wouter Koolen. Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals. 2018. ⟨hal-01886612v2⟩
