A penalized bandit algorithm

Damien Lamberton; Gilles Pagès

doi:10.1214/EJP.v13-489

Journal article, Electronic Journal of Probability, Year : 2008


Damien Lamberton

Function : Author
PersonId : 10233
IdHAL : damien-lamberton
ORCID : 0009-0003-9817-2424
IdRef : 032457308

Laboratoire d'Analyse et de Mathématiques Appliquées

Gilles Pagès

Function : Author
PersonId : 8458
IdHAL : gilles-pages
ORCID : 0000-0001-6487-3079
IdRef : 030737605

Laboratoire de Probabilités et Modèles Aléatoires

Abstract

We study a two-armed bandit algorithm with penalty. We show the convergence of the algorithm and establish the rate of convergence. For some choices of the parameters, we obtain a central limit theorem in which the limit distribution is characterized as the unique stationary distribution of a discontinuous Markov process.
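The abstract does not state the update rule, so the following is only a minimal sketch of what a penalized two-armed bandit recursion might look like, written in the style of linear reward-penalty schemes from the learning-automata literature. Everything named here is an assumption made for illustration: the function `penalized_bandit`, the success probabilities `p_a` and `p_b`, the step sequence `gamma` and the penalty sequence `rho` (with their exponents) are hypothetical and are not taken from this record. The state `x` is the probability of playing arm A; a success reinforces the played arm with step `gamma`, and a failure moves weight away from it with the smaller step `gamma * rho`.

```python
import random


def penalized_bandit(p_a, p_b, n_steps, x0=0.5, seed=0):
    """Simulate a hypothetical reward-penalty two-armed bandit recursion.

    x is the probability of playing arm A. All step-size choices below are
    illustrative assumptions, not the parameters studied in the paper.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        gamma = (n + 1) ** -0.5   # assumed reward step, always in (0, 1)
        rho = (n + 1) ** -0.25    # assumed penalty factor, always in (0, 1)

        play_a = rng.random() <= x                  # choose arm A with probability x
        success = rng.random() < (p_a if play_a else p_b)

        if play_a:
            if success:
                x += gamma * (1.0 - x)              # reward: move x toward 1
            else:
                x -= gamma * rho * x                # penalty: move x slightly toward 0
        else:
            if success:
                x -= gamma * x                      # reward for B: move x toward 0
            else:
                x += gamma * rho * (1.0 - x)        # penalty for B: move x slightly toward 1
    return x


if __name__ == "__main__":
    # Arm A succeeds more often than arm B in this made-up example.
    print(penalized_bandit(p_a=0.7, p_b=0.4, n_steps=10_000, seed=1))
```

Each update is a convex combination of `x` with 0 or 1, so `x` stays in [0, 1] without clipping. Setting `rho` to zero would recover an unpenalized reward-only scheme under the same assumptions.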

Keywords

Two-armed bandit algorithm, Stochastic Approximation, learning automata, asset allocation

Domains

Probability [math.PR]

Main file

PenalBandit.pdf (293.93 KB)


https://hal.science/hal-00012187

Submitted on : Tuesday, October 18, 2005, 5:57:56 PM

Last modification on : Thursday, March 14, 2024, 3:08:17 AM

Long-term archiving on : Thursday, April 1, 2010, 10:47:25 PM

Dates and versions

hal-00012187 , version 1 (18-10-2005)

Identifiers

HAL Id : hal-00012187 , version 1
ARXIV : math.PR/0510384
DOI : 10.1214/EJP.v13-489

Cite

Damien Lamberton, Gilles Pagès. A penalized bandit algorithm. Electronic Journal of Probability, 2008, 13, 341-373 ; http://dx.doi.org/10.1214/EJP.v13-489. ⟨10.1214/EJP.v13-489⟩. ⟨hal-00012187⟩


Collections

UNIV-PARIS7 UPMC PMA CNRS UNIV-MLV LAMA_UMR8050 LAMA_PS UPEC LPSM SORBONNE-UNIVERSITE SU-SCIENCES UNIV-EIFFEL
