HAL: hal-00113668, version 1

Multi-armed Bandit, Dynamic Environments and Meta-Bandits
Cédric Hartland 1, 2, Sylvain Gelly 1, 2, Nicolas Baskiotis 1, 2, Olivier Teytaud 1, 2, Michèle Sebag 1, 2
(2006-11-14)

This paper presents the Adapt-EvE algorithm, extending the UCBT online learning algorithm (Auer et al. 2002) to abruptly changing environments. Adapt-EvE features an adaptive change-point detection test based on Page-Hinkley statistics, and two alternative extra-exploration procedures, based respectively on smooth-restart and Meta-Bandits.
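To make the change-point detection idea concrete, here is a minimal sketch of a textbook Page-Hinkley test for detecting a drop in mean reward. It is not taken from the paper: the class name `PageHinkley`, the helper `first_alarm`, and the values of `delta` (tolerance) and `threshold` (alarm level) are illustrative assumptions, and the adaptive tuning of the test described in the abstract is not reproduced.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for a decrease in the mean of a stream.

    Illustrative sketch only; parameter values are assumptions, not the
    settings used by Adapt-EvE.
    """

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (often written lambda)
        self.reset()

    def reset(self):
        self.n = 0          # number of observations seen
        self.mean = 0.0     # running mean of the observations
        self.cum = 0.0      # cumulative deviation m_T
        self.max_cum = 0.0  # running maximum M_T of m_t

    def update(self, x):
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean + self.delta
        self.max_cum = max(self.max_cum, self.cum)
        # PH_T = M_T - m_T grows when observations fall below the mean.
        return self.max_cum - self.cum > self.threshold


def first_alarm(stream, **kwargs):
    """Return the index of the first alarm in `stream`, or None."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Rewards drop abruptly from 1.0 to 0.0 at step 200.
    rewards = [1.0] * 200 + [0.0] * 50
    print("first alarm at step:", first_alarm(rewards))
```

In a bandit setting, an alarm of this kind would be the point at which some extra-exploration step is triggered; a larger `threshold` trades fewer false alarms for slower detection.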
1:  Laboratoire de Recherche en Informatique (LRI)
CNRS : UMR8623 – Université Paris XI - Paris Sud
2:  TAO (INRIA Futurs)
INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
Laboratoire de Recherche en Informatique
Computer Science/Learning

Computer Science/Artificial Intelligence
multi-armed bandit – statistical learning – ucb
Attached file list to this document: 
PDF
MetaEve.pdf(106.7 KB)
