| HAL: hal-00113668, version 1 |
|
| Multi-armed Bandit, Dynamic Environments and Meta-Bandits |
|
|
| Cédric Hartland 1, 2; Sylvain Gelly 1, 2 |
|
|
| (2006-11-14) |
|
|
| This paper presents the Adapt-EvE algorithm, which extends the UCBT online learning algorithm (Auer et al. 2002) to abruptly changing environments. Adapt-EvE features an adaptive change-point detection test based on Page-Hinkley statistics, and two alternative extra-exploration procedures, based respectively on smooth-restart and on Meta-Bandits. |
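The abstract names a Page-Hinkley change-point test but gives no formula or settings. As a rough illustration only, the sketch below shows one common formulation of that test for detecting a drop in the mean of a reward stream. The class name `PageHinkley`, the helper `first_alarm`, and the default values of `delta` and `threshold` are assumptions made for this example; they are not taken from the paper, and this is not the Adapt-EvE algorithm itself.

```python
class PageHinkley:
    """Sketch of a Page-Hinkley test for a DROP in the mean of a stream.

    delta (tolerance) and threshold (alarm level) are hypothetical defaults
    chosen for illustration, not values from the paper.
    """

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta
        self.threshold = threshold
        self.n = 0            # number of observations seen so far
        self.mean = 0.0       # running mean of the observations
        self.cum = 0.0        # cumulative sum of (x - mean + delta)
        self.cum_max = 0.0    # largest cumulative sum seen so far

    def update(self, reward):
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (reward - self.mean) / self.n
        self.cum += reward - self.mean + self.delta
        self.cum_max = max(self.cum_max, self.cum)
        # A large gap between the running maximum and the current sum
        # indicates that recent observations fall below the running mean.
        return (self.cum_max - self.cum) > self.threshold


def first_alarm(rewards, delta=0.005, threshold=5.0):
    """Return the 1-based index of the first alarm, or None if none fires."""
    detector = PageHinkley(delta=delta, threshold=threshold)
    for i, r in enumerate(rewards, start=1):
        if detector.update(r):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the reward drops abruptly from 1.0 to 0.0 after step 50.
    stream = [1.0] * 50 + [0.0] * 20
    print("first alarm at step:", first_alarm(stream))
```

In a bandit setting, a detector of this kind would typically watch the rewards of the currently preferred arm and trigger additional exploration when it fires; how Adapt-EvE combines the test with smooth-restart or Meta-Bandits is described in the paper, not in this sketch.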
|
| 1: | Laboratoire de Recherche en Informatique (LRI) |
| CNRS : UMR8623 – Université Paris XI - Paris Sud | |
| 2: | TAO (INRIA Futurs) |
| INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud | |
|
|
|
|
|
| Subject | : | Computer Science/Learning; Computer Science/Artificial Intelligence |
|
|
| multi-armed bandit – statistical learning – ucb |
|
|
|
|
|
| hal-00113668, version 1 | |
| http://hal.archives-ouvertes.fr/hal-00113668 | |
| oai:hal.archives-ouvertes.fr:hal-00113668 | |
| From: Cédric Hartland | |
| Submitted on: Tuesday, 14 November 2006 10:48:45 | |
| Updated on: Friday, 1 December 2006 14:09:39 | |