Hierarchical Optimistic Region Selection driven by Curiosity

Odalric-Ambrym Maillard

Preprint, Working Paper. Year: 2012

Role: Author
PersonId: 5563
IdHAL: odalric-ambrym-maillard
ORCID: 0000-0001-7935-7026
IdRef: 158055594

Montanuniversität Leoben

Abstract

This paper aims to take a step toward giving the term "intrinsic motivation" from reinforcement learning a sound theoretical foundation, focusing on curiosity-driven learning. To that end, we consider the following setting: given a fixed partition $\mathcal{P}$ of a continuous space $\mathcal{X}$ and an unknown process $\nu$ defined on $\mathcal{X}$, we must sequentially decide which cell of the partition to select, and where to sample $\nu$ within that cell, in order to minimize a loss function inspired by previous work on curiosity-driven learning. The loss on each cell consists of one term measuring a simple worst-case quadratic sampling error and a penalty term proportional to the range of the variance in that cell. This problem formulation extends the setting known as active learning for multi-armed bandits to the case where each arm is a continuous region. We show how an adaptation of recent algorithms for that problem, and of hierarchical optimistic sampling algorithms for optimization, can be used to solve it. The resulting procedure, called Hierarchical Optimistic Region SElection driven by Curiosity (HORSE.C), is provided together with a finite-time regret analysis.
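To make the setting concrete, here is a toy sketch of sequential cell selection over a fixed partition. It is not the HORSE.C procedure, whose details are not given in this record. It assumes a one-dimensional space [0, 1) cut into equal cells, a hypothetical noise process `sample_nu`, uniform sampling inside the chosen cell, and a simplified per-cell loss of the form variance divided by sample count, in the spirit of active learning for multi-armed bandits. The function name, the bonus term and all constants are illustrative assumptions; the variance-range penalty and the hierarchical sub-partitioning mentioned in the abstract are omitted.

```python
import math
import random


def run_toy_allocation(num_cells=4, budget=2000, seed=0):
    """Toy optimistic sample allocation over a fixed partition of [0, 1).

    Illustrative only: a simplified stand-in, not the HORSE.C algorithm.
    Returns the number of samples drawn in each cell.
    """
    rng = random.Random(seed)

    # Hypothetical unknown process: zero-mean Gaussian noise whose standard
    # deviation grows with x, so the rightmost cells are the noisiest.
    def sample_nu(x):
        return rng.gauss(0.0, 0.1 + x)

    counts = [0] * num_cells
    sums = [0.0] * num_cells
    sq_sums = [0.0] * num_cells

    def observe(cell):
        # Assumption: sample uniformly at random inside the chosen cell.
        x = (cell + rng.random()) / num_cells
        y = sample_nu(x)
        counts[cell] += 1
        sums[cell] += y
        sq_sums[cell] += y * y

    def index(cell, t):
        # Optimistic estimate of the simplified loss variance / count:
        # empirical variance plus an exploration bonus, divided by the count.
        n = counts[cell]
        mean = sums[cell] / n
        var = max(sq_sums[cell] / n - mean * mean, 0.0)
        bonus = math.sqrt(2.0 * math.log(t + 1) / n)
        return (var + bonus) / n

    # Initialisation: two samples per cell so every variance estimate exists.
    for cell in range(num_cells):
        observe(cell)
        observe(cell)

    # Main loop: sample the cell whose optimistic loss estimate is largest.
    for t in range(2 * num_cells, budget):
        observe(max(range(num_cells), key=lambda c: index(c, t)))

    return counts


if __name__ == "__main__":
    print(run_toy_allocation())
```

Under these assumptions the rule spends more of the budget on cells where the observations are noisier, which is the qualitative behaviour that the bandit active-learning setting aims for.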

Domains

Statistics [math.ST]; Theory [stat.TH]; Machine Learning [cs.LG]

Main file

horsec_nips_supplementary.pdf (348 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-00740418

Submitted on: Thursday, October 25, 2012, 07:00:08

Last modified on: Monday, April 8, 2024, 12:01:22

Long-term archiving on: Friday, December 16, 2016, 22:21:50

Dates and versions

hal-00740418, version 1 (25-10-2012)

Identifiers

HAL Id: hal-00740418, version 1

Cite

Odalric-Ambrym Maillard. Hierarchical Optimistic Region Selection driven by Curiosity. 2012. ⟨hal-00740418⟩

77 views

89 downloads
