Path Integral Policy Improvement with Covariance Matrix Adaptation

Freek Stulp; Olivier Sigaud

Conference paper, Year: 2012

Freek Stulp

Role: Author
PersonId : 1420
IdHAL : freek-stulp
IdRef : 177920629

Flowing Epigenetic Robots and Systems

Robotique et Vision

Olivier Sigaud

Role: Author
PersonId : 14932
IdHAL : olivier-sigaud
ORCID : 0000-0002-8544-0229
IdRef : 072724714

Institut des Systèmes Intelligents et de Robotique

Abstract

There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI2 is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI2 as a member of the wider family of methods that share the concept of probability-weighted averaging to iteratively update parameters in order to optimize a cost function. At the conceptual level, we compare PI2 to other members of the same family, namely Cross-Entropy Methods and CMAES. The comparison suggests the derivation of a novel algorithm, which we call PI2-CMA, for "Path Integral Policy Improvement with Covariance Matrix Adaptation". PI2-CMA's main advantage is that it determines the magnitude of the exploration noise automatically.
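
The idea of probability-weighted averaging mentioned in the abstract can be illustrated with a small sketch. This is not the PI2-CMA algorithm from the paper: it is a simplified, assumed variant that perturbs a parameter vector directly (no trajectories or time steps), uses a diagonal covariance, and maps costs to weights with an exponential of the normalized cost. The function names, the parameter h, and the toy quadratic cost are all hypothetical and chosen only for illustration.

```python
import math
import random


def weights_from_costs(costs, h=10.0):
    """Map costs to probabilities so that low cost gets high weight.

    Assumed rule: exponentiate the negated, range-normalized cost,
    then normalize the results to sum to one.
    """
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span == 0.0:
        # All samples are equally good: fall back to uniform weights.
        return [1.0 / len(costs)] * len(costs)
    raw = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_update(mean, var, cost_fn, n_samples=20, h=10.0, rng=random):
    """One iteration: sample, evaluate, reweight mean and per-dimension variance."""
    dim = len(mean)
    samples = [
        [rng.gauss(mean[d], math.sqrt(var[d])) for d in range(dim)]
        for _ in range(n_samples)
    ]
    costs = [cost_fn(s) for s in samples]
    w = weights_from_costs(costs, h)
    # New mean: probability-weighted average of the sampled parameters.
    new_mean = [sum(w[k] * samples[k][d] for k in range(n_samples))
                for d in range(dim)]
    # New variance: weighted spread around the OLD mean, so the exploration
    # magnitude is re-estimated from the data instead of being hand-tuned.
    # The small floor only guards against a degenerate zero variance.
    new_var = [
        max(sum(w[k] * (samples[k][d] - mean[d]) ** 2
                for k in range(n_samples)), 1e-12)
        for d in range(dim)
    ]
    return new_mean, new_var


if __name__ == "__main__":
    target = [1.0, -2.0]  # hypothetical optimum of the toy cost
    cost = lambda p: sum((pi - ti) ** 2 for pi, ti in zip(p, target))
    rng = random.Random(0)
    mean, var = [0.0, 0.0], [1.0, 1.0]
    for _ in range(30):
        mean, var = weighted_update(mean, var, cost, rng=rng)
    print("mean:", mean, "cost:", cost(mean))
```

In this sketch the variance update is what stands in for the automatic determination of exploration noise described in the abstract; a fuller treatment would adapt a full covariance matrix rather than independent per-dimension variances.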

Domains

Robotics [cs.RO]

https://hal.science/hal-00789391

Submitted on: Monday, 18 February 2013, 10:57:31

Last modified on: Friday, 24 March 2023, 14:52:56

Dates and versions

hal-00789391 , version 1 (18-02-2013)

Identifiers

HAL Id : hal-00789391 , version 1

Cite

Freek Stulp, Olivier Sigaud. Path Integral Policy Improvement with Covariance Matrix Adaptation. Proceedings of the 29th International Conference on Machine Learning (ICML), Jun 2012, Edinburgh, United Kingdom. pp.0-0. ⟨hal-00789391⟩

Collections

UPMC ENSTA CNRS INRIA ISIR ENSTA_U2IS INRIA2 SORBONNE-UNIVERSITE SU-SCIENCES ISIR_AMAC

147 views

0 downloads
