
Preference-based Evolutionary Direct Policy Search

Robert Busa-Fekete, Balazs Szorenyi, Paul Weng (1), Weiwei Cheng, Eyke Hüllermeier
(1) LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : We present a novel approach to preference-based reinforcement learning, namely a preference-based variant of a direct policy search method based on evolutionary optimization. The core of our approach is a preference-based racing algorithm that selects the best among a given set of candidate policies with high probability. To this end, the algorithm operates on a suitable ordinal preference structure and only uses pairwise comparisons between sample rollouts of the policies. We present first experimental studies showing that our approach performs well in practice.
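To illustrate the racing idea summarized in the abstract, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a Hoeffding-style confidence radius and a simple elimination rule, and the names `preference_race`, `noisy_policy`, and `prefer_higher` are hypothetical. Each candidate policy is modeled as a function returning one rollout outcome, and the race only ever looks at pairwise comparisons between those outcomes.

```python
import math
import random
from itertools import combinations


def preference_race(policies, prefer, max_rounds=500, delta=0.05):
    """Return the index of the candidate that survives a pairwise race.

    policies: list of zero-argument callables, each returning one rollout.
    prefer(a, b): 1.0 if rollout a is preferred, 0.0 if b is, 0.5 for a tie.
    The Hoeffding radius and elimination rule are illustrative assumptions.
    """
    k = len(policies)
    active = set(range(k))
    wins = [[0.0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    # Union bound over ordered pairs and rounds (assumed, conservative).
    log_term = math.log(2 * k * k * max_rounds / delta)

    for _ in range(max_rounds):
        if len(active) == 1:
            break
        # One fresh rollout per surviving candidate.
        outcomes = {i: policies[i]() for i in active}
        for i, j in combinations(sorted(active), 2):
            s = prefer(outcomes[i], outcomes[j])
            wins[i][j] += s
            wins[j][i] += 1.0 - s
            counts[i][j] += 1
            counts[j][i] += 1
        # Candidate i is beaten if some j is preferred to it with confidence.
        beaten = set()
        for i in active:
            for j in active:
                if i == j:
                    continue
                n = counts[i][j]
                radius = math.sqrt(log_term / (2 * n))
                if wins[j][i] / n - radius > 0.5:
                    beaten.add(i)
                    break
        # Guard against a preference cycle eliminating every candidate.
        if beaten < active:
            active -= beaten

    def mean_win_rate(i):
        rivals = [j for j in range(k) if j != i and counts[i][j] > 0]
        if not rivals:
            return 0.0
        return sum(wins[i][j] / counts[i][j] for j in rivals) / len(rivals)

    # If the budget runs out, fall back to the best empirical win rate.
    return max(sorted(active), key=mean_win_rate)


def noisy_policy(mean, rng):
    """Hypothetical stand-in for a policy: a rollout is a noisy scalar."""
    return lambda: rng.gauss(mean, 1.0)


def prefer_higher(a, b):
    """Hypothetical ordinal comparison: larger outcomes are preferred."""
    return 1.0 if a > b else 0.0 if a < b else 0.5


rng = random.Random(0)
candidates = [noisy_policy(m, rng) for m in (0.0, 1.0, 3.0)]
best = preference_race(candidates, prefer_higher)
print("selected candidate:", best)
```

The scalar outcomes exist only so that the toy comparison has something to compare. The race itself never uses their numeric values, only the result of `prefer`, which is the sense in which the selection is preference-based.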
Document type : Conference paper
Contributor : Lip6 Publications
Submitted on : Thursday, October 15, 2015 - 3:24:36 PM
Last modification on : Thursday, March 21, 2019 - 2:30:34 PM


  • HAL Id : hal-01216088, version 1


Robert Busa-Fekete, Balazs Szorenyi, Paul Weng, Weiwei Cheng, Eyke Hüllermeier. Preference-based Evolutionary Direct Policy Search. ICRA Autonomous Learning Workshop, May 2013, Karlsruhe, Germany. ⟨hal-01216088⟩


