Journal articles

CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

Abstract: In this paper we study a new reinforcement learning setting in which the environment is non-rewarding and contains several possibly related objects of varying controllability. In this environment, an apt agent, Bob, acts following its own goals without necessarily providing helpful demonstrations, and the objective of the learning agent is to learn to control objects individually. We present a generic discrete-state, discrete-action model of such environments, and an unsupervised reinforcement learning agent called CLIC (Curriculum Learning and Imitation for Control) to achieve this objective. CLIC selects which objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. Despite choosing what it imitates in a principled way, CLIC retains the natural ability to follow Bob when he provides ordered demonstrations. Finally, we show that, compared with a non-curriculum-based agent, CLIC achieves faster mastery of the environment when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, because it ignores non-reproducible and already mastered interactions with objects when imitating.
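
As an illustration of what selecting objects by learning progress can look like, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how learning progress is measured, so the sketch assumes it is the absolute change in success rate between two consecutive windows of attempts, and that objects are sampled in proportion to that value with a small amount of uniform exploration. The names `ProgressTracker`, `window`, and `epsilon` are hypothetical.

```python
import random
from collections import deque


class ProgressTracker:
    """Hypothetical helper: tracks recent control outcomes per object."""

    def __init__(self, objects, window=20):
        self.window = window
        # Keep only the last 2 * window outcomes for each object.
        self.history = {o: deque(maxlen=2 * window) for o in objects}

    def record(self, obj, success):
        """Store whether the latest attempt to control `obj` succeeded."""
        self.history[obj].append(1.0 if success else 0.0)

    def learning_progress(self, obj):
        """Assumed measure: |recent success rate - older success rate|."""
        h = list(self.history[obj])
        if len(h) < 2 * self.window:
            return 0.0  # not enough data to compare two windows
        older, recent = h[:self.window], h[self.window:]
        return abs(sum(recent) - sum(older)) / self.window

    def select(self, epsilon=0.2, rng=random):
        """Sample an object proportionally to learning progress.

        With probability `epsilon`, or when no object shows progress,
        fall back to a uniform choice so every object keeps being tried.
        """
        objects = list(self.history)
        progress = [self.learning_progress(o) for o in objects]
        if sum(progress) == 0.0 or rng.random() < epsilon:
            return rng.choice(objects)
        return rng.choices(objects, weights=progress, k=1)[0]
```

Under these assumptions, an object whose success rate is flat (never controllable, or already mastered) has zero learning progress and is chosen only through the exploration term, which mirrors the abstract's point about ignoring non-reproducible and already mastered interactions.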
Contributor: Mohamed Chetouani
Submitted on: Tuesday, November 19, 2019 - 3:59:36 PM
Last modification on: Friday, April 1, 2022 - 5:16:09 PM

Pierre Fournier, Cédric Colas, Mohamed Chetouani, Olivier Sigaud. CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments. IEEE Transactions on Cognitive and Developmental Systems, Institute of Electrical and Electronics Engineers, Inc, 2021, 13 (2), pp.239-248. ⟨10.1109/TCDS.2019.2933371⟩. ⟨hal-02370859⟩