The arcade learning environment: An evaluation platform for general agents (extended abstract), Proceedings of the 3rd International Conference on Development and Learning, pp.1471-1479, 2004.
Structure and direction in thinking, Weight uncertainty in neural networks, 1965.
Maximization of potential information flow as a universal utility for collective behaviour, International Conference on Learning Representations, vol.3, pp.207-213, 2002.
Learning navigation behaviors end-to-end with AutoRL, Advances in Neural Information Processing Systems, vol.4, pp.2007-2014, 2012.
A unified strategy for implementing curiosity and empowerment driven reinforcement learning, Foundations and Trends® in Robotics, vol.2, issue.1-2, pp.1-142, 2013.
Lior Fox, Leshem Choshen, and Yonatan Loewenstein. DORA the explorer: Directed outreaching reinforcement action-selection, Carlos Florensa, Jonas Degrave, Nicolas Heess, Jost Tobias Springenberg, and Martin Riedmiller. Self-supervised learning of image embedding for continuous control, vol.15, pp.1514-1523, 2014.
Curiosity driven reinforcement learning for motion planning on humanoids, Foundations and Trends® in Machine Learning, vol.11, issue.3-4, p.25, 2014.
Bayesian reinforcement learning: A survey. Foundations and Trends® in Machine Learning, Filip De Turck, and Pieter Abbeel. VIME: Variational information maximizing exploration, vol.2, pp.1109-1117, 1999.
Information thermodynamics on causal networks and its application to biochemical signal transduction, Inequity aversion resolves intertemporal social dilemmas, 2016.
Bayesian surprise attracts human attention, Advances in Neural Information Processing Systems, pp.547-554, 2006.
Intrinsic social motivation via causal influence in multi-agent RL, 2018.
Unsupervised realtime control through variational empowerment, 2017.
ViZDoom: A Doom-based AI research platform for visual reinforcement learning, 2016 IEEE Conference on Computational Intelligence and Games (CIG), vol.49, pp.1-8, 2002.
Auto-encoding variational Bayes, 2013.
Overcoming catastrophic forgetting in neural networks, Proceedings of the National Academy of Sciences, vol.114, pp.3521-3526, 2017.
Empowerment: A universal agent-centric measure of control, The 2005 IEEE Congress on, vol.1, pp.128-135, 2005.
Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation, Advances in Neural Information Processing Systems, pp.3675-3683, 2016.
Option discovery in hierarchical reinforcement learning using spatio-temporal clustering, Deep successor reinforcement learning, pp.329-336, 2008.
International Foundation for Autonomous Agents and Multiagent Systems, Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, pp.464-473, 2017.
Hierarchical reinforcement learning with hindsight, International Conference on Learning Representations, 2019.
Learning and exploration in action-perception loops, Continuous control with deep reinforcement learning, vol.7, p.37, 2013.
Exploration in model-based reinforcement learning by empirically estimating learning progress, Proceedings of the 34th International Conference on Machine Learning, vol.45, pp.2295-2304, 2012.
Shakir Mohamed and Danilo Jimenez Rezende. Variational information maximisation for intrinsically motivated reinforcement learning, Changjae Oh and Andrea Cavallaro. Learning action representations for self-supervised visual exploration, vol.15, pp.278-287, 1989.
Action-conditional video prediction using deep networks in Atari games, Proceedings of the 8th International Conference on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems, vol.1, pp.492-502, 2008.
Unsupervised methods for subgoal discovery during intrinsic motivation in model-free hierarchical reinforcement learning, Proceedings of the 35th International Conference on Machine Learning, ICML 2018, vol.2017, pp.1312-1320, 1952.
Driven by compression progress: A simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes, 2011 IEEE International Conference on Development and Learning (ICDL), vol.2, pp.70-82, 1991.
Incentivizing exploration in reinforcement learning with deep predictive models, New York: Appleton, 1938.
Susanne Still and Doina Precup. An information-theoretic approach to curiosity-driven reinforcement learning, Theory in Biosciences, vol.131, issue.3, pp.139-148, 2012.
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning, Reward shaping with recurrent neural networks for speeding up on-line policy learning in spoken dialogue systems, vol.112, pp.3540-3549, 1998.
Unsupervised control through non-parametric discriminative rewards, 2015 IEEE Congress on Evolutionary Computation (CEC), vol.66, pp.715-770, 1959.
Vision-based robot navigation through combining unsupervised learning and hierarchical reinforcement learning, Scheduled intrinsic drive: A hierarchical take on intrinsically motivated exploration, vol.26, p.1576, 2015.