
Reinforcement Learning Approaches in Dynamic Environments

Miyoung Han 1, 2 
2 VALDA - Value from Data
DI-ENS - Département d'informatique - ENS Paris, Inria de Paris
Abstract: Reinforcement learning is learning from interaction with an environment to achieve a goal. It is an efficient framework for solving sequential decision-making problems, using Markov decision processes (MDPs) as a general problem formulation. In this thesis, we apply reinforcement learning to sequential decision-making problems in dynamic environments. We first present an algorithm based on Q-learning with a customized exploration and exploitation strategy to solve a real taxi routing problem. Our algorithm progressively learns optimal actions for routing an autonomous taxi to passenger pick-up points. We then address the factored MDP problem in a non-deterministic setting. We propose an algorithm that learns transition functions using the Dynamic Bayesian Network formalism. We demonstrate that factorization methods allow the agent to learn correct models efficiently; using the learned models, the agent can accrue higher cumulative rewards. Finally, we extend our work to very large domains. In the focused crawling problem, we propose a new scoring mechanism that takes into account the long-term effects of selecting a link, and present new feature representations of states for Web pages and of actions for next-link selection. This approach improved the efficiency of focused crawling. In the influence maximization (IM) problem, we extend the classical IM problem to settings with incomplete knowledge of the graph structure and with topic-based user interest. Our algorithm finds the most influential seeds to maximize topic-based influence by learning action values for each probed node.
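As an illustration of the Q-learning building block mentioned in the abstract, the following is a minimal sketch of textbook tabular Q-learning with a plain epsilon-greedy policy. It is not the thesis's taxi routing algorithm or its customized exploration and exploitation strategy: the function names, the hyperparameter values, and the five-cell corridor environment are all assumptions made up for this example.

```python
import random


def q_learning(n_states, n_actions, step, episodes=500, alpha=0.1,
               gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Textbook tabular Q-learning (illustrative, not the thesis's method).

    `step(state, action)` is a hypothetical environment callback that
    returns (next_state, reward, done). Every episode starts in state 0.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy: explore with probability epsilon,
            # otherwise pick a greedy action, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # One-step target: no bootstrapping from a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def corridor_step(state, action):
    """Assumed toy environment: a corridor of cells 0..4.

    Action 1 moves right, action 0 moves left (clamped at cell 0).
    Reaching cell 4 ends the episode with reward 1.
    """
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


if __name__ == "__main__":
    q_table = q_learning(n_states=5, n_actions=2, step=corridor_step)
    for s, row in enumerate(q_table):
        print(s, [round(v, 3) for v in row])
```

In this toy setting the learned action values come to favor moving right in every non-terminal cell, which is the sense in which an agent "progressively learns optimal actions"; a real routing problem would need a much richer state and reward design.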

Cited literature: 110 references
Contributor: Pierre Senellart
Submitted on: Wednesday, October 10, 2018 - 7:47:35 AM
Last modification on: Wednesday, June 15, 2022 - 8:45:07 PM
Long-term archiving on: Friday, January 11, 2019 - 12:33:14 PM




HAL Id: tel-01891805, version 1


Miyoung Han. Reinforcement Learning Approaches in Dynamic Environments. Databases [cs.DB]. Télécom ParisTech, 2018. English. ⟨tel-01891805⟩


