Conference papers

Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Abstract : This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) that evolve over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch, but its evolution is not. Our contribution has four parts. 1) We define a specific class of MDPs that we call Non-Stationary MDPs (NSMDPs) and introduce the notion of regular evolution by assuming that the transition and reward functions are Lipschitz continuous with respect to time. 2) We consider a planning agent that uses the current model of the environment but is unaware of its future evolution, which leads us to a worst-case method in which the environment is seen as an adversarial agent. 3) Following this approach, we propose the Risk-Averse Tree-Search (RATS) algorithm, a zero-shot Model-Based method similar to Minimax search. 4) We illustrate the benefits of RATS empirically and compare its performance with reference Model-Based algorithms.
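
Illustrative sketch (not from the paper): the worst-case idea in the abstract can be pictured as a depth-limited search in which the agent maximizes over actions while an adversary degrades the known model within a budget that grows with elapsed time. The Python below is a minimal sketch under strong simplifying assumptions and is not the authors' RATS implementation. Every name in it (`Snapshot`, `worst_expectation`, `worst_case_q`, `plan`, the toy states and numbers) is hypothetical. The adversary is assumed to lower each reward by at most `l_r` per elapsed step and to move at most `l_p` probability mass per elapsed step, and only among the successors listed for a state-action pair; the paper's actual distance on transition functions may differ.

```python
from dataclasses import dataclass, replace
from typing import Dict, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


@dataclass(frozen=True)
class Snapshot:
    """Model known at the current decision epoch (hypothetical container)."""
    actions: Sequence[Action]
    probs: Dict[Tuple[State, Action], Dict[State, float]]  # nominal P(s' | s, a)
    rewards: Dict[Tuple[State, Action], float]             # nominal r(s, a)
    l_p: float          # assumed bound on transition drift per time step
    l_r: float          # assumed bound on reward drift per time step
    gamma: float = 1.0  # discount factor


def worst_expectation(probs, values, budget):
    """Lowest expected value if an adversary moves up to `budget` probability
    mass from the best listed successors to the worst listed successor."""
    worst = min(values[s] for s in probs)
    expectation = sum(p * values[s] for s, p in probs.items())
    remaining = min(budget, 1.0)
    for s in sorted(probs, key=lambda s: values[s], reverse=True):
        moved = min(probs[s], remaining)
        expectation -= moved * (values[s] - worst)
        remaining -= moved
    return expectation


def worst_case_q(m, state, action, depth, elapsed):
    """Pessimistic value of `action` with `depth` decisions left,
    `elapsed` steps after the snapshot was taken."""
    reward = m.rewards[(state, action)] - m.l_r * elapsed
    if depth == 1:
        return reward
    successors = m.probs[(state, action)]
    next_values = {
        s: max(worst_case_q(m, s, a, depth - 1, elapsed + 1) for a in m.actions)
        for s in successors
    }
    return reward + m.gamma * worst_expectation(successors, next_values, m.l_p * elapsed)


def plan(m, state, depth):
    """Action with the best worst-case value; the root uses the exact snapshot."""
    return max(m.actions, key=lambda a: worst_case_q(m, state, a, depth, 0))


# Toy problem (made up for illustration): driving pays more than staying home,
# but a drifting model may send the car into the ditch.
cautious = Snapshot(
    actions=("stay", "drive"),
    probs={
        ("home", "stay"): {"home": 1.0},
        ("home", "drive"): {"road": 1.0, "ditch": 0.0},
        ("road", "stay"): {"road": 1.0},
        ("road", "drive"): {"road": 1.0, "ditch": 0.0},
        ("ditch", "stay"): {"ditch": 1.0},
        ("ditch", "drive"): {"ditch": 1.0},
    },
    rewards={
        ("home", "stay"): 1.5, ("home", "drive"): 0.0,
        ("road", "stay"): 0.0, ("road", "drive"): 3.0,
        ("ditch", "stay"): -10.0, ("ditch", "drive"): -10.0,
    },
    l_p=0.5,
    l_r=0.0,
)
nominal = replace(cautious, l_p=0.0)  # same snapshot, assumed stationary

print(plan(nominal, "home", 3))   # drive
print(plan(cautious, "home", 3))  # stay
```

In this toy, a planner that treats the snapshot as stationary chooses "drive", while the pessimistic planner chooses "stay" because the drift budget lets the adversary push mass toward the ditch on later steps. The search is exponential in depth and is meant only to make the max/min structure concrete.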

https://hal.archives-ouvertes.fr/hal-02882205
Contributor : Open Archive Toulouse Archive Ouverte (oatao)
Submitted on : Friday, June 26, 2020 - 3:08:12 PM
Last modification on : Tuesday, March 16, 2021 - 3:20:07 PM

File

Lecarpentier_24342.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02882205, version 1
  • OATAO : 24342

Citation

Erwan Lecarpentier, Emmanuel Rachelson. Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning. NeurIPS, Dec 2019, Vancouver, Canada. pp.1-18. ⟨hal-02882205⟩

Metrics

Record views : 47
File downloads : 147