Conference paper, Year: 2005

Towards a Reinforcement Learning Module for Navigation in Video Games

Thierry Gourdin
  • Role: Author
  • PersonId: 1004295
Olivier Sigaud

Abstract

Large, real-world Probabilistic Temporal Planning (PTP) is a very challenging research field. A common approach is to model such problems as Markov Decision Problems (MDPs) and solve them with dynamic programming techniques. Yet two major difficulties arise: (1) dynamic programming does not scale with the number of tasks, and (2) the probabilistic model may be uncertain, leading to the choice of unsafe policies. We build here on the Factored Policy Gradient (FPG) algorithm and on robust decision-making to address both difficulties through an algorithm that trains two competing teams of learning agents. Because learning is simultaneous, each agent faces a non-stationary environment. The goal is for them to find a common Nash equilibrium.
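
To make the idea of simultaneous, competing learners more concrete, here is a minimal sketch under stated assumptions. It is not the FPG-based algorithm mentioned in the abstract: it reduces the two teams to two single agents, swaps in a plain exponentiated-gradient update on softmax logits, and uses a hypothetical zero-sum game (matching pennies). The names `PAYOFF`, `softmax`, and `train`, the starting logits, and the step size are all illustrative choices. Each agent updates against the other's current policy, so from either agent's point of view the environment is non-stationary; in games of this kind the time-averaged policies are expected to approach the mixed Nash equilibrium.

```python
import math

# Hypothetical zero-sum game (matching pennies): PAYOFF[i][j] is the reward of
# the row agent; the column agent receives the negative of it.
PAYOFF = [[1.0, -1.0], [-1.0, 1.0]]


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=20000, lr=0.05):
    """Run simultaneous updates and return the time-averaged policies."""
    # Assumed starting point: the row agent begins biased toward action 0.
    row_logits, col_logits = [1.0, 0.0], [0.0, 0.0]
    row_avg, col_avg = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        p, q = softmax(row_logits), softmax(col_logits)
        # Expected payoff of each pure action against the opponent's
        # current mixed policy (exact expectations, no sampling).
        row_values = [sum(PAYOFF[i][j] * q[j] for j in range(2)) for i in range(2)]
        col_values = [-sum(PAYOFF[i][j] * p[i] for i in range(2)) for j in range(2)]
        # Both agents update at the same time, so each one is chasing
        # a moving target.
        for a in range(2):
            row_logits[a] += lr * row_values[a]
            col_logits[a] += lr * col_values[a]
            row_avg[a] += p[a] / steps
            col_avg[a] += q[a] / steps
    return row_avg, col_avg


if __name__ == "__main__":
    row_avg, col_avg = train()
    print("average row policy:", row_avg)
    print("average column policy:", col_avg)
```

The instantaneous policies in this sketch keep cycling around the equilibrium; only their averages settle, which is one reason learning in such non-stationary settings is harder than in a fixed MDP.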
No file deposited

Dates and versions

hal-01491687, version 1 (17-03-2017)

Identifiers

  • HAL Id: hal-01491687, version 1

Cite

Thierry Gourdin, Olivier Sigaud. Towards a Reinforcement Learning Module for Navigation in Video Games. ECML 2005 Workshop on Reinforcement Learning in Non-Stationary Environments, Oct 2005, Porto, Portugal. pp.1-12. ⟨hal-01491687⟩
