Now you see me: finding the right observation space to learn diverse behaviours by reinforcement in games

Training virtual agents to play games with reinforcement learning (RL) has gained considerable traction in recent years. Indeed, RL has produced agents with superhuman performance in multiple games. Yet, from a human-machine interaction standpoint, raw performance is not the only dimension of a "good" game AI. Exhibiting diverse behaviours is key to generating novelty, one of the core components of player engagement. In the RL framework, teaching agents to discover multiple strategies for achieving the same task is often framed as skill discovery. However, we observe that the current RL literature defines diversity as the exploration of different states, i.e. the incentive for the agent to "see" new observations. In this work, we argue that this definition does not make sense from a gameplay point of view. Instead, diversity should be defined as a distance over observations made by an observer external to the agent. We illustrate how DIAYN and SMERL, state-of-the-art RL algorithms for skill discovery, fail to discover meaningful behaviours in a simple tag game. We propose an easy fix by introducing the notion of diversity spaces, defined as the observations gathered by a third party external to the agent.
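The idea of a diversity space can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the usual DIAYN-style pseudo-reward, log q(z | o) − log p(z) with a uniform prior over K skills, and replaces the learned discriminator with a hypothetical nearest-centroid classifier. The names `skill_posterior`, `diversity_reward` and `centroids` are illustrative only. The one point it is meant to show is that the choice of `obs` (the agent's own observation, or what an external observer sees) determines which behaviours count as diverse.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def skill_posterior(obs, centroids):
    """Toy stand-in for a discriminator q(z | obs).

    Assumption: each skill z has a hypothetical centroid in the chosen
    observation space, and q is a softmax over negative squared distances.
    """
    logits = [
        -sum((o - c) ** 2 for o, c in zip(obs, centroid))
        for centroid in centroids
    ]
    return softmax(logits)


def diversity_reward(obs, skill, centroids):
    """DIAYN-style pseudo-reward: log q(z | obs) - log p(z), uniform p(z)."""
    q = skill_posterior(obs, centroids)
    return math.log(q[skill]) - math.log(1.0 / len(centroids))


# Hypothetical diversity space: 2D positions as seen by an external observer.
centroids = [(1.0, 0.0), (-1.0, 0.0)]

# An observation that does not distinguish the two skills earns no reward.
ambiguous = diversity_reward((0.0, 0.0), 0, centroids)

# An observation that clearly identifies skill 0 earns a positive reward.
distinct = diversity_reward((1.0, 0.0), 0, centroids)
```

Passing observer-side observations instead of the agent's own observations changes nothing in the reward computation itself; only the space in which skills must be distinguishable changes.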

Keywords

reinforcement learning, video games, diversity

Domains

Neural networks [cs.NE], Artificial intelligence [cs.AI], Machine learning [cs.LG]

Main file

CAp2022_paper_0257.pdf (1.41 MB)

Origin: Files produced by the author(s)

Contributor: Nicolas Audebert

https://hal.science/hal-03678280

Submitted on: Wednesday, May 25, 2022, 11:45:13

Last modified on: Wednesday, April 5, 2023, 04:01:03

Dates and versions

hal-03678280, version 1 (25-05-2022)

Identifiers

HAL Id: hal-03678280, version 1

Cite

Raphaël Boige, Nicolas Audebert, Clément Rambour, Guillaume Levieux. Now you see me: finding the right observation space to learn diverse behaviours by reinforcement in games. Conférence sur l'Apprentissage automatique (CAp), Jul 2022, Vannes, France. ⟨hal-03678280⟩

Collections

CNAM CEDRIC-CNAM HESAM
