On the tactical and strategic behaviour of MCTS when biasing random simulations

Fabien Teytaud; Julien Dehos

Journal article, International Computer Games Association Journal, Year: 2015


Fabien Teytaud

Role: Author
PersonId : 10819
IdHAL : fabien-teytaud
IdRef : 15660213X

Laboratoire d'Informatique Signal et Image de la Côte d'Opale

Julien Dehos

Role: Author
PersonId : 19990
IdHAL : julien-dehos
ORCID : 0000-0002-4049-2551
IdRef : 150861745

Laboratoire d'Informatique Signal et Image de la Côte d'Opale

Abstract

Over the last few years, many new algorithms have been proposed to solve combinatorial problems. In this field, Monte-Carlo Tree Search (MCTS) is a generic method that performs well in several applications; for instance, it has been used with notable results in the game of Go. To find the most promising decision, MCTS builds a search tree in which new nodes are selected by sampling the search space randomly (i.e., by Monte-Carlo simulations). However, efficient Monte-Carlo policies are generally difficult to learn. Even if an improved Monte-Carlo policy performs adequately in some games, it can become useless or even harmful in others, depending on how the algorithm takes into account the tactical and strategic elements of the game. In this article, we address this problem by studying when and why a learned Monte-Carlo policy works. To this end, we use (1) two known Monte-Carlo policy improvements (PoolRave and Last-Good-Reply) and (2) two connection games (Hex and Havannah). We aim to understand how the benefit is related (a) to the number of random simulations and (b) to the game rules, in particular the tactical and strategic elements they give rise to. Our results indicate that improved Monte-Carlo policies, such as PoolRave or Last-Good-Reply, work better for games with a strong tactical element when the number of random simulations is small, whereas more general policies seem better suited to games with a strong strategic element when the number of random simulations is higher.
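To make "biasing random simulations" concrete, the following is a minimal sketch of a playout policy biased by a simplified Last-Good-Reply rule. It is not the authors' implementation. It assumes a 3x3 Hex board, stores only one reply per previous move, never forgets a stored reply, and omits the search tree entirely. All names (`SIZE`, `winner`, `playout`, `update_replies`, `run`) are hypothetical and chosen for illustration.

```python
import random

SIZE = 3  # tiny Hex board, an assumption made only to keep the sketch short


def neighbors(cell):
    """Yield the cells adjacent to `cell` on a rhombus-shaped Hex board."""
    r, c = divmod(cell, SIZE)
    for dr, dc in ((-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIZE and 0 <= nc < SIZE:
            yield nr * SIZE + nc


def winner(board):
    """On a full board, return 0 if player 0 links top to bottom, else 1.

    A full Hex board always has exactly one winner, so if player 0 has
    no top-to-bottom chain, player 1 has a left-to-right chain.
    """
    stack = [c for c in range(SIZE) if board[c] == 0]
    seen = set(stack)
    while stack:
        cell = stack.pop()
        if cell // SIZE == SIZE - 1:
            return 0
        for n in neighbors(cell):
            if board[n] == 0 and n not in seen:
                seen.add(n)
                stack.append(n)
    return 1


def playout(board, player, last_move, replies, rng):
    """Fill the board; return (winner, [(player, previous_move, move), ...])."""
    board = list(board)
    empty = [c for c in range(SIZE * SIZE) if board[c] is None]
    history = []
    while empty:
        # Bias: reuse the stored reply to the opponent's last move if legal.
        move = replies[player].get(last_move)
        if move is None or board[move] is not None:
            move = rng.choice(empty)  # fall back to a uniformly random move
        board[move] = player
        empty.remove(move)
        history.append((player, last_move, move))
        last_move, player = move, 1 - player
    return winner(board), history


def update_replies(replies, win, history):
    """Remember, for the winner only, which move answered which previous move."""
    for player, prev, move in history:
        if player == win and prev is not None:
            replies[player][prev] = move


def run(n_playouts, seed=0):
    """Run playouts from the empty board, learning replies along the way."""
    rng = random.Random(seed)
    replies = [{}, {}]  # one reply table per player
    wins = [0, 0]
    for _ in range(n_playouts):
        w, history = playout([None] * SIZE * SIZE, 0, None, replies, rng)
        update_replies(replies, w, history)
        wins[w] += 1
    return wins, replies


if __name__ == "__main__":
    wins, replies = run(200)
    print("wins per player:", wins)
    print("stored replies for player 0:", replies[0])
```

In a full MCTS, a playout like this would start from the position reached at a leaf of the search tree rather than from the empty board, and its result would be backed up along the selected path.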

Keywords

Monte Carlo tree search; reinforcement learning; connection games

Domains

Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]

Main file

Teytaud_2015_ICGA_manuscript.pdf (610.63 KB)

Origin: Files produced by the author(s)

Contributor: Julien Dehos

https://hal.science/hal-01267056

Submitted on: Wednesday, 3 February 2016, 18:24:52

Last modified on: Thursday, 25 November 2021, 08:22:29

Long-term archiving on: Saturday, 12 November 2016, 07:07:24

Dates and versions

hal-01267056 , version 1 (03-02-2016)

Identifiers

HAL Id : hal-01267056 , version 1

Cite

Fabien Teytaud, Julien Dehos. On the tactical and strategic behaviour of MCTS when biasing random simulations. International Computer Games Association Journal, 2015, 38 (2), pp.67-80. ⟨hal-01267056⟩


Collections

UNIV-LITTORAL LISIC

