On the tactical and strategic behaviour of MCTS when biasing random simulations - Archive ouverte HAL Accéder directement au contenu
Article Dans Une Revue International Computer Games Association Journal Année : 2015

On the tactical and strategic behaviour of MCTS when biasing random simulations

Résumé

Over the last few years, many new algorithms have been proposed to solve combinatorial problems. In this field, Monte-Carlo Tree Search (MCTS) is a generic method which performs really well on several applications; for instance, it has been used with notable results in the game of Go. To find the most promising decision, MCTS builds a search tree where the new nodes are selected by sampling the search space randomly (i.e., by Monte-Carlo simulations). However, efficient Monte-Carlo policies are generally difficult to learn. Even if an improved Monte-Carlo policy performs adequately in some games, it can become useless or harmful in other games depending on how the algorithm takes into account the tactical and the strategic elements of the game. In this article, we address this problem by studying when and why a learned Monte-Carlo policy works. To this end, we use (1) two known Monte-Carlo policy improvements (PoolRave and Last-Good-Reply) and (2) two connection games (Hex and Havannah). We aim to understand how the benefit is related (a) to the number of random simulations and (b) to the various game rules (within them, tactical and strategic elements of the game). Our results indicate that improved Monte-Carlo policies, such as PoolRave or Last-Good-Reply, work better for games with a strong tactical element for small numbers of random simulations, whereas more general policies seem to be more suited for games with a strong strategic element for higher numbers of random simulations.
Fichier principal
Vignette du fichier
Teytaud_2015_ICGA_manuscript.pdf (610.63 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-01267056 , version 1 (03-02-2016)

Identifiants

  • HAL Id : hal-01267056 , version 1

Citer

Fabien Teytaud, Julien Dehos. On the tactical and strategic behaviour of MCTS when biasing random simulations. International Computer Games Association Journal, 2015, 38 (2), pp.67-80. ⟨hal-01267056⟩
96 Consultations
580 Téléchargements

Partager

Gmail Facebook X LinkedIn More