Playout Policy Adaptation with Move Features - HAL open archive
Journal article, Theoretical Computer Science, Year: 2016

Playout Policy Adaptation with Move Features

Abstract

Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose learning a playout policy online to improve MCTS for GGP. We also propose learning the policy from the features of the moves in addition to the moves themselves. We test the resulting algorithms, named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Features (PPAF), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough and Nogo. The experiments compare PPA and PPAF to Upper Confidence for Trees (UCT) and to the closely related Move-Average Sampling Technique (MAST) algorithm.
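
The abstract does not spell out the update rule, so the following is only a minimal sketch of one common way to learn a playout policy online: moves are sampled with a softmax over learned weights, and after each playout the winner's moves are reinforced. The function names, the `alpha` step size, and the `(player, legal_moves, chosen_move)` playout format are assumptions made for illustration, not details taken from this record. Under the same assumptions, a feature-aware variant in the spirit of PPAF could use keys such as `(move, feature)` instead of the bare move.

```python
import math
import random
from collections import defaultdict


def softmax_probs(weights, moves):
    """Return softmax probabilities over `moves` from their learned weights."""
    top = max(weights[mv] for mv in moves)  # subtract the max for numerical stability
    exps = [math.exp(weights[mv] - top) for mv in moves]
    total = sum(exps)
    return [e / total for e in exps]


def sample_move(weights, moves, rng=random):
    """Sample a playout move with probability proportional to exp(weight)."""
    return rng.choices(moves, weights=softmax_probs(weights, moves), k=1)[0]


def adapt(weights, playout, winner, alpha=1.0):
    """Return new weights that reinforce the winner's moves in one playout.

    `playout` is a list of (player, legal_moves, chosen_move) tuples
    (a hypothetical format chosen for this sketch).
    """
    updated = weights.copy()
    for player, legal, chosen in playout:
        if player != winner:
            continue
        # Probabilities come from the weights used during the playout.
        probs = softmax_probs(weights, legal)
        updated[chosen] += alpha           # push the chosen move up
        for mv, p in zip(legal, probs):
            updated[mv] -= alpha * p       # pull every legal move down by its probability
    return updated


weights = defaultdict(float)               # all moves start with weight 0
playout = [(0, ["a", "b"], "a"), (1, ["c", "d"], "c")]
weights = adapt(weights, playout, winner=0)
```

After this single update, move `"a"` is more likely than `"b"` in later playouts, while the loser's moves keep their weights. The updates for each position sum to zero, which keeps the weights from drifting.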

Dates and versions

hal-01436210, version 1 (16-01-2017)

Identifiers

Cite

Tristan Cazenave. Playout Policy Adaptation with Move Features. Theoretical Computer Science, 2016, 644, ⟨10.1016/j.tcs.2016.06.024⟩. ⟨hal-01436210⟩