Playout Policy Adaptation with Move Features

Tristan Cazenave

doi:10.1016/j.tcs.2016.06.024

Journal article, Theoretical Computer Science, Year: 2016

Tristan Cazenave

Role: Author
PersonId: 743184
IdHAL: tristan-cazenave
ORCID: 0000-0003-4669-9374
IdRef: 076600289

Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision

Abstract

Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose learning a playout policy online to improve MCTS for GGP. We also propose learning the policy not only from the moves themselves but also from the features of the moves. We test the resulting algorithms, named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Features (PPAF), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough and Nogo. The experiments compare PPA and PPAF to Upper Confidence for Trees (UCT) and to the closely related Move-Average Sampling Technique (MAST) algorithm.

Keywords

Computer Games; Monte Carlo Tree Search; Reinforcement Learning; Playout policy; Machine learning

Domains

Computer Science [cs]

https://hal.science/hal-01436210

Submitted on: Monday, 16 January 2017, 11:43:37

Last modified on: Friday, 24 March 2023, 14:53:03

Dates and versions

hal-01436210, version 1 (16-01-2017)

Identifiers

HAL Id: hal-01436210, version 1
DOI: 10.1016/j.tcs.2016.06.024

Cite

Tristan Cazenave. Playout Policy Adaptation with Move Features. Theoretical Computer Science, 2016, 644, ⟨10.1016/j.tcs.2016.06.024⟩. ⟨hal-01436210⟩

Collections

CNRS, UNIV-DAUPHINE, LAMSADE-DAUPHINE, PSL
