Zap Q-Learning for Optimal Stopping
Abstract
This paper concerns approximate solutions to the optimal stopping problem for a geometrically ergodic Markov chain on a continuous state space. The starting point is the Galerkin relaxation of the dynamic programming equations introduced by Tsitsiklis and Van Roy in the 1990s, which motivated their Q-learning algorithm for optimal stopping. It is known that the convergence rate of Q-learning is in many cases very slow. The reason for this slow convergence is explained here, along with a new variant of the Zap Q-learning algorithm designed to achieve the optimal rate of convergence. The main contribution is to establish consistency of the Zap Q-learning algorithm in a linear function approximation setting. The theoretical results are illustrated with an example from finance.