Zap Meets Momentum: Stochastic Approximation Algorithms with Optimal Convergence Rate

Adithya M. Devraj (1), Ana Bušic (2, 3), Sean Meyn (1)
2 DYOGENE - Dynamics of Geometric Networks
DI-ENS - Département d'informatique de l'École normale supérieure, CNRS - Centre National de la Recherche Scientifique : UMR 8548, Inria de Paris
Abstract : Two well-known stochastic approximation techniques achieve the optimal rate of convergence (measured in terms of asymptotic variance): the Ruppert-Polyak averaging technique, and stochastic Newton-Raphson (SNR), a matrix gain algorithm that resembles the deterministic Newton-Raphson method. The Zap algorithms introduced by the authors are a version of SNR designed to behave more like their deterministic cousin. Estimates from the Zap Q-learning algorithm are found to converge remarkably quickly, but the per-iteration complexity can be high. This paper introduces an entirely new class of stochastic approximation algorithms based on matrix momentum. For a special choice of the matrix momentum and gain sequences, simulations show that the parameter estimates obtained from the algorithm couple with those obtained from the more complex stochastic Newton-Raphson algorithm. Conditions under which coupling is guaranteed are established for a class of linear recursions. Optimal finite-$n$ error bounds are also obtained. The main objective of this work is to create more efficient algorithms for applications to reinforcement learning. Numerical results illustrate the value of these techniques in this setting.
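To make the two baseline techniques named in the abstract concrete, here is a minimal sketch on a scalar linear root-finding problem. The toy problem, the function name `run_linear_sa`, the noise model, and the step-size exponent `rho` are all assumptions made for illustration; none of them come from the paper, and the matrix momentum algorithms themselves are not reproduced here.

```python
import random


def run_linear_sa(n_steps=20000, a=-1.0, b=-2.0, rho=0.7, seed=0):
    """Compare Ruppert-Polyak averaging and SNR on a scalar linear problem.

    Hypothetical toy setup (not from the paper): estimate theta* solving
    a * theta* - b = 0 from noisy samples A_n = a + noise, b_n = b + noise.
    """
    rng = random.Random(seed)
    theta_sa = 0.0   # plain SA iterate, driven by the step size n^(-rho)
    theta_avg = 0.0  # running average of the SA iterates (Ruppert-Polyak)
    theta_snr = 0.0  # stochastic Newton-Raphson iterate
    a_hat = 0.0      # running sample mean of A_n (scalar "matrix gain")

    for n in range(1, n_steps + 1):
        # Bounded noise keeps every A_n, and hence a_hat, strictly negative,
        # so the division in the SNR update below is always well defined.
        A_n = a + rng.uniform(-0.5, 0.5)
        b_n = b + rng.uniform(-0.5, 0.5)

        # Ruppert-Polyak: run SA with a slowly decaying gain, then average.
        theta_sa += n ** (-rho) * (A_n * theta_sa - b_n)
        theta_avg += (theta_sa - theta_avg) / n

        # SNR: gain 1/n, scaled by the inverse of the running estimate of a.
        a_hat += (A_n - a_hat) / n
        theta_snr -= (A_n * theta_snr - b_n) / (n * a_hat)

    return theta_avg, theta_snr, b / a


if __name__ == "__main__":
    avg, snr, star = run_linear_sa()
    print(f"averaged SA: {avg:.4f}  SNR: {snr:.4f}  target: {star:.4f}")
```

In this scalar case the SNR gain is a single division. In higher dimensions the corresponding step requires a matrix inverse or linear solve at every iteration, which is one way to read the abstract's remark about per-iteration complexity.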

https://hal.archives-ouvertes.fr/hal-01968558
Contributor : Ana Busic
Submitted on : Wednesday, January 2, 2019 - 7:34:23 PM
Last modification on : Thursday, October 17, 2019 - 12:36:05 PM

Identifiers

  • HAL Id : hal-01968558, version 1
  • ARXIV : 1809.06277

Citation

Adithya M. Devraj, Ana Bušic, Sean Meyn. Zap Meets Momentum: Stochastic Approximation Algorithms with Optimal Convergence Rate. 2018. ⟨hal-01968558⟩
