Finite-sample analysis of Lasso-TD

Mohammad Ghavamzadeh¹, Alessandro Lazaric¹, Rémi Munos¹, Matt Hoffman¹,²
¹ SEQUEL (Sequential Learning), Inria Lille - Nord Europe; LIFL - Laboratoire d'Informatique Fondamentale de Lille; LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
Abstract: In this paper, we analyze the performance of Lasso-TD, a modification of LSTD in which the projection operator is defined as a Lasso problem. We first show that Lasso-TD is guaranteed to have a unique fixed point and that its algorithmic implementation coincides with the recently presented LARS-TD and LC-TD methods. We then derive two bounds on the prediction error of Lasso-TD in the Markov design setting, i.e., when the performance is evaluated on the same states used by the method. The first bound makes no assumptions, but has a slow rate w.r.t. the number of samples. The second bound holds under an assumption on the empirical Gram matrix, called the compatibility condition, but has an improved rate and directly relates the prediction error to the sparsity of the value function in the feature space at hand.
Document type: Conference paper


https://hal.archives-ouvertes.fr/hal-00830149
Contributor: Rémi Munos
Submitted on: Tuesday, June 4, 2013

File: lasso-TD.pdf (produced by the author(s))

Identifiers

  • HAL Id: hal-00830149, version 1

Citation

Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matt Hoffman. Finite-sample analysis of Lasso-TD. International Conference on Machine Learning, 2011, United States. ⟨hal-00830149⟩


Metrics

Record views: 594
File downloads: 255