Finite-sample analysis of Lasso-TD

Mohammad Ghavamzadeh; Alessandro Lazaric; Rémi Munos; Matt Hoffman

Conference paper, Year: 2011


Mohammad Ghavamzadeh
Role: Author
PersonId: 868946
Affiliation: Sequential Learning

Alessandro Lazaric
Role: Author
PersonId: 851
IdHAL: alessandro-lazaric
ORCID: 0000-0002-8970-413X
IdRef: 188701486
Affiliation: Sequential Learning

Rémi Munos
Role: Author
PersonId: 836863
Affiliation: Sequential Learning

Matt Hoffman
Role: Author
Affiliations: Sequential Learning; Department of Computing Science [Edmonton]

Abstract

In this paper, we analyze the performance of Lasso-TD, a modification of LSTD in which the projection operator is defined as a Lasso problem. We first show that Lasso-TD is guaranteed to have a unique fixed point and that its algorithmic implementation coincides with the recently presented LARS-TD and LC-TD methods. We then derive two bounds on the prediction error of Lasso-TD in the Markov design setting, i.e., when the performance is evaluated on the same states used by the method. The first bound requires no assumptions but has a slow rate with respect to the number of samples. The second bound holds under an assumption on the empirical Gram matrix, called the compatibility condition; it has an improved rate and directly relates the prediction error to the sparsity of the value function in the feature space at hand.
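
For orientation, the fixed point that the abstract refers to can be written out as below. This is a sketch in notation assumed here, not taken from this record: $\Phi \in \mathbb{R}^{n \times d}$ holds the features of the $n$ sampled states, $\Phi'$ the features of their successor states, $r \in \mathbb{R}^n$ the observed rewards, $\gamma$ the discount factor, $\lambda \ge 0$ the regularization parameter, and $\|\cdot\|_n$ the empirical norm over the samples. The paper's own definitions may differ in scaling or notation.

```latex
% LSTD (assumed notation): fixed point of the least-squares projection
\hat\theta = \arg\min_{u \in \mathbb{R}^d}
  \bigl\| \Phi u - \bigl( r + \gamma \Phi' \hat\theta \bigr) \bigr\|_n^2

% Lasso-TD (assumed notation): the projection step is replaced by a Lasso problem
\tilde\theta = \arg\min_{u \in \mathbb{R}^d}
  \bigl\| \Phi u - \bigl( r + \gamma \Phi' \tilde\theta \bigr) \bigr\|_n^2
  + \lambda \, \| u \|_1
```

In both cases the unknown appears on both sides of the equation, so the solution is a fixed point rather than the minimizer of a single regression problem. In the Markov design setting, the prediction error would then be measured at the sampled states themselves, for example as $\| v - \Phi \tilde\theta \|_n$ for the target value function $v$.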

Domains

Machine Learning [cs.LG]

Main file

lasso-TD.pdf (223.14 KB)

Origin: Files produced by the author(s)

Contributor: Rémi Munos

https://hal.science/hal-00830149

Submitted on: Tuesday, June 4, 2013, 15:47:57

Last modified on: Friday, March 24, 2023, 14:52:57

Long-term archiving on: Thursday, September 5, 2013, 04:22:44

Dates and versions

hal-00830149 , version 1 (04-06-2013)

Identifiers

HAL Id : hal-00830149 , version 1

Cite

Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matt Hoffman. Finite-sample analysis of Lasso-TD. International Conference on Machine Learning, 2011, United States. ⟨hal-00830149⟩


Collections

UNIV-LILLE3 CNRS INRIA LAGIS INRIA2

300 views

255 downloads
