Finite-sample analysis of M-estimators using self-concordance

Dmitrii M. Ostrovskii; Francis Bach

Preprint, Working Paper. Year: 2020


Dmitrii M. Ostrovskii

Role: Author
PersonId : 1030876

University of Southern California

Francis Bach

Role: Author

Département d'informatique - ENS Paris

Statistical Machine Learning and Parsimony

Abstract

The classical asymptotic theory for parametric $M$-estimators guarantees that, in the limit of infinite sample size, the excess risk has a chi-square type distribution, even in the misspecified case. We demonstrate how self-concordance of the loss allows us to characterize the critical sample size sufficient to guarantee a chi-square type in-probability bound for the excess risk. Specifically, we consider two classes of losses: (i) self-concordant losses in the classical sense of Nesterov and Nemirovski, i.e., those whose third derivative is uniformly bounded by the $3/2$ power of the second derivative; (ii) pseudo self-concordant losses, for which the power is removed. These classes contain losses corresponding to several generalized linear models, including the logistic loss and pseudo-Huber losses. Our basic result, under minimal assumptions, bounds the critical sample size by $O(d \cdot d_{\text{eff}})$, where $d$ is the parameter dimension and $d_{\text{eff}}$ is the effective dimension that accounts for model misspecification. In contrast to existing results, we only impose local assumptions that concern the population risk minimizer $\theta_*$. Namely, we assume that the calibrated design, i.e., the design scaled by the square root of the second derivative of the loss, is subgaussian at $\theta_*$. In addition, for type-(ii) losses we require boundedness of a certain measure of curvature of the population risk at $\theta_*$. Our improved result bounds the critical sample size from above as $O(\max\{d_{\text{eff}}, d \log d\})$ under slightly stronger assumptions. Namely, the local assumptions must hold in the neighborhood of $\theta_*$ given by the Dikin ellipsoid of the population risk. Interestingly, we find that, for logistic regression with Gaussian design, there is no actual restriction of conditions: the subgaussian parameter and curvature measure remain near-constant over the Dikin ellipsoid.
Finally, we extend some of these results to $\ell_1$-penalized estimators in high dimensions.
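
The two loss classes named in the abstract can be illustrated numerically on the logistic loss. The sketch below is not taken from the paper: it assumes the scalar form $\phi(t) = \log(1 + e^{-t})$ and uses the standard closed-form derivatives $\phi'' = \sigma(1-\sigma)$ and $\phi''' = \sigma(1-\sigma)(1-2\sigma)$, where $\sigma$ is the sigmoid. The helper names (`pseudo_ratio`, `classical_ratio`) and the constant 1 in the check are illustrative assumptions, not notation from the abstract.

```python
import math


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def logistic_derivatives(t):
    """Second and third derivatives of phi(t) = log(1 + exp(-t))."""
    s = sigmoid(t)
    d2 = s * (1.0 - s)           # phi''(t)
    d3 = d2 * (1.0 - 2.0 * s)    # phi'''(t)
    return d2, d3


def pseudo_ratio(t):
    # |phi'''| / phi'': bounded for a pseudo self-concordant loss (class ii).
    d2, d3 = logistic_derivatives(t)
    return abs(d3) / d2


def classical_ratio(t):
    # |phi'''| / (phi'')^{3/2}: bounded for a classically
    # self-concordant loss (class i).
    d2, d3 = logistic_derivatives(t)
    return abs(d3) / d2 ** 1.5


if __name__ == "__main__":
    for t in (0.0, 2.0, 5.0, 10.0, 20.0):
        print(f"t={t:5.1f}  pseudo={pseudo_ratio(t):.4f}  "
              f"classical={classical_ratio(t):.4g}")
```

On this grid the class-(ii) ratio should stay at or below 1, while the class-(i) ratio keeps growing with $|t|$, which suggests that the logistic loss fits the pseudo self-concordant class rather than the classical one with a uniform constant.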

Subject areas

Statistics Theory [stat.TH], Machine Learning [stat.ML], Optimization and Control [math.OC]

Main file

main-self-concordant.pdf (1.69 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01895127

Submitted on: Monday, November 30, 2020, 05:53:55

Last modified on: Saturday, April 20, 2024, 03:09:12

Dates and versions

hal-01895127, version 1 (14-10-2018)

hal-01895127, version 2 (15-11-2020)

hal-01895127, version 3 (30-11-2020)

Identifiers

HAL Id: hal-01895127, version 3
arXiv: 1810.06838

Cite

Dmitrii M. Ostrovskii, Francis Bach. Finite-sample analysis of M-estimators using self-concordance. 2020. ⟨hal-01895127v3⟩

Collections

ENS-PARIS UNIV-RENNES1 CNRS INRIA IRISA INRIA2 TDS-MACS PSL UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

3191 views

244 downloads
