Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions

Grégoire Mialon 1, 2 Alexandre d'Aspremont 2 Julien Mairal 1
1 Thoth - Apprentissage de modèles à partir de données massives
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
Abstract : We design simple screening tests to automatically discard data samples in empirical risk minimization without losing optimization guarantees. We derive loss functions that produce dual objectives with a sparse solution. We also show how to regularize convex losses to ensure such a dual sparsity-inducing property, and propose a general method to design screening tests for classification or regression based on ellipsoidal approximations of the optimal set. In addition to producing computational gains, our approach also allows us to compress a dataset into a subset of representative points.
Document type :
Preprints, Working Papers, ...
Cited literature : 26 references

https://hal.archives-ouvertes.fr/hal-02395624
Contributor : Grégoire Mialon
Submitted on : Thursday, December 5, 2019 - 3:45:06 PM
Last modification on : Thursday, December 12, 2019 - 9:10:32 AM

File

main_safe_samples.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02395624, version 1

Citation

Grégoire Mialon, Alexandre d'Aspremont, Julien Mairal. Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions. 2019. ⟨hal-02395624⟩
