Conference paper, Year: 2019

MONK -- Outlier-Robust Mean Embedding Estimation by Median-of-Means

Abstract

Mean embeddings provide an extremely flexible and powerful tool in machine learning and statistics to represent probability distributions and define a semi-metric (MMD, maximum mean discrepancy; also called N-distance or energy distance), with numerous successful applications. The representation is constructed as the expectation of the feature map defined by a kernel. Because it is a mean, however, its classical empirical estimator can be arbitrarily severely affected by even a single outlier in the case of unbounded features. To the best of our knowledge, even the consistency of the few existing techniques that try to alleviate this serious sensitivity bottleneck is unknown. In this paper, we show how the recently emerged principle of median-of-means can be used to design estimators for kernel mean embedding and MMD with strong resistance to outliers and optimal sub-Gaussian deviation bounds under mild assumptions.
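To illustrate the median-of-means principle mentioned in the abstract, the following is a minimal sketch for the scalar case only. It is not the MONK estimator of the paper, which works with kernel mean embeddings. The function name `median_of_means`, the block count, and the synthetic data are assumptions made for illustration.

```python
import random
import statistics


def median_of_means(samples, num_blocks, seed=0):
    """Scalar median-of-means: shuffle, split into blocks, return the median of the block means."""
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Block b takes every num_blocks-th element starting at index b,
    # so block sizes differ by at most one.
    blocks = [shuffled[b::num_blocks] for b in range(num_blocks)]
    return statistics.median(statistics.fmean(block) for block in blocks)


# Illustrative data (assumption): standard Gaussian samples plus one gross outlier.
rng = random.Random(1)
clean = [rng.gauss(0.0, 1.0) for _ in range(999)]
contaminated = clean + [1e9]

naive = statistics.fmean(contaminated)                  # dominated by the outlier
robust = median_of_means(contaminated, num_blocks=11)   # outlier corrupts only one block
```

A single outlier can corrupt at most one block mean, so the median over blocks stays close to the true mean as long as fewer than half of the blocks are contaminated.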
Main file
ICML-2019_MONK.pdf (758.48 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01705881, version 1 (09-02-2018)
hal-01705881, version 2 (13-02-2018)
hal-01705881, version 3 (15-02-2018)
hal-01705881, version 4 (17-10-2018)
hal-01705881, version 5 (15-05-2019)

Identifiers

  • HAL Id: hal-01705881, version 5

Cite

Matthieu Lerasle, Zoltán Szabó, Timothée Mathieu, Guillaume Lecué. MONK -- Outlier-Robust Mean Embedding Estimation by Median-of-Means. ICML 2019 - 36th International Conference on Machine Learning, Jun 2019, Long Beach, United States. ⟨hal-01705881v5⟩
