Challenging the empirical mean and empirical variance: a deviation study

Olivier Catoni 1, 2
2 CLASSIC - Computational Learning, Aggregation, Supervised Statistical, Inference, and Classification
DMA - Département de Mathématiques et Applications - ENS Paris, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt
Abstract : We present new M-estimators of the mean and variance of real-valued random variables, based on PAC-Bayes bounds. We analyze the non-asymptotic minimax properties of the deviations of those estimators for sample distributions having either a bounded variance, or both a bounded variance and a bounded kurtosis. Under those weak hypotheses, which allow for heavy-tailed distributions, we show that the worst-case deviations of the empirical mean are suboptimal. Indeed, we prove that for any confidence level, there is some M-estimator whose deviations are of the same order as the deviations of the empirical mean of a Gaussian statistical sample, even when the statistical sample is instead heavy-tailed. Experiments reveal that these new estimators perform even better than predicted by our bounds, showing deviation quantile functions uniformly lower at all probability levels than those of the empirical mean for non-Gaussian sample distributions as simple as the mixture of two Gaussian measures.
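To make the idea of an M-estimator of the mean concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the estimator as specified in the paper: the abstract does not give the influence function or its tuning, so the function `psi` (odd, non-decreasing, with logarithmic growth), the scale parameter `alpha`, and the helper name `m_estimate_mean` are all assumptions introduced here. The estimate is the root in theta of the sum of `psi(alpha * (x_i - theta))`, found by bisection.

```python
import math


def psi(x):
    """Assumed influence function: odd, non-decreasing, logarithmic growth."""
    if x >= 0:
        return math.log(1.0 + x + 0.5 * x * x)
    return -math.log(1.0 - x + 0.5 * x * x)


def m_estimate_mean(sample, alpha, tol=1e-10, max_iter=200):
    """Solve sum_i psi(alpha * (x_i - theta)) = 0 for theta by bisection.

    `alpha` is a hypothetical scale parameter; how to tune it is not
    stated in the abstract and is left to the caller here.
    """
    lo, hi = min(sample), max(sample)

    def score(theta):
        return sum(psi(alpha * (x - theta)) for x in sample)

    # score is non-increasing in theta, with score(lo) >= 0 >= score(hi),
    # so a root lies in [lo, hi].
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Usage: one gross outlier drags the empirical mean far more than the M-estimate.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0]
empirical_mean = sum(data) / len(data)
robust_mean = m_estimate_mean(data, alpha=1.0)
```

Because `psi` grows only logarithmically, a single extreme observation contributes a bounded-looking amount to the estimating equation, which is the intuition behind the improved deviations under heavy tails. The choice `alpha=1.0` above is arbitrary and only for demonstration.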
Document type :
Preprints, Working Papers, ...

https://hal.archives-ouvertes.fr/hal-00517206
Contributor : Olivier Catoni
Submitted on : Monday, September 13, 2010 - 7:20:00 PM
Last modification on : Tuesday, April 2, 2019 - 2:15:32 PM

Identifiers

  • HAL Id : hal-00517206, version 1
  • ARXIV : 1009.2048

Citation

Olivier Catoni. Challenging the empirical mean and empirical variance: a deviation study. 2010. ⟨hal-00517206⟩
