Some non-asymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests

Abstract : We study generalized bootstrap confidence regions for the mean of a random vector whose coordinates have an unknown dependency structure. The random vector is assumed either to be Gaussian or to have a symmetric and bounded distribution. The dimensionality of the vector can be much larger than the number of observations, and we focus on a non-asymptotic control of the confidence level, following ideas inspired by recent results in learning theory. We consider two approaches: the first is based on a concentration principle (valid for a large class of resampling weights), and the second on a direct resampled quantile, specifically using Rademacher weights. Several intermediate results established in the concentration-based approach are of independent interest. We also discuss the question of accuracy when using Monte-Carlo approximations of the resampled quantities. We apply these results to the one-sided and two-sided multiple testing problem, for which we derive several resampling-based step-down procedures providing non-asymptotic FWER control. We compare our different procedures in a simulation study and show that they can outperform Bonferroni's or Holm's procedures as soon as the observed vector has sufficiently correlated coordinates.
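
To make the "direct resampled quantile" idea concrete, here is a minimal sketch in Python, not the authors' procedure. It estimates by Monte-Carlo the (1 - alpha) quantile of the sup-norm of the sign-flipped, centered empirical mean, using Rademacher weights. The function name `rademacher_sup_quantile`, its parameters, and the choice of the sup-norm are assumptions made for illustration. The corrective terms needed to guarantee a non-asymptotic confidence level are not reproduced here.

```python
import numpy as np


def rademacher_sup_quantile(Y, alpha=0.05, n_draws=1000, seed=0):
    """Monte-Carlo (1 - alpha) quantile of a Rademacher-resampled sup-norm.

    Y is an (n, K) array: n observations of a K-dimensional vector.
    Illustrative sketch only; it omits any correction term that a
    non-asymptotic level guarantee would require.
    """
    rng = np.random.default_rng(seed)
    n, _ = Y.shape
    # Center the observations around the empirical mean.
    centered = Y - Y.mean(axis=0)
    # Draw independent Rademacher weights (+1 or -1), one row per resample.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, n))
    # Each row is the weighted mean of the centered sample, shape (n_draws, K).
    resampled_means = signs @ centered / n
    # Sup-norm over coordinates for each resample.
    stats = np.abs(resampled_means).max(axis=1)
    # Empirical quantile of the resampled statistic.
    return float(np.quantile(stats, 1 - alpha))


if __name__ == "__main__":
    # Hypothetical data: 30 observations in dimension 500 (K much larger than n).
    data = np.random.default_rng(42).normal(size=(30, 500))
    print(rademacher_sup_quantile(data, alpha=0.05, n_draws=2000))
```

Because the sign flips are applied to whole observations, the resampled statistic inherits the dependence between coordinates. This is why such thresholds can be less conservative than a Bonferroni-type bound when coordinates are strongly correlated.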
Document type :
Journal articles
The Annals of Statistics, IMS, 2010, 38 (1), pp.51-99

https://hal.archives-ouvertes.fr/hal-00194145
Contributor : Sylvain Arlot
Submitted on : Monday, July 6, 2009 - 11:25:18 AM
Last modification on : Monday, May 29, 2017 - 2:23:41 PM
Document(s) archived on : Wednesday, September 22, 2010 - 12:45:11 PM

Files

ABR09_1_RC.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00194145, version 2
  • ARXIV : 0712.0775

Citation

Sylvain Arlot, Gilles Blanchard, Etienne Roquain. Some non-asymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests. The Annals of Statistics, IMS, 2010, 38 (1), pp.51-99. <hal-00194145v2>

Metrics

Record views : 486
Document downloads : 460