Sparse classification boundaries

Yuri I. Ingster; Christophe Pouet; Alexandre B. Tsybakov

Preprint, Working Paper. Year: 2009

Yuri I. Ingster

Role: Author

Saint Petersburg Electrotechnical University

Christophe Pouet

Role: Author
PersonId: 837501

Laboratoire d'Analyse, Topologie, Probabilités

Alexandre B. Tsybakov

Role: Author

Laboratoire de Probabilités et Modèles Aléatoires

Centre de Recherche en Économie et Statistique

Abstract

Given a training sample of size $m$ from a $d$-dimensional population, we wish to allocate a new observation $Z\in \mathbb{R}^d$ either to this population or to the noise. We suppose that the distribution of the population differs from that of the noise only by a shift, which is a sparse vector. For Gaussian noise, a fixed sample size $m$, and dimension $d$ tending to infinity, we obtain the sharp classification boundary and propose classifiers attaining this boundary. We also extend this result to the case where the sample size $m$ depends on $d$ and satisfies the condition $(\log m)/\log d \to \gamma$, $0\le \gamma<1$, and to the case of non-Gaussian noise satisfying the Cramér condition.
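To make the setting in the abstract concrete, here is a minimal simulation sketch in Python. It is not the classifier proposed in the paper: it assumes unit-variance Gaussian noise, a shift whose nonzero coordinates all share one value, and a simple thresholded plug-in linear rule chosen only for illustration. All function names, the threshold $\sqrt{2\log d}$, and the numeric values of `d`, `m`, `n_nonzero`, and `a` are hypothetical.

```python
import math
import random


def simulate_training(m, shift, rng):
    """Draw m i.i.d. observations X_i = shift + standard Gaussian noise."""
    return [[v + rng.gauss(0.0, 1.0) for v in shift] for _ in range(m)]


def fit_selected_mean(sample, threshold):
    """Coordinate-wise sample mean, kept only where sqrt(m)*|mean| > threshold.

    Returns a dict {coordinate index: estimated shift} (assumed selection rule).
    """
    m = len(sample)
    d = len(sample[0])
    means = [sum(x[k] for x in sample) / m for k in range(d)]
    return {k: mu for k, mu in enumerate(means)
            if math.sqrt(m) * abs(mu) > threshold}


def classify(z, selected):
    """Return True if z is allocated to the population, False if to the noise.

    Plug-in linear rule restricted to the selected coordinates: compare z with
    the midpoint between 0 (noise) and the estimated shift.
    """
    score = sum(mu * (z[k] - mu / 2.0) for k, mu in selected.items())
    return score > 0.0


# Illustrative run with hypothetical parameter values.
rng = random.Random(0)
d, m = 2000, 5              # dimension and training sample size
n_nonzero, a = 20, 3.0      # sparsity and size of the shift (assumed values)
shift = [a] * n_nonzero + [0.0] * (d - n_nonzero)

sample = simulate_training(m, shift, rng)
selected = fit_selected_mean(sample, threshold=math.sqrt(2.0 * math.log(d)))

z_population = [v + rng.gauss(0.0, 1.0) for v in shift]   # new point, population
z_noise = [rng.gauss(0.0, 1.0) for _ in range(d)]         # new point, pure noise
print("selected coordinates:", len(selected))
print("population point ->", classify(z_population, selected))
print("noise point      ->", classify(z_noise, selected))
```

With a shift this large the toy rule is expected to separate the two cases easily. The regime studied in the paper is the harder one, where the shift is small and sparse enough that classification is close to impossible.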

Keywords

Bayes risk; classification boundary; high-dimensional data; optimal classifier; sparse vectors

Subject areas

Statistics [math.ST]; Theory [stat.TH]

Main file

IngsterPouetTsybakov2009.pdf (328.73 KB)

Origin: Files produced by the author(s)

Contributor: Christophe Pouet

https://hal.science/hal-00371237

Submitted on: Friday, March 27, 2009, 09:24:47

Last modified on: Wednesday, April 17, 2024, 13:46:27

Long-term archiving on: Thursday, June 10, 2010, 18:54:42

Dates and versions

hal-00371237, version 1 (27-03-2009)

Identifiers

HAL Id: hal-00371237, version 1
arXiv: 0903.4807

Cite

Yuri I. Ingster, Christophe Pouet, Alexandre B. Tsybakov. Sparse classification boundaries. 2009. ⟨hal-00371237⟩

Collections

UNIV-PARIS7 X UPMC GENES PMA LATP CNRS UNIV-AMU ENSAE PARISTECH CREST ENSAI I2M LPSM X-CREST SORBONNE-UNIVERSITE SU-SCIENCES

