Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling

Stéphan Clémençon; Sylvain Robbiano; Jessica Tressou

Preprint, Working Paper. Year: 2013


Stéphan Clémençon

Role: Author
PersonId : 174491
IdHAL : stephan-clemencon
ORCID : 0000-0002-5879-9500
IdRef : 08905203X

Laboratoire Traitement et Communication de l'Information

Sylvain Robbiano

Role: Author
PersonId : 911882

Laboratoire Traitement et Communication de l'Information

Jessica Tressou

Role: Author
PersonId : 14614
IdHAL : jessica-tressou
ORCID : 0000-0002-5698-9837
IdRef : 103568611

Méthodologies d'Analyse de Risque Alimentaire

Abstract

The goal of this paper is to extend the \textit{Empirical Risk Minimization} (ERM) paradigm, from a practical perspective, to the situation where a natural estimate of the risk takes the form of a $K$-sample $U$-statistic, as is the case in the $K$-partite ranking problem, for instance. In this setting, the numerical computation of the empirical risk is hardly feasible, if not infeasible, even for moderate sample sizes: it involves averaging $O(n^{d_1+\ldots+d_K})$ terms when considering a $U$-statistic of degrees $(d_1, \ldots, d_K)$ based on samples of sizes proportional to $n$. We propose to consider a drastically simpler Monte-Carlo version of the empirical risk based on only $O(n)$ terms, which can be viewed as an \textit{incomplete generalized $U$-statistic}, and prove that, remarkably, the approximation stage does not damage the ERM procedure and yields a learning rate of order $O_{\mathbb{P}}(1/\sqrt{n})$. Beyond a theoretical analysis guaranteeing the validity of this approach, numerical experiments are presented for illustrative purposes.
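The contrast between the complete and the incomplete statistic can be illustrated with a minimal sketch. It assumes a two-sample statistic of degrees $(1, 1)$ with an AUC-type kernel, and it assumes the sampled terms are drawn uniformly with replacement; the function names, the kernel, the Gaussian toy data, and the budget of $2n$ terms are illustrative choices, not taken from the paper.

```python
import random


def auc_kernel(x, y):
    """Hypothetical kernel: 1 if x ranks above y, 0.5 on ties, else 0."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def complete_u_stat(xs, ys, kernel):
    """Average the kernel over all len(xs) * len(ys) pairs."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def incomplete_u_stat(xs, ys, kernel, n_terms, rng):
    """Average the kernel over n_terms pairs sampled uniformly at random.

    Assumption: pairs are drawn with replacement, so the cost is
    O(n_terms) kernel evaluations instead of O(len(xs) * len(ys)).
    """
    total = 0.0
    for _ in range(n_terms):
        total += kernel(rng.choice(xs), rng.choice(ys))
    return total / n_terms


# Toy data (assumed): two Gaussian samples of size n each.
rng = random.Random(0)
n = 500
pos = [rng.gauss(1.0, 1.0) for _ in range(n)]
neg = [rng.gauss(0.0, 1.0) for _ in range(n)]

full = complete_u_stat(pos, neg, auc_kernel)  # n * n = 250000 terms
approx = incomplete_u_stat(pos, neg, auc_kernel, n_terms=2 * n, rng=rng)  # 1000 terms
print(f"complete: {full:.4f}  incomplete: {approx:.4f}")
```

In this sketch the incomplete version evaluates the kernel a number of times that grows linearly in $n$, whereas the complete version grows quadratically; for higher degrees $(d_1, \ldots, d_K)$ the gap widens accordingly.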

Keywords

Empirical risk minimization; risk sampling; incomplete $U$-statistics; ranking; minimum-volume set

Domains

Machine Learning [stat.ML]

Main file

SIAM_DM13-1.pdf (332.47 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-00809487

Submitted on: Tuesday, April 9, 2013, 14:22:57

Last modified on: Tuesday, March 12, 2024, 10:44:43

Long-term archiving on: Monday, April 3, 2017, 02:44:53

Dates and versions

hal-00809487, version 1 (09-04-2013)

Identifiers

HAL Id: hal-00809487, version 1

Cite

Stéphan Clémençon, Sylvain Robbiano, Jessica Tressou. Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling. 2013. ⟨hal-00809487⟩

Collections

INSTITUT-TELECOM CNRS INRA PARISTECH MIA-PARIS LTCI IDS S2A INRAE MATHNUM

214 Views

566 Downloads
