Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling - HAL open archive
Preprint, Working Paper. Year: 2013

Abstract

The goal of this paper is to extend the \textit{Empirical Risk Minimization} (ERM) paradigm, from a practical perspective, to the situation where a natural estimate of the risk takes the form of a $K$-sample $U$-statistic, as is the case in the $K$-partite ranking problem, for instance. In such situations, the numerical computation of the empirical risk is hardly feasible, if not infeasible, even for moderate sample sizes. Precisely, it involves averaging $O(n^{d_1+\ldots+d_K})$ terms when considering a $U$-statistic of degrees $(d_1, \ldots, d_K)$ based on samples of sizes proportional to $n$. We propose here to consider a drastically simpler Monte-Carlo version of the empirical risk based on only $O(n)$ terms, which can be viewed as an \textit{incomplete generalized $U$-statistic}, and prove that, remarkably, the approximation stage does not damage the ERM procedure and yields a learning rate of order $O_{\mathbb{P}}(1/\sqrt{n})$. Beyond a theoretical analysis guaranteeing the validity of this approach, numerical experiments are displayed for illustrative purposes.
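
To make the cost gap concrete, here is a minimal sketch (not the authors' code) for the simplest case $K = 2$ with degrees $(1, 1)$, using the Mann-Whitney (AUC) kernel that arises in bipartite ranking. The function names, the Gaussian toy data, and the choice of drawing index pairs uniformly with replacement are assumptions made for illustration; the paper itself should be consulted for the exact sampling scheme and the general $(d_1, \ldots, d_K)$ setting.

```python
import numpy as np


def auc_kernel(a, b):
    # Mann-Whitney kernel: 1 if a > b, 1/2 on ties, 0 otherwise.
    return (a > b) + 0.5 * (a == b)


def complete_u_statistic(x, y, kernel):
    # Average the kernel over all len(x) * len(y) pairs.
    return kernel(x[:, None], y[None, :]).mean()


def incomplete_u_statistic(x, y, kernel, num_terms, rng):
    # Monte Carlo version: average the kernel over `num_terms` index pairs
    # drawn uniformly with replacement (an assumption of this sketch).
    i = rng.integers(0, len(x), size=num_terms)
    j = rng.integers(0, len(y), size=num_terms)
    return kernel(x[i], y[j]).mean()


rng = np.random.default_rng(0)
n = 2000
pos = rng.normal(1.0, 1.0, size=n)  # toy scores for the "positive" sample
neg = rng.normal(0.0, 1.0, size=n)  # toy scores for the "negative" sample

# Complete statistic: n * n = 4,000,000 kernel evaluations.
full = complete_u_statistic(pos, neg, auc_kernel)
# Incomplete statistic: only n = 2,000 kernel evaluations.
approx = incomplete_u_statistic(pos, neg, auc_kernel, num_terms=n, rng=rng)

print(f"complete: {full:.4f}  incomplete: {approx:.4f}")
```

In this toy setting both quantities estimate the same population AUC, and the incomplete version adds a sampling error of order $1/\sqrt{n}$, which matches the order of the learning rate stated in the abstract.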
Main file
SIAM_DM13-1.pdf (332.47 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00809487, version 1 (09-04-2013)

Identifiers

  • HAL Id: hal-00809487, version 1

Cite

Stéphan Clémençon, Sylvain Robbiano, Jessica Tressou. Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling. 2013. ⟨hal-00809487⟩