No fast exponential deviation inequalities for the progressive mixture rule
Abstract
We consider the learning task of predicting as well as the best function in a finite reference set G, up to the smallest possible additive term. If R(g) denotes the generalization error of a prediction function g, then under reasonable assumptions on the loss function (typically satisfied by the least squares loss when the output is bounded), the progressive mixture rule g_n is known to satisfy E R(g_n) < min_{g in G} R(g) + C (log|G|)/n, where n denotes the size of the training set, E denotes the expectation with respect to the training set distribution, and C is a positive constant. The main result of this work is that for any training set size n there exist a > 0, a reference set G and a probability distribution generating the data such that, with probability at least a, R(g_n) > min_{g in G} R(g) + c sqrt{[log(|G|/a)]/n}, where c is a positive constant. In other words, surprisingly, for an appropriate reference set G, the deviation convergence rate of the progressive mixture rule is only of order 1/sqrt{n}, while its expectation convergence rate is of order 1/n. The same conclusion holds for the progressive indirect mixture rule. This work also highlights the suboptimality of algorithms based on penalized empirical risk minimization on G.
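To make the object of study concrete, here is a minimal sketch of the progressive mixture rule under squared loss, assuming a finite reference set G of prediction functions. The function name, interface and the choice of the temperature parameter `lam` are illustrative, not taken from the paper: at each step i the rule forms a Gibbs mixture over G weighted by exponentiated cumulative losses on the first i examples, and the final predictor g_n averages these mixtures over i = 0, ..., n.

```python
import numpy as np

def progressive_mixture(G, X, y, lam=1.0):
    """Sketch of the progressive mixture rule for squared loss.

    G   : finite reference set, given as a list of prediction functions
    X, y: the n training inputs and outputs
    lam : inverse-temperature parameter (illustrative choice)
    """
    n = len(y)
    cum_loss = np.zeros(len(G))  # cumulative squared loss of each g in G
    weight_history = []
    for i in range(n + 1):
        # Gibbs weights on G from losses on the first i examples
        # (subtracting the min is a standard numerical stabilization)
        w = np.exp(-lam * (cum_loss - cum_loss.min()))
        weight_history.append(w / w.sum())
        if i < n:  # account for the (i+1)-th training example
            cum_loss += np.array([(g(X[i]) - y[i]) ** 2 for g in G])

    def g_n(x):
        # Average over i of the i-th Gibbs mixture's prediction at x
        preds = np.array([g(x) for g in G])
        return float(np.mean([w @ preds for w in weight_history]))

    return g_n
```

For instance, with G containing the two constant functions x -> 0 and x -> 1 and data generated with y identically 1, the averaged mixture g_n concentrates its weight on the second function as n grows, so its prediction approaches 1.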
Origin: files produced by the author(s)