J. Y. Audibert, Aggregated estimators and empirical complexity for least square regression, Ann. Inst. H. Poincaré Probab. Statist., vol.40, issue.6, pp.685-736, 2004.

J. Y. Audibert and A. B. Tsybakov, Fast learning rates for plug-in classifiers, Ann. Statist., vol.35, issue.2, pp.608-633, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00160849

P. L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results, J. Mach. Learn. Res., vol.3 (Spec. Issue Comput. Learn. Theory), pp.463-482, 2002.

S. Bobkov and M. Ledoux, One-dimensional empirical measures, order statistics and Kantorovich transport distances, 2016.

E. Chzhen, C. Denis, and M. Hebiri, Minimax semi-supervised confidence sets for multi-class classification, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02112918

S. Conte and C. de Boor, Elementary Numerical Analysis: An Algorithmic Approach, 1980.

K. Dembczynski, W. Kotłowski, O. Koyejo, and N. Natarajan, Consistency analysis for binary classification revisited, ICML, pp.961-969, 2017.

A. Dvoretzky, J. Kiefer, and J. Wolfowitz, Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Statist., vol.27, issue.3, pp.642-669, 1956.

S. Keerthi, V. Sindhwani, and O. Chapelle, An efficient method for gradient-based adaptation of hyperparameters in SVM models, NIPS, pp.673-680, 2007.

O. Koyejo, N. Natarajan, P. Ravikumar, and I. Dhillon, Consistent binary classification with generalized performance metrics, NIPS, pp.2744-2752, 2014.

D. Lewis, Evaluating and optimizing autonomous text classification systems, SIGIR, pp.246-254, 1995.

P. Massart, The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality, Ann. Probab., vol.18, issue.3, pp.1269-1283, 1990.

P. Massart and É. Nédélec, Risk bounds for statistical learning, Ann. Statist., vol.34, issue.5, pp.2326-2366, 2006.

A. Menon, H. Narasimhan, S. Agarwal, and S. Chawla, On the statistical consistency of algorithms for binary classification under class imbalance, ICML, vol.28, pp.603-611, 2013.

H. Narasimhan, R. Vaish, and S. Agarwal, On the statistical consistency of plug-in classifiers for nondecomposable performance measures, NIPS, pp.1493-1501, 2014.

P. Rigollet, Generalization error bounds in semi-supervised classification under the cluster assumption, Journal of Machine Learning Research, vol.8, pp.1369-1392, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00022528

P. Rigollet and R. Vert, Optimal rates for plug-in estimators of density level sets, Bernoulli, 2009.

A. Singh, R. Nowak, and J. Zhu, Unlabeled data: Now it helps, now it doesn't, NIPS, pp.1513-1520, 2009.

A. B. Tsybakov, Introduction to nonparametric estimation, Springer Series in Statistics, 2009.

S. Vallender, Calculation of the Wasserstein distance between probability distributions on the line, Theory of Probability & Its Applications, vol.18, pp.784-786, 1974.

C. van Rijsbergen, Foundation of evaluation, Journal of Documentation, vol.30, issue.4, pp.365-373, 1974.

V. N. Vapnik, Statistical learning theory, 1998.

B. Yan, S. Koyejo, K. Zhong, and P. Ravikumar, Binary classification with karmic, threshold-quasiconcave metrics, ICML, vol.80, 2018.

Y. Yang, Minimax nonparametric classification: Rates of convergence, IEEE Transactions on Information Theory, vol.45, issue.7, pp.2271-2284, 1999.

N. Ye, K. Chai, W. Lee, and H. Chieu, Optimizing F-measures: A tale of two approaches, ICML, 2012.

M. J. Zhao, N. Edakunni, A. Pocock, and G. Brown, Beyond Fano's inequality: bounds on the optimal F-score, BER, and cost-sensitive risk and their implications, JMLR, vol.14, pp.1033-1090, 2013.