C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, vol.20, issue.3, pp.273-297, 1995.
DOI : 10.1007/BF00994018

K. Crammer and Y. Singer, On the algorithmic implementation of multiclass kernel-based vector machines, J. Mach. Learn. Res, vol.2, pp.265-292, 2001.

I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, Learning realistic human actions from movies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587756

URL : https://hal.archives-ouvertes.fr/inria-00548659

D. Martín-Iglesias, J. Bernal-Chaves, C. Peláez-Moreno, A. Gallardo-Antolín, and F. Díaz-de-María, A Speech Recognizer Based on Multiclass SVMs with HMM-Guided Segmentation, Nonlinear Analyses and Algorithms for Speech Processing, pp.257-266, 2005.
DOI : 10.1007/11613107_22

I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun, Large margin methods for structured and interdependent output variables, J. Mach. Learn. Res, vol.6, pp.1453-1484, 2005.

T. M. Cover, Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition, IEEE Transactions on Electronic Computers, vol.14, issue.3, pp.326-334, 1965.
DOI : 10.1109/PGEC.1965.264137

M. Aizerman, E. Braverman, and L. Rozonoer, Theoretical foundations of the potential function method in pattern recognition learning, Automation and Remote Control, vol.25, pp.821-837, 1964.

T. Joachims, T. Finley, and C.-N. J. Yu, Cutting-plane training of structural SVMs, Machine Learning, vol.77, issue.1, pp.27-59, 2009.
DOI : 10.1007/s10994-009-5108-8

URL : https://link.springer.com/content/pdf/10.1007%2Fs10994-009-5108-8.pdf

F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Optimization with Sparsity-Inducing Penalties, Foundations and Trends in Machine Learning, vol.4, issue.1, pp.1-106, 2012.
DOI : 10.1561/2200000015

URL : https://hal.archives-ouvertes.fr/hal-00613125

A. Jalali, P. Ravikumar, S. Sanghavi, and C. Ruan, A dirty model for multi-task learning, Adv. Neural Inf. Process. Syst, vol.23, pp.964-972, 2010.

A. Quattoni, X. Carreras, M. Collins, and T. Darrell, An efficient projection for l1,∞ regularization, Int. Conf. Mach. Learn. (ICML), 2009.
DOI : 10.1145/1553374.1553484

URL : http://dspace.mit.edu/bitstream/1721.1/59367/1/Collins_An%20efficient.pdf

L. Wang, X. Shen, and Y. F. Zheng, On L1-norm multi-class support vector machines, ICMLA, pp.14-16

J. Langford, L. Li, and T. Zhang, Sparse online learning via truncated gradient, J. Mach. Learn. Res, vol.10, pp.777-801, 2009.

R. I. Boţ, A. Heinrich, and G. Wanka, Employing different loss functions for the classification of images via supervised learning, preprint, www.mat.univie.ac.at/~rabot, 2013.

M. Blondel, K. Seki, and K. Uehara, Block coordinate descent algorithms for large-scale sparse multiclass classification, Machine Learning, vol.93, issue.1, pp.31-52, 2013.

K. Koh, S. Kim, and S. Boyd, An interior-point method for large-scale l1-regularized logistic regression, J. Mach. Learn. Res, vol.8, pp.1519-1555, 2007.

B. Krishnapuram, L. Carin, M. A. Figueiredo, and A. J. Hartemink, Sparse multinomial logistic regression: fast algorithms and generalization bounds, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.27, issue.6, 2005.
DOI : 10.1109/TPAMI.2005.127

URL : http://www.cs.duke.edu/~amink/publications/manuscripts/hartemink05.pami.pdf

H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2011.
DOI : 10.1007/978-3-319-48311-5

C. Chaux, P. L. Combettes, J. Pesquet, and V. R. Wajs, A variational formulation for frame-based inverse problems, Inverse Problems, vol.23, issue.4, 2007.
DOI : 10.1088/0266-5611/23/4/008

URL : https://hal.archives-ouvertes.fr/hal-00621883

P. L. Combettes and J. Pesquet, A proximal decomposition method for solving convex variational inverse problems, Inverse Problems, vol.24, issue.6, 2008.
DOI : 10.1088/0266-5611/24/6/065014

URL : https://hal.archives-ouvertes.fr/hal-00692901

J. M. Fadili and G. Peyré, Total Variation Projection With First Order Schemes, IEEE Transactions on Image Processing, vol.20, issue.3, pp.657-669, 2011.
DOI : 10.1109/TIP.2010.2072512

URL : https://hal.archives-ouvertes.fr/hal-00380491

B. Taskar, C. Guestrin, and D. Koller, Max-margin Markov networks, Advances in Neural Information Processing Systems, pp.25-32, 2004.

G. Chierchia, N. Pustelnik, J. Pesquet, and B. Pesquet-Popescu, Epigraphical splitting for solving constrained convex formulations of inverse problems with proximal tools, 2013.
DOI : 10.1007/s11760-014-0664-1

URL : http://hal-enpc.archives-ouvertes.fr/docs/00/75/23/63/PDF/hal.pdf

J. Pesquet and N. Pustelnik, A parallel inertial proximal optimization method, Pac. J. Optim, vol.8, issue.2, pp.273-305, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00790702

A. Chambolle and T. Pock, A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging, Journal of Mathematical Imaging and Vision, vol.40, issue.1, 2011.
DOI : 10.1007/s10851-010-0251-1

URL : https://hal.archives-ouvertes.fr/hal-00490826

P. L. Combettes and J. Pesquet, Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators, Set-Valued Var. Anal, vol.20, issue.2, pp.307-330, 2012.
DOI : 10.1007/s11228-011-0191-y

URL : http://www.ann.jussieu.fr/%7Eplc/svva2.pdf

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

URL : http://www.cs.berkeley.edu/~daf/appsem/Handwriting/papers/00726791.pdf

J. Bruna and S. Mallat, Invariant Scattering Convolution Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1872-1886, 2013.
DOI : 10.1109/TPAMI.2012.230

URL : http://arxiv.org/pdf/1203.1513