A. Antoniadis and J. Fan, Regularization of Wavelet Approximations, Journal of the American Statistical Association, vol.96, issue.455, pp.939-967, 2001.
DOI : 10.1198/016214501753208942

A. Argyriou, T. Evgeniou, and M. Pontil, Convex multi-task feature learning, Machine Learning, 2008.
DOI : 10.1007/s10994-007-5040-8
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.2025

J. F. Bonnans and A. Shapiro, Optimization problems with perturbations: A guided tour, SIAM Review, vol.40, issue.2, pp.202-227, 1998.
DOI : 10.1137/s0036144596302644
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.5528

J. F. Bonnans, J. Gilbert, C. Lemaréchal, and C. A. Sagastizábal, Numerical Optimization: Theoretical and Practical Aspects, 2003.

S. Boyd and L. Vandenberghe, Convex Optimization, 2004.

S. Canu, Y. Grandvalet, V. Guigue, and A. Rakotomamonjy, SVM and kernel methods Matlab toolbox. LITIS EA4108, INSA de Rouen, 2003.

C. Chang and C. Lin, LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, 2001.
DOI : 10.1145/1961189.1961199

O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, Choosing multiple parameters for SVM, Machine Learning, pp.131-159, 2002.

K. Crammer and Y. Singer, On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines, Journal of Machine Learning Research, vol.2, pp.265-292, 2001.

A. D. Amato, A. Antoniadis, and M. Pensky, Wavelet kernel penalized estimation for non-equispaced design regression, Statistics and Computing, vol.23, issue.1, pp.37-56, 2006.
DOI : 10.1007/s11222-006-5283-4
URL : https://hal.archives-ouvertes.fr/hal-00103268

A. d'Aspremont, Smooth Optimization with Approximate Gradient, SIAM Journal on Optimization, 2008.

D. Decoste and K. Wagstaff, Alpha seeding for support vector machines, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '00, 2000.
DOI : 10.1145/347090.347165

K. Duan and S. Keerthi, Which Is the Best Multiclass SVM Method? An Empirical Study, Multiple Classifier Systems, pp.278-285, 2005.
DOI : 10.1007/11494683_28

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression (with discussion), Annals of Statistics, pp.407-499, 2004.

G. Fung, M. Dundar, J. Bi, and B. Rao, A fast iterative algorithm for Fisher discriminant using heterogeneous kernels, Twenty-first international conference on Machine learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015409
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.2290

Y. Grandvalet, Least Absolute Shrinkage is Equivalent to Quadratic Penalization, Perspectives in Neural Computing, pp.201-206, 1998.
DOI : 10.1007/978-1-4471-1599-1_27
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.164.3797

Y. Grandvalet and S. Canu, Adaptive scaling for feature selection in SVMs, Advances in Neural Information Processing Systems, 2003.

Y. Grandvalet and S. Canu, Outcomes of the equivalence of adaptive ridge with least absolute shrinkage, 1999.

Z. Harchaoui and F. Bach, Image Classification with Segmentation Graph Kernels, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383049
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.2278

T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu, The entire regularization path for the support vector machine, Journal of Machine Learning Research, vol.5, pp.1391-1415, 2004.

C. Hsu and C. Lin, A comparison of methods for multi-class support vector machines, IEEE Transactions on Neural Networks, vol.13, pp.415-425, 2002.

T. Joachims, Making large-scale SVM learning practical, Advances in Kernel Methods - Support Vector Learning, pp.169-184, 1999.

S. Kim, A. Magnani, and S. Boyd, Optimal kernel selection in Kernel Fisher discriminant analysis, Proceedings of the 23rd international conference on Machine learning, ICML '06, 2006.
DOI : 10.1145/1143844.1143903
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.4640

K. Koh, S. Kim, and S. Boyd, An interior-point method for large-scale ℓ1-regularized logistic regression, Journal of Machine Learning Research, vol.8, pp.1519-1555, 2007.

G. Lanckriet, T. De Bie, N. Cristianini, M. Jordan, and W. Noble, A statistical framework for genomic data fusion, Bioinformatics, vol.20, issue.16, pp.2626-2635, 2004.
DOI : 10.1093/bioinformatics/bth294
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.333.2330

G. Lanckriet, N. Cristianini, L. El Ghaoui, P. Bartlett, and M. Jordan, Learning the kernel matrix with semi-definite programming, Journal of Machine Learning Research, vol.5, pp.27-72, 2004.

C. Lemaréchal and C. Sagastizábal, Practical Aspects of the Moreau-Yosida Regularization: Theoretical Preliminaries, SIAM Journal on Optimization, vol.7, issue.2, pp.867-895, 1997.
DOI : 10.1137/S1052623494267127

G. Loosli and S. Canu, Comments on the Core Vector Machines: Fast SVM Training on Very Large Data Sets, Journal of Machine Learning Research, vol.8, pp.291-301, 2007.

G. Loosli, S. Canu, S. Vishwanathan, A. Smola, and M. Chattopadhyay, Boîte à outils SVM simple et rapide, Revue d'Intelligence Artificielle, vol.19, issue.4-5, pp.741-767, 2005.
DOI : 10.3166/ria.19.741-767

D. Luenberger, Linear and Nonlinear Programming, 1984.
DOI : 10.1007/978-3-319-18842-3

C. Micchelli and M. Pontil, Learning the kernel function via regularization, Journal of Machine Learning Research, vol.6, pp.1099-1125, 2005.

A. Rakotomamonjy and S. Canu, Frames, reproducing kernels, regularization and learning, Journal of Machine Learning Research, vol.6, pp.1485-1515, 2005.

A. Rakotomamonjy, X. Mary, and S. Canu, Non parametric regression with wavelet kernels, Applied Stochastic Models in Business and Industry, pp.153-163, 2005.
DOI : 10.1002/asmb.533

A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet, More efficiency in multiple kernel learning, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.775-782, 2007.
DOI : 10.1145/1273496.1273594
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.8149

R. Rifkin and A. Klautau, In Defense of One-Vs-All Classification, Journal of Machine Learning Research, vol.5, pp.101-141, 2004.

S. Rosset, Tracking Curved Regularized Optimization Solution Paths, Advances in Neural Information Processing Systems, 2004.
DOI : 10.1214/009053606000001370
URL : http://arxiv.org/abs/0708.2197

B. Schölkopf and A. Smola, Learning with Kernels, 2001.

S. Sonnenburg, G. Rätsch, and C. Schäfer, A general and efficient algorithm for multiple kernel learning, Advances in Neural Information Processing Systems, pp.1-8, 2005.

S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf, Large scale multiple kernel learning, Journal of Machine Learning Research, vol.7, issue.1, pp.1531-1565, 2006.

I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun, Large Margin Methods for Structured and Interdependent Output Variables, Journal of Machine Learning Research, vol.6, pp.1453-1484, 2005.

V. Vapnik, S. Golowich, and A. Smola, Support Vector Method for function estimation, regression estimation and signal processing, Advances in Neural Information Processing Systems, 1997.

G. Wahba, Spline Models for Observational Data, Series in Applied Mathematics, 1990.
DOI : 10.1137/1.9781611970128

J. Weston and C. Watkins, Multiclass support vector machines, Proceedings of ESANN99, Brussels, D-Facto Press, 1999.
URL : https://hal.archives-ouvertes.fr/hal-00750277

A. Zien and C. S. Ong, Multiclass multiple kernel learning, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.1191-1198, 2007.
DOI : 10.1145/1273496.1273646
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.165.9876