B. Schölkopf and A. J. Smola, Learning with Kernels, 2002.

J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, 2004.
DOI : 10.1017/CBO9780511809682

P. Zhao and B. Yu, On model selection consistency of Lasso, J. Mach. Learn. Res., vol.7, pp.2541-2563, 2006.

F. R. Bach, Consistency of the group Lasso and multiple kernel learning, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00164735

F. R. Bach, G. R. Lanckriet, and M. I. Jordan, Multiple kernel learning, conic duality, and the SMO algorithm, Twenty-first international conference on Machine learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015424

P. Zhao, G. Rocha, and B. Yu, Grouped and hierarchical model selection through composite absolute penalties, Annals of Statistics, 2008.

M. Szafranski, Y. Grandvalet, and A. Rakotomamonjy, Composite kernel learning, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.1040-1047, 2008.
DOI : 10.1145/1390156.1390287
URL : https://hal.archives-ouvertes.fr/hal-00316016

C. K. Williams and M. Seeger, The effect of the input density distribution on kernel-based classifiers, Proc. ICML, 2000.

A. Rakotomamonjy, F. R. Bach, S. Canu, and Y. Grandvalet, More efficiency in multiple kernel learning, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273594

M. Pontil and C. A. Micchelli, Learning the kernel function via regularization, J. Mach. Learn. Res., vol.6, pp.1099-1125, 2005.

H. Lee, A. Battle, R. Raina, and A. Ng, Efficient sparse coding algorithms, NIPS, 2007.

S. Boyd and L. Vandenberghe, Convex Optimization, 2003.

K. Bennett, M. Momma, and J. Embrechts, MARK: a boosting algorithm for heterogeneous kernel models, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, 2002.
DOI : 10.1145/775047.775051

V. Roth, The Generalized LASSO, IEEE Transactions on Neural Networks, vol.15, issue.1, 2004.
DOI : 10.1109/TNN.2003.809398

K. Grauman and T. Darrell, The pyramid match kernel: Efficient learning with sets of features, J. Mach. Learn. Res., vol.8, pp.725-760, 2007.

F. R. Bach, R. Thibaux, and M. I. Jordan, Computing regularization paths for learning multiple kernels, Adv. NIPS 17, 2004.

S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf, Large scale multiple kernel learning, J. Mach. Learn. Res., vol.7, pp.1531-1565, 2006.

H. Zou, The Adaptive Lasso and Its Oracle Properties, Journal of the American Statistical Association, vol.101, issue.476, pp.1418-1429, 2006.
DOI : 10.1198/016214506000000735

W. Fu and K. Knight, Asymptotics for lasso-type estimators, The Annals of Statistics, vol.28, issue.5, pp.1356-1378, 2000.
DOI : 10.1214/aos/1015957397

M. Yuan and Y. Lin, On the non-negative garrotte estimator, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.69, issue.2, pp.143-161, 2007.
DOI : 10.1111/j.1467-9868.2005.00503.x