F. R. Bach, Consistency of trace norm minimization, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00179522

F. R. Bach, G. R. Lanckriet, and M. I. Jordan, Multiple kernel learning, conic duality, and the SMO algorithm, Twenty-first international conference on Machine learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015424

F. R. Bach, R. Thibaux, and M. I. Jordan, Computing regularization paths for learning multiple kernels, Advances in Neural Information Processing Systems 17, 2004.

C. Baker, Joint measures and cross-covariance operators, Transactions of the American Mathematical Society, pp.273-289, 1973.
DOI : 10.1090/s0002-9947-1973-0336795-3

A. Berlinet and C. Thomas-Agnan, Reproducing Kernel Hilbert Spaces in Probability and Statistics, 2003.
DOI : 10.1007/978-1-4419-9096-9

O. Bousquet and D. J. Herrmann, On the complexity of learning the kernel matrix, Advances in Neural Information Processing Systems 17, 2003.

S. Boyd and L. Vandenberghe, Convex Optimization, 2003.

P. Brémaud, Markov chains, Gibbs fields, Monte Carlo simulation, and queues, 1999.

H. Brezis, Analyse Fonctionnelle, 1980.

A. Caponnetto and E. De Vito, Fast rates for regularized least-squares algorithm, 2005.

F. Cucker and S. Smale, On the mathematical foundations of learning, Bulletin of the American Mathematical Society, vol.39, issue.01, 2002.
DOI : 10.1090/S0273-0979-01-00923-5

R. Durrett, Probability: theory and examples, 2004.
DOI : 10.1017/CBO9780511779398

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, Annals of Statistics, vol.32, p.407, 2004.

W. Fu and K. Knight, Asymptotics for lasso-type estimators, The Annals of Statistics, vol.28, issue.5, pp.1356-1378, 2000.
DOI : 10.1214/aos/1015957397

K. Fukumizu, F. R. Bach, and M. I. Jordan, Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces, Journal of Machine Learning Research, vol.5, pp.73-99, 2004.

K. Fukumizu, F. R. Bach, and A. Gretton, Statistical convergence of kernel canonical correlation analysis, Journal of Machine Learning Research, vol.8, issue.8, 2007.

A. Gretton, R. Herbrich, A. Smola, O. Bousquet, and B. Schölkopf, Kernel methods for measuring independence, Journal of Machine Learning Research, vol.6, issue.12, pp.2075-2129, 2005.

Z. Harchaoui and F. R. Bach, Image Classification with Segmentation Graph Kernels, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383049

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2001.

T. J. Hastie and R. J. Tibshirani, Generalized Additive Models, 1990.

A. Juditsky and A. Nemirovski, Functional aggregation for nonparametric regression, Annals of Statistics, vol.28, issue.3, pp.681-712, 2000.

G. R. Lanckriet, T. De Bie, N. Cristianini, M. I. Jordan, and W. S. Noble, A statistical framework for genomic data fusion, Bioinformatics, vol.20, issue.16, pp.2626-2635, 2004.
DOI : 10.1093/bioinformatics/bth294

G. R. Lanckriet, N. Cristianini, L. El Ghaoui, P. Bartlett, and M. I. Jordan, Learning the kernel matrix with semidefinite programming, Journal of Machine Learning Research, vol.5, pp.27-72, 2004.

M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, Applications of second-order cone programming, Linear Algebra and its Applications, vol.284, issue.1-3, pp.193-228, 1998.
DOI : 10.1016/S0024-3795(98)10032-0

J. McAuley, J. Ming, D. Stewart, and P. Hanna, Subband correlation and robust speech recognition, IEEE Transactions on Speech and Audio Processing, vol.13, issue.5, pp.956-964, 2005.
DOI : 10.1109/TSA.2005.851952

L. Meier, S. van de Geer, and P. Bühlmann, The group lasso for logistic regression, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, 2006.
DOI : 10.1111/j.1467-9868.2007.00627.x

N. Meinshausen and B. Yu, Lasso-type recovery of sparse representations for high-dimensional data, The Annals of Statistics, vol.37, issue.1, 2006.
DOI : 10.1214/07-AOS582

M. R. Osborne, B. Presnell, and B. A. Turlach, On the lasso and its dual, Journal of Computational and Graphical Statistics, vol.9, issue.2, pp.319-337, 2000.

A. Rakotomamonjy, F. R. Bach, S. Canu, and Y. Grandvalet, More efficiency in multiple kernel learning, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273594

P. Ravikumar, H. Liu, J. Lafferty, and L. Wasserman, SpAM: Sparse additive models, Advances in Neural Information Processing Systems 22, 2008.
DOI : 10.1111/j.1467-9868.2009.00718.x

A. Renyi, On measures of dependence, Acta Mathematica Academiae Scientiarum Hungaricae, vol.1, issue.3-4, pp.441-451, 1959.
DOI : 10.1007/BF02024507

B. Schölkopf and A. J. Smola, Learning with Kernels, 2001.

S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf, Large scale multiple kernel learning, Journal of Machine Learning Research, vol.7, pp.1531-1565, 2006.

I. Steinwart, On the influence of the kernel on the consistency of support vector machines, Journal of Machine Learning Research, vol.2, pp.67-93, 2001.

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B, vol.58, issue.1, pp.267-288, 1996.

A. N. Tikhonov and V. Y. Arsenin, Solutions of ill-posed problems, 1997.

A. W. van der Vaart, Asymptotic Statistics, 1998.

M. Varma and D. Ray, Learning The Discriminative Power-Invariance Trade-Off, 2007 IEEE 11th International Conference on Computer Vision, 2007.
DOI : 10.1109/ICCV.2007.4408875

G. Wahba, Spline Models for Observational Data, SIAM, 1990.
DOI : 10.1137/1.9781611970128

M. J. Wainwright, Sharp thresholds for noisy and high-dimensional recovery of sparsity using ℓ1-constrained quadratic programming, 2006.

Q. Wu, Y. Ying, and D. Zhou, Multi-kernel regularized classifiers, Journal of Complexity, vol.23, issue.1, pp.108-134, 2007.
DOI : 10.1016/j.jco.2006.06.007

M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, pp.49-67, 2006.
DOI : 10.1198/016214502753479356

M. Yuan and Y. Lin, On the non-negative garrotte estimator, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.69, issue.2, pp.143-161, 2007.
DOI : 10.1111/j.1467-9868.2005.00503.x

P. Zhao and B. Yu, On model selection consistency of Lasso, Journal of Machine Learning Research, vol.7, pp.2541-2563, 2006.

D. Zhou and C. J. Burges, Spectral clustering and transductive learning with multiple views, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273642

H. Zhu, C. K. Williams, R. Rohwer, and M. Morciniec, Gaussian regression and optimal finite dimensional linear models, Neural Networks and Machine Learning, 1998.

H. Zou, The Adaptive Lasso and Its Oracle Properties, Journal of the American Statistical Association, vol.101, issue.476, pp.1418-1429, 2006.
DOI : 10.1198/016214506000000735