Y. Cho and L. K. Saul, Kernel methods for deep learning, Advances in Neural Information Processing Systems 22, pp.342-350, 2009.

B. Schölkopf and A. J. Smola, Learning with kernels: Support vector machines, regularization, optimization, and beyond, 2002.

G. E. Hinton and R. R. Salakhutdinov, Reducing the Dimensionality of Data with Neural Networks, Science, vol.313, issue.5786, pp.504-507, 2006.
DOI : 10.1126/science.1127647

Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, Greedy layer-wise training of deep networks, Advances in Neural Information Processing Systems 19, pp.153-160, 2007.

P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, Extracting and composing robust features with denoising autoencoders, Proceedings of the 25th International Conference on Machine Learning, ICML '08, pp.1096-1103, 2008.
DOI : 10.1145/1390156.1390294
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.2238

R. Lengellé and T. Denoeux, Training MLPs layer by layer using an objective function for internal representations, Neural Networks, vol.9, issue.1, pp.83-97, 1996.
DOI : 10.1016/0893-6080(95)00096-8

R. Rosipal, L. J. Trejo, and B. Matthews, Kernel PLS-SVC for linear and nonlinear classification, Proceedings of the 20th International Conference on Machine Learning, pp.640-648, 2003.

K. Weinberger, J. Blitzer, and L. K. Saul, Distance metric learning for large margin nearest neighbor classification, Advances in Neural Information Processing Systems 18, pp.1473-1480, 2006.

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, Supervised dictionary learning, Advances in Neural Information Processing Systems 21, pp.1033-1040, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00322431