G. Alain and Y. Bengio, What Regularized Auto-Encoders Learn from the Data Generating Distribution, 2013.

C. G. Atkeson and S. Schaal, Memory-based neural networks for robot learning, Neurocomputing, vol.9, issue.3, pp.243-269, 1995.
DOI : 10.1016/0925-2312(95)00033-6

M. F. Augusteijn and T. P. Harrington, Evolving transfer functions for artificial neural networks, Neural Computing & Applications, vol.13, issue.1, pp.38-46, 2004.
DOI : 10.1007/s00521-003-0393-9

Y. Bengio, Learning Deep Architectures for AI, Foundations and Trends® in Machine Learning, vol.2, issue.1, pp.1-127, 2009.
DOI : 10.1561/2200000006

Y. Bengio, A. Courville, and P. Vincent, Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013.
DOI : 10.1109/TPAMI.2013.50

C. M. Bishop, Neural Networks for Pattern Recognition, 1995.

C. M. Bishop, Model-based machine learning, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol.371, issue.1984, 2013.
DOI : 10.1098/rsta.2012.0222

C. M. Bishop, Pattern recognition and machine learning, 2006.

P. Bloomfield and W. Steiger, Least Absolute Deviations Curve-Fitting, SIAM Journal on Scientific and Statistical Computing, vol.1, issue.2, pp.290-301, 1980.
DOI : 10.1137/0901019

L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, 1984.

M. V. Butz and O. Herbort, Context-dependent predictions and cognitive arm control with XCSF, Proceedings of the 10th annual conference on Genetic and evolutionary computation, GECCO '08, pp.1357-1364, 2008.
DOI : 10.1145/1389095.1389360

M. V. Butz, G. K. Pedersen, and P. O. Stalph, Learning sensorimotor control structures with XCSF, Proceedings of the 11th Annual conference on Genetic and evolutionary computation, GECCO '09, pp.1171-1178, 2009.
DOI : 10.1145/1569901.1570059

S. Calinon, Robot programming by demonstration, 2009.

S. Calinon, F. Guenter, and A. Billard, On Learning, Representing, and Generalizing a Task in a Humanoid Robot, IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol.37, issue.2, pp.286-298, 2007.
DOI : 10.1109/TSMCB.2006.886952

T. Cederborg, M. Li, A. Baranes, and P. Oudeyer, Incremental local online Gaussian Mixture Regression for imitation learning of multiple tasks, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.267-274, 2010.
DOI : 10.1109/IROS.2010.5652040
URL : https://hal.archives-ouvertes.fr/inria-00541778

S. Cohen and N. Intrator, A Hybrid Projection-based and Radial Basis Function Architecture: Initial Values and Global Optimisation, Pattern Analysis & Applications, vol.5, issue.2, pp.113-120, 2002.
DOI : 10.1007/s100440200010

Y. N. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli et al., Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, Advances in Neural Information Processing Systems, pp.2933-2941, 2014.

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B, pp.1-38, 1977.

G. Dorffner, Unified Framework for MLPs and RBFNs: Introducing Conic Section Function Networks, Cybernetics and Systems, vol.25, issue.4, pp.511-554, 1994.
DOI : 10.1080/01969729408902340

A. Droniou, S. Ivaldi, V. Padois, and O. Sigaud, Autonomous online learning of velocity kinematics on the iCub: A comparative study, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.3577-3582, 2012.
DOI : 10.1109/IROS.2012.6385674
URL : https://hal.archives-ouvertes.fr/hal-00719964

A. Droniou, S. Ivaldi, P. Stalph, M. Butz, and O. Sigaud, Learning velocity kinematics: Experimental comparison of on-line regression algorithms, Proceedings Robotica, pp.15-20, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00719975

M. Ebden, Gaussian processes for regression: A quick introduction, 2008.

R. Fisher, Statistical methods for research workers, 1925.

J. H. Friedman and W. Stuetzle, Projection Pursuit Regression, Journal of the American Statistical Association, vol.76, issue.376, pp.817-823, 1981.
DOI : 10.1080/01621459.1981.10477729

P. Geladi and B. Kowalski, Partial least-squares regression: a tutorial, Analytica Chimica Acta, vol.185, pp.1-17, 1986.
DOI : 10.1016/0003-2670(86)80028-9

Z. Ghahramani and M. I. Jordan, Supervised learning from incomplete data via an EM approach, Advances in Neural Information Processing Systems 6, pp.120-127, 1993.

A. Gijsberts and G. Metta, Incremental learning of robot dynamics using random features, 2011 IEEE International Conference on Robotics and Automation, pp.951-956, 2011.
DOI : 10.1109/ICRA.2011.5980191

A. Gijsberts and G. Metta, Real-time model learning using Incremental Sparse Spectrum Gaussian Process Regression, Neural Networks, vol.41, 2012.
DOI : 10.1016/j.neunet.2012.08.011

D. Grollman and O. C. Jenkins, Sparse incremental learning for interactive robot control policy estimation, 2008 IEEE International Conference on Robotics and Automation, pp.3315-3320, 2008.
DOI : 10.1109/ROBOT.2008.4543716

M. Hersch, F. Guenter, S. Calinon, and A. Billard, Dynamical System Modulation for Robot Learning via Kinesthetic Demonstrations, IEEE Transactions on Robotics, vol.24, issue.6, pp.1463-1467, 2008.
DOI : 10.1109/TRO.2008.2006703

S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, Gradient flow in recurrent nets: the difficulty of learning long-term dependencies, in A Field Guide to Dynamical Recurrent Neural Networks, 2001.

G. Huang and L. Chen, Enhanced random search based incremental extreme learning machine, Neurocomputing, vol.71, issue.16-18, pp.3460-3468, 2008.
DOI : 10.1016/j.neucom.2007.10.008

G. Huang, L. Chen, and C. K. Siew, Universal Approximation Using Incremental Constructive Feedforward Networks With Random Hidden Nodes, IEEE Transactions on Neural Networks, vol.17, issue.4, pp.879-892, 2006.
DOI : 10.1109/TNN.2006.875977

G. Huang, M. Li, L. Chen, and C. K. Siew, Incremental extreme learning machine with fully complex hidden nodes, Neurocomputing, vol.71, issue.4-6, pp.576-583, 2008.
DOI : 10.1016/j.neucom.2007.07.025

G. Huang, D. H. Wang, and Y. Lan, Extreme learning machines: a survey, International Journal of Machine Learning and Cybernetics, vol.2, issue.2, pp.107-122, 2011.
DOI : 10.1007/s13042-011-0019-y

G. Huang, Q. Zhu, and C. Siew, Extreme learning machine: Theory and applications, Neurocomputing, vol.70, issue.1-3, pp.489-501, 2006.
DOI : 10.1016/j.neucom.2005.12.126

A. J. Ijspeert, J. Nakanishi, H. Hoffmann, P. Pastor, and S. Schaal, Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors, Neural Computation, vol.25, issue.2, pp.328-373, 2013.
DOI : 10.1162/NECO_a_00393

A. C. Lammert, L. Goldstein, and K. Iskarous, Locally-weighted regression for estimating the forward kinematics of a geometric vocal tract model, pp.1604-1607, 2010.

D. Marin, J. Decock, L. Rigoux, and O. Sigaud, Learning cost-efficient control policies with XCSF, Proceedings of the 13th annual conference on Genetic and evolutionary computation, GECCO '11, pp.1235-1242, 2011.
DOI : 10.1145/2001576.2001743
URL : https://hal.archives-ouvertes.fr/hal-00703760

F. Meier, P. Hennig, and S. Schaal, Local gaussian regression, 2014.

T. Munzer, F. Stulp, and O. Sigaud, Non-linear regression algorithms for motor skill acquisition: a comparison, Proceedings JFPDA, pp.1-16, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01090848

R. M. Neal, Bayesian Learning for Neural Networks, 1996.
DOI : 10.1007/978-1-4612-0745-0

D. Nguyen-Tuong, M. Seeger, and J. Peters, Model Learning with Local Gaussian Process Regression, Advanced Robotics, vol.23, issue.15, pp.2015-2034, 2009.
DOI : 10.1163/016918609X12529286896877

M. J. Orr, J. Hallam, K. Takezawa, A. F. Murray, S. Ninomiya et al., Combining Regression Trees and Radial Basis Function Networks, International Journal of Neural Systems, vol.10, issue.06, pp.453-465, 2000.
DOI : 10.1142/S0129065700000363

J. Park and I. W. Sandberg, Approximation and Radial-Basis-Function Networks, Neural Computation, vol.5, issue.2, pp.305-316, 1993.
DOI : 10.1162/neco.1993.5.2.305

R. Pascanu, G. Montúfar, and Y. Bengio, On the number of inference regions of deep feed forward networks with piece-wise linear activations, arXiv preprint, 2013.

R. L. Plackett, Some Theorems in Least Squares, Biometrika, vol.37, issue.1-2, pp.149-157, 1950.
DOI : 10.1093/biomet/37.1-2.149

T. Poggio and F. Girosi, Networks for approximation and learning, Proceedings of the IEEE, vol.78, issue.9, 1990.
DOI : 10.1109/5.58326

J. Quiñonero-Candela and C. E. Rasmussen, A unifying view of sparse approximate Gaussian process regression, Journal of Machine Learning Research, vol.6, pp.1939-1959, 2005.

J. R. Quinlan, Learning with continuous classes, 5th Australian Joint Conference on Artificial Intelligence, pp.343-348, 1992.

A. Rahimi and B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems, pp.1177-1184, 2007.

A. Rahimi and B. Recht, Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning, Advances in Neural Information Processing Systems 21, pp.1313-1320, 2008.

F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review, vol.65, issue.6, pp.386-408, 1958.
DOI : 10.1037/h0042519

C. Saunders, A. Gammerman, and V. Vovk, Ridge regression learning algorithm in dual variables, Proceedings of the 15th International Conference on Machine Learning (ICML-1998), pp.515-521, 1998.

S. Schaal and C. G. Atkeson, Receptive field weighted regression, 1997.

J. Schmidhuber, Deep learning in neural networks: An overview, arXiv preprint arXiv:1404.7828, 2014.

M. Schmidt, Least squares optimization with l1-norm regularization, 2005.

M. Schmitt, On the Complexity of Computing and Learning with Multiplicative Neural Networks, Neural Computation, vol.14, issue.2, pp.241-301, 2002.
DOI : 10.1162/08997660252741121

B. Schölkopf, A. Smola, R. Williamson, and P. Bartlett, New Support Vector Algorithms, Neural Computation, vol.12, issue.5, pp.1207-1245, 2000.
DOI : 10.1162/089976600300015565

F. Schwenker, H. A. Kestler, and G. Palm, Three learning phases for radial-basis-function networks, Neural Networks, vol.14, issue.4-5, pp.439-458, 2001.
DOI : 10.1016/S0893-6080(01)00027-2

O. Sigaud, C. Salaün, and V. Padois, On-line regression algorithms for learning mechanical models of robots: A survey, Robotics and Autonomous Systems, vol.59, issue.12, pp.1117-1125, 2011.
DOI : 10.1016/j.robot.2011.07.006
URL : https://hal.archives-ouvertes.fr/hal-00629133

A. J. Smola and B. Schölkopf, A tutorial on support vector regression, Statistics and Computing, vol.14, issue.3, pp.199-222, 2004.
DOI : 10.1023/B:STCO.0000035301.49549.88

F. Stulp and M. Beetz, Refining the execution of abstract actions with learned action models, Journal of Artificial Intelligence Research (JAIR), vol.32, pp.487-523, 2008.

F. Stulp, E. Theodorou, and S. Schaal, Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation, IEEE Transactions on Robotics, vol.28, issue.6, pp.1360-1370, 2012.
DOI : 10.1109/TRO.2012.2210294
URL : https://hal.archives-ouvertes.fr/hal-00766177

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society (Series B), vol.58, pp.267-288, 1996.

V. Vapnik, The Nature of Statistical Learning Theory, 1995.

S. Vijayakumar and S. Schaal, Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space, Proceedings of the Seventeenth International Conference on Machine Learning, pp.288-293, 2000.

Y. Wang and I. H. Witten, Induction of model trees for predicting continuous classes, Poster papers of the 9th European Conference on Machine Learning, 1997.

P. J. Werbos, Beyond regression: New tools for prediction and analysis in the behavioral sciences, 1974.

P. J. Werbos, Generalization of backpropagation with application to a recurrent gas market model, Neural Networks, vol.1, issue.4, pp.339-356, 1988.
DOI : 10.1016/0893-6080(88)90007-X

B. Widrow, A. Greenblatt, Y. Kim, and D. Park, The No-Prop algorithm: A new learning algorithm for multilayer neural networks, Neural Networks, vol.37, pp.182-188, 2013.
DOI : 10.1016/j.neunet.2012.09.020

C. K. Williams and C. E. Rasmussen, Gaussian processes for machine learning, 2006.

C. K. Williams, Computation with Infinite Neural Networks, Neural Computation, vol.10, issue.5, pp.1203-1216, 1998.
DOI : 10.1162/089976698300017412

D. R. Wilson and T. R. Martinez, The general inefficiency of batch training for gradient descent learning, Neural Networks, vol.16, issue.10, pp.1429-1451, 2003.
DOI : 10.1016/S0893-6080(03)00138-2