I. Ahamada and E. Flachaire, Non-Parametric Econometrics, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00630410

D. Aigner, C. A. Lovell, and P. Schmidt, Formulation and estimation of stochastic frontier production function models, Journal of Econometrics, vol.6, issue.1, pp.21-37, 1977.
DOI : 10.1016/0304-4076(77)90052-5

J. Aldrich, The Econometricians' Statisticians, 1895-1945, History of Political Economy, vol.42, issue.1, pp.111-154, 2010.
DOI : 10.1215/00182702-2009-064

E. Altman, G. Marco, and F. Varetto, Corporate distress diagnosis: Comparisons using linear discriminant analysis and neural networks (the Italian experience), Journal of Banking & Finance, vol.18, issue.3, pp.505-529, 1994.
DOI : 10.1016/0378-4266(94)90007-8

J. D. Angrist and V. Lavy, Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement, The Quarterly Journal of Economics, vol.114, issue.2, pp.533-575, 1999.
DOI : 10.1162/003355399556061

J. D. Angrist and J. S. Pischke, The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics, Journal of Economic Perspectives, vol.24, issue.2, pp.3-30, 2010.
DOI : 10.1257/jep.24.2.3

J. D. Angrist and J. S. Pischke, Mastering Metrics, 2015.

J. D. Angrist and A. B. Krueger, Does Compulsory School Attendance Affect Schooling and Earnings?, The Quarterly Journal of Economics, vol.106, issue.4, pp.979-1014, 1991.
DOI : 10.2307/2937954
URL : http://www.irs.princeton.edu/pubs/pdfs/273.pdf

L. Bottou, Large-Scale Machine Learning with Stochastic Gradient Descent, Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT'2010), pp.177-187, 2010.
DOI : 10.1201/b11429-4

P. Bajari, D. Nekipelov, S. P. Ryan, and M. Yang, Machine Learning Methods for Demand Estimation, American Economic Review, vol.105, issue.5, pp.481-485, 2015.
DOI : 10.1257/aer.p20151021

S. Bazen and K. Charni, Do earnings really decline for older workers?, AMSE Discussion Paper 2015-11, 2015.

R. E. Bellman, Dynamic programming, 1957.

A. Belloni, V. Chernozhukov, and C. Hansen, Inference Methods for High-Dimensional Sparse Econometric Models, Advances in Economics and Econometrics, pp.245-295, 2010.

A. Belloni, D. Chen, V. Chernozhukov, and C. Hansen, Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain, Econometrica, vol.80, pp.2369-2429, 2012.

Y. Benjamini and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society, Series B, vol.57, pp.289-300, 1995.

J. O. Berger, Statistical decision theory and Bayesian Analysis, 1985.
DOI : 10.1007/978-1-4757-4286-2

R. A. Berk, Statistical Learning from a Regression Perspective, 2008.
DOI : 10.1007/978-3-319-44048-4

J. Berkson, Application of the logistic function to bio-assay, Journal of the American Statistical Association, vol.39, pp.357-365, 1944.

J. Berkson, Why I Prefer Logits to Probits, Biometrics, vol.7, issue.4, pp.327-339, 1951.
DOI : 10.2307/3001655

J. M. Bernardo and A. F. Smith, Bayesian Theory, 2000.
DOI : 10.1002/9780470316870

E. R. Berndt, The Practice of Econometrics: Classic and Contemporary, 1990.

P. J. Bickel, F. Götze, and W. van Zwet, Resampling Fewer Than n Observations: Gains, Losses, and Remedies for Losses, Statistica Sinica, vol.7, pp.1-31, 1997.
DOI : 10.1007/978-1-4614-1314-1_17

C. Bishop, Pattern Recognition and Machine Learning, 2006.

A. Blanco, M. Pino-mejias, J. Lara, and S. Rayo, Credit scoring models for the microfinance industry using neural networks: Evidence from Peru, Expert Systems with Applications, vol.40, issue.1, pp.356-364, 2013.
DOI : 10.1016/j.eswa.2012.07.051

C. I. Bliss, The Method of Probits, Science, vol.79, issue.2037, pp.38-39, 1934.
DOI : 10.1126/science.79.2037.38

L. Breiman, Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author), Statistical Science, vol.16, issue.3, pp.199-231, 2001.
DOI : 10.1214/ss/1009213726

P. Bühlmann and S. van de Geer, Statistics for High-Dimensional Data: Methods, Theory and Applications, 2011.
DOI : 10.1007/978-3-642-20192-9

L. Breiman, Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.
DOI : 10.1023/A:1010933404324

L. D. Brown, Fundamentals of statistical exponential families: with applications in statistical decision theory, Institute of Mathematical Statistics, 1986.

E. Candès and Y. Plan, Near-ideal model selection by ℓ1 minimization, The Annals of Statistics, pp.2145-2177, 2009.

B. S. Clarke, E. Fokoué, and H. H. Zhang, Principles and Theory for Data Mining and Machine Learning, 2009.
DOI : 10.1007/978-0-387-98135-2

C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, vol.20, issue.3, pp.273-297, 1995.
DOI : 10.1007/BF00994018

G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems, vol.2, issue.4, pp.303-314, 1989.
DOI : 10.1007/BF02551274

G. Darmois, Sur les lois de probabilités à estimation exhaustive, Comptes Rendus de l'Académie des Sciences, vol.200, pp.1265-1266, 1935.

I. Daubechies, M. Defrise, and C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Communications on Pure and Applied Mathematics, vol.57, issue.11, pp.1413-1457, 2004.
DOI : 10.1002/cpa.20042

A. C. Davison, Bootstrap, 1997.

R. Davidson and J. G. MacKinnon, Estimation and Inference in Econometrics, Economica, vol.62, issue.245, 1993.
DOI : 10.2307/2554780

R. Davidson and J. G. Mackinnon, Econometric Theory and Methods, 2003.

Q. Duo, The Formation of Econometrics, 1993.

G. Debreu, Theoretic Models: Mathematical Form and Economic Content, Econometrica, vol.54, issue.6, pp.1259-1270, 1986.
DOI : 10.2307/1914299

P. Dhillon, Y. Lu, D. P. Foster, and L. H. Ungar, New Subsampling Algorithms for Fast Least Squares Regression, Advances in Neural Information Processing Systems, 2014.

E. Engel, Die Productions-und Consumtionsverhältnisse des Königreichs Sachsen, 1857.

M. Feldstein and C. Horioka, Domestic Saving and International Capital Flows, The Economic Journal, vol.90, issue.358, pp.314-329, 1980.
DOI : 10.2307/2231790

P. Flach, Machine Learning, 2012.
DOI : 10.1017/CBO9780511973000

D. P. Foster and E. I. George, The Risk Inflation Criterion for Multiple Regression, The Annals of Statistics, vol.22, issue.4, pp.1947-1975, 1994.

J. H. Friedman, Data Mining and Statistics: What's the Connection, Proceedings of the 29th Symposium on the Interface Between Computer Science and Statistics, 1997.

R. Frisch and F. V. Waugh, Partial Time Regressions as Compared with Individual Trends, Econometrica, vol.1, issue.4, pp.387-401, 1933.
DOI : 10.2307/1907330

T. Gneiting, Making and Evaluating Point Forecasts, Journal of the American Statistical Association, vol.106, issue.494, pp.746-762, 2011.
DOI : 10.1198/jasa.2011.r10138
URL : http://arxiv.org/pdf/0912.0902.pdf

P. Givord, Méthodes économétriques pour l'évaluation de politiques publiques, INSEE Document de Travail, p.8, 2010.

Y. Grandvalet, J. Mariéthoz, and S. Bengio, Interpretation of SVMs with an application to unbalanced classification, Advances in Neural Information Processing Systems 18, 2005.

T. Groves and T. Rothenberg, A note on the expected value of an inverse matrix, Biometrika, vol.56, issue.3, pp.690-691, 1969.
DOI : 10.1093/biomet/56.3.690

T. Haavelmo, The Probability Approach in Econometrics, Econometrica, vol.12, pp.1-115, 1944.
DOI : 10.2307/1906935

T. Hastie and R. Tibshirani, Generalized Additive Models, 1990.

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2009.

T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity, 2015.

T. Hastie, R. Tibshirani, and R. J. Tibshirani, Extended comparisons of best subset selection, forward stepwise selection and the Lasso, arXiv, 2016.

X. D'Haultfoeuille and P. Givord, La régression quantile en pratique, Economie et statistique, vol.471, issue.1, pp.85-111, 2014.
DOI : 10.3406/estat.2014.10484

D. O. Hebb, The organization of behavior, 1949.

J. J. Heckman, Sample Selection Bias as a Specification Error, Econometrica, vol.47, issue.1, pp.153-161, 1979.
DOI : 10.2307/1912352

J. J. Heckman, J. L. Tobias, and E. Vytlacil, Simple Estimators for Treatment Parameters in a Latent-Variable Framework, Review of Economics and Statistics, vol.85, issue.3, pp.748-755, 2003.
DOI : 10.1111/1468-0262.00277

D. F. Hendry and H. Krolzig, Automatic Econometric Model Selection, 2001.

A. E. Hoerl, Applications of ridge analysis to regression problems, Chemical Engineering Progress, vol.58, issue.3, pp.54-59, 1962.

A. E. Hoerl and R. W. Kennard, Ridge regression: biased estimation for nonorthogonal problems, This Week's Citation Classic, ISI, pp.2-9, 1981.

P. Holland, Statistics and Causal Inference, Journal of the American Statistical Association, vol.81, issue.396, pp.945-960, 1986.
DOI : 10.1080/01621459.1986.10478354

R. Hyndman, A. B. Koehler, J. K. Ord, and R. D. Snyder, Forecasting with Exponential Smoothing, 2009.
DOI : 10.1007/978-3-540-71918-2

G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning, 2013.
DOI : 10.1007/978-1-4614-7138-7

A. Khashman, Credit risk evaluation using neural networks: Emotional versus conventional models, Applied Soft Computing, vol.11, issue.8, pp.5477-5484, 2011.
DOI : 10.1016/j.asoc.2011.05.011

M. P. Keane, Structural vs. atheoretic approaches to econometrics, Journal of Econometrics, vol.156, issue.1, pp.3-20, 2010.
DOI : 10.1016/j.jeconom.2009.09.003

A. Kleiner, A. Talwalkar, P. Sarkar, and M. Jordan, The Big Data Bootstrap, 2012.

I. Koch, Analysis of Multivariate and High-Dimensional Data, 2013.
DOI : 10.1017/CBO9781139025805

R. Koenker, Galton, Edgeworth, Frisch, and prospects for quantile regression in Econometrics, Conference on Principles of Econometrics, 1998.

R. Koenker, Quantile Regression, 2003.

R. Koenker and J. Machado, Goodness of Fit and Related Inference Processes for Quantile Regression, Journal of the American Statistical Association, vol.94, issue.448, pp.1296-1309, 1999.
DOI : 10.1080/01621459.1999.10473882

T. G. Kolda and B. W. Bader, Tensor Decompositions and Applications, SIAM Review, vol.51, issue.3, pp.455-500, 2009.
DOI : 10.1137/07070111X
URL : http://csmr.ca.sandia.gov/~tgkolda/pubs/bibtgkfiles/SAND2007-6702.pdf

T. C. Koopmans, Three Essays on the State of Economic Science, 1957.

M. Kuhn and K. Johnson, Applied Predictive Modeling, 2013.
DOI : 10.1007/978-1-4614-6849-3

J. R. Landis and G. G. Koch, The Measurement of Observer Agreement for Categorical Data, Biometrics, vol.33, issue.1, pp.159-174, 1977.
DOI : 10.2307/2529310

Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015.
DOI : 10.1038/nature14539

H. Leeb, Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process, Bernoulli, vol.14, issue.3, pp.661-690, 2008.
DOI : 10.3150/08-BEJ127

T. Lemieux, The « Mincer Equation » Thirty Years After Schooling, Experience, and Earnings, in Jacob Mincer: A Pioneer of Modern Labor Economics, pp.127-145, 2006.

Q. Li and J. S. Racine, Nonparametric Econometrics, 2006.

C. Li, Q. Li, J. Racine, and D. Zhang, Optimal Model Averaging of Varying Coefficient Models, Department of Economics Working Papers, pp.2017-2018, 2017.

H. W. Lin, M. Tegmark, and D. Rolnick, Why does deep and cheap learning work so well?, ArXiv e-prints, 2016.
DOI : 10.1007/s10955-017-1836-5
URL : http://arxiv.org/pdf/1608.08225

R. E. Lucas, Econometric policy evaluation: A critique, Carnegie-Rochester Conference Series on Public Policy, vol.1, pp.19-46, 1976.
DOI : 10.1016/S0167-2231(76)80003-6

C. L. Mallows, Some Comments on Cp, Technometrics, vol.15, pp.661-675, 1973.
DOI : 10.2307/1271437

W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics, vol.5, issue.4, pp.115-133, 1943.
DOI : 10.1007/BF02478259

J. Mincer, Schooling, experience and earnings, 1974.

T. Mitchell, Machine Learning, 1997.

J. N. Morgan and J. A. Sonquist, Problems in the Analysis of Survey Data, and a Proposal, Journal of the American Statistical Association, vol.58, issue.302, pp.415-434, 1963.
DOI : 10.1080/01621459.1963.10500855

M. S. Morgan, The history of econometric ideas, 1990.
DOI : 10.1017/CBO9780511522109

M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of Machine Learning, 2012.

S. Mullainathan and J. Spiess, Machine Learning: An Applied Econometric Approach, Journal of Economic Perspectives, vol.31, issue.2, pp.87-106, 2017.
DOI : 10.1257/jep.31.2.87
URL : https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.31.2.87

M. Müller, Generalized Linear Models, in Handbook of Computational Statistics, 2011.

K. P. Murphy, Machine Learning: A Probabilistic Perspective, 2012.

K. M. Murphy and F. Welch, Empirical Age-Earnings Profiles, Journal of Labor Economics, vol.8, issue.2, pp.202-229, 1990.
DOI : 10.1086/298220

E. A. Nadaraya, On Estimating Regression, Theory of Probability and its Applications, vol.9, issue.1, pp.141-143, 1964.

B. K. Natarajan, Sparse Approximate Solutions to Linear Systems, SIAM Journal on Computing, vol.24, issue.2, pp.227-234, 1995.
DOI : 10.1137/S0097539792240406

A. Nevo and M. D. Whinston, Taking the Dogma out of Econometrics: Structural Modeling and Credible Inference, Journal of Economic Perspectives, vol.24, issue.2, pp.69-82, 2010.
DOI : 10.1257/jep.24.2.69

J. Neyman, Sur les applications de la théorie des probabilités aux expériences agricoles : Essai des principes, reprinted in Statistical Science, vol.5, pp.463-472, 1923.

R. Nisbet, J. Elder, and G. Miner, Handbook of Statistical Analysis and Data Mining Applications, 2011.

A. Okun, Potential GNP: Its measurement and significance, Proceedings of the Business and Economics Section of the American Statistical Association, pp.98-103, 1962.

G. H. Orcutt, Toward Partial Redirection of Econometrics, The Review of Economics and Statistics, vol.34, issue.3, pp.195-213, 1952.
DOI : 10.2307/1925626

A. Pagan and A. Ullah, Nonparametric Econometrics, Themes in Modern Econometrics, 1999.

J. Platt, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, Advances in Large Margin Classifiers, pp.61-74, 1999.

S. Portnoy, Asymptotic Behavior of Likelihood Methods for Exponential Families when the Number of Parameters Tends to Infinity, The Annals of Statistics, vol.16, issue.1, pp.356-366, 1988.
DOI : 10.1214/aos/1176350710

M. H. Quenouille, Problems in Plane Sampling, The Annals of Mathematical Statistics, vol.20, issue.3, pp.355-375, 1949.
DOI : 10.1214/aoms/1177729989

M. H. Quenouille, Notes on Bias in Estimation, Biometrika, vol.43, issue.3-4, pp.353-360, 1956.

J. R. Quinlan, Induction of decision trees, Machine Learning, vol.1, issue.1, pp.81-106, 1986.
DOI : 10.1007/BF00116251

O. Reiersøl, Confluence analysis by means of instrumental sets of variables, Arkiv för Matematik, Astronomi och Fysik, 1945.

P. Rosenbaum and D. Rubin, The central role of the propensity score in observational studies for causal effects, Biometrika, vol.70, issue.1, pp.41-55, 1983.
DOI : 10.1093/biomet/70.1.41

F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain., Psychological Review, vol.65, issue.6, pp.386-408, 1958.
DOI : 10.1037/h0042519

D. Rubin, Estimating causal effects of treatments in randomized and nonrandomized studies., Journal of Educational Psychology, vol.66, issue.5, pp.688-701, 1974.
DOI : 10.1037/h0037350

D. Ruppert, M. P. Wand, and R. J. Carroll, Semiparametric Regression, 2003.
DOI : 10.1017/CBO9780511755453

A. Samuel, Some Studies in Machine Learning Using the Game of Checkers, IBM Journal of Research and Development, vol.441, 1959.

H. Schultz, The Meaning of Statistical Demand Curves, 1930.

J. Shao, Linear Model Selection by Cross-validation, Journal of the American Statistical Association, vol.88, issue.422, pp.486-494, 1993.
DOI : 10.1080/01621459.1993.10476299

S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, 2014.
DOI : 10.1017/CBO9781107298019

J. Shao, An Asymptotic Theory for Linear Model Selection, Statistica Sinica, vol.7, pp.221-264, 1997.

R. E. Schapire and Y. Freund, Boosting, 2012.

B. W. Silverman, Density Estimation, 1986.

J. S. Simonoff, Smoothing Methods in Statistics, 1996.
DOI : 10.1007/978-1-4612-4026-6

M. Stone, An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike's Criterion, Journal of the Royal Statistical Society. Series B, vol.39, issue.1, pp.44-47, 1977.

K. Y. Tam and M. Y. Kiang, Managerial Applications of Neural Networks: The Case of Bank Failure Predictions, Management Science, vol.38, issue.7, pp.926-947, 1992.
DOI : 10.1287/mnsc.38.7.926

H. Tan, Neural-Network model for stock forecasting, 1995.

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B, vol.58, pp.267-288, 1996.

R. Tibshirani and L. Wasserman, A Closer Look at Sparse Regression, pp.2-32, 2016.

A. N. Tikhonov, Solution of incorrectly formulated problems and the regularization method, Soviet Mathematics, vol.4, pp.1035-1038, 1963.

J. Tinbergen, Statistical Testing of Business Cycle Theories: A Method and its Application to Investment activity, Business Cycles in the United States of America, 1919-1932. Geneva: League of Nations, 1939.

J. Tobin, Estimation of Relationships for Limited Dependent Variables, Econometrica, vol.26, issue.1, pp.24-36, 1958.
DOI : 10.2307/1907382

J. A. Tropp, Improved analysis of the subsampled randomized Hadamard transform, Advances in Adaptive Data Analysis, pp.115-126, 2011.

P. Tseng, Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization, Journal of Optimization Theory and Applications, vol.109, issue.3, pp.475-494, 2001.
DOI : 10.1023/A:1017501703105

S. Tufféry, Data Mining and Statistics for Decision Making, 2011.
DOI : 10.1002/9780470979174

J. W. Tukey, Bias and confidence in not quite large samples, The Annals of Mathematical Statistics, vol.29, pp.614-623, 1958.

V. Vapnik, Statistical Learning Theory, 1998.

V. Vapnik and A. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and its Applications, pp.264-280, 1971.

H. R. Varian, Big Data: New Tricks for Econometrics, Journal of Economic Perspectives, vol.28, issue.2, pp.3-28, 2014.
DOI : 10.1257/jep.28.2.3

J. P. Vert, Machine learning in computational biology, ENSAE, 2017.

L. S. Waltrup, F. Sobotka, T. Kneib, and G. Kauermann, Expectile and quantile regression: David and Goliath?, Statistical Modelling, pp.433-456, 2014.
DOI : 10.1177/1471082x14561155

G. S. Watson, Smooth regression analysis, Sankhya: The Indian Journal of Statistics, Series A, vol.26, issue.4, pp.359-372, 1964.

J. Watt, R. Borhani, and A. Katsaggelos, Machine Learning Refined : Foundations, Algorithms, and Applications, 2016.
DOI : 10.1017/CBO9781316402276

B. Widrow and M. E. Hoff Jr., Adaptive Switching Circuits, IRE WESCON Convention Record, vol.4, pp.96-104, 1960.
DOI : 10.21236/AD0241531

D. H. Wolpert and W. G. Macready, No free lunch theorems for optimization, IEEE Transactions on Evolutionary Computation, vol.1, issue.1, pp.67-82, 1997.
DOI : 10.1109/4235.585893
URL : http://www.cs.ubc.ca/~hutter/earg/papers07/00585893.pdf

D. Wolpert, The Lack of A Priori Distinctions Between Learning Algorithms, Neural Computation, vol.8, issue.7, pp.1341-1390, 1996.
DOI : 10.1162/neco.1996.8.7.1341

E. J. Working, What Do Statistical "Demand Curves" Show?, The Quarterly Journal of Economics, vol.41, issue.2, pp.212-247, 1927.
DOI : 10.2307/1883501

K. Yu and R. Moyeed, Bayesian quantile regression, Statistics & Probability Letters, vol.54, issue.4, pp.437-447, 2001.
DOI : 10.1016/S0167-7152(01)00124-9

M. A. Zinkevich, M. Weimer, A. Smola, and L. Li, Parallelized Stochastic Gradient Descent, Advances in Neural Information Processing Systems, pp.2595-2603, 2010.