[...] The variance $\operatorname{Var}_{P_{Q,g}}([\ldots](O))$ is a generalized Cramér-Rao lower bound for the asymptotic variance of any regular and asymptotically linear estimator of [...]$(P_{Q,g})$ when sampling independently from $P_{Q,g}$. In addition, if $g = g'$, then [...]$(O)) = 0$ implies [...]$(P_{Q',g}) = $ [...]$(P_{Q,g})$.

Lemma 13. [...] is pathwise differentiable at every $P_{Q,g} \in M$ with respect to the maximal tangent space. Its efficient influence curve at $P_{Q,g}$ is $D(Q,g)$, which satisfies [...]. The variance $\operatorname{Var}_{P_{Q,g}}(D(Q,g)(O))$ is a generalized Cramér-Rao lower bound for the asymptotic variance of any regular and asymptotically linear estimator of $\Psi(P_{Q,g})$ when sampling independently from $P_{Q,g}$. In addition, if $g = g'$, then [...]$(O)) = 0$ implies $\Psi(P_{Q',g}) = E_{Q}\, Q_Y(r(Q_Y)(W), W)$.

L. B. Balzer, M. L. Petersen, and M. J. van der Laan, Targeted estimation and inference for the sample average treatment effect, 2015.

E. Bolthausen, E. Perkins, and A. van der Vaart, Lectures on probability theory and statistics, volume 1781 of Lecture Notes in Mathematics, Lectures from the 29th Summer School on Probability Theory held in Saint-Flour, 1999.

S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012.
DOI : 10.1561/2200000024

B. Chakraborty and E. E. Moodie, Statistical methods for dynamic treatment regimes: Reinforcement learning, causal inference, and personalized medicine, Statistics for Biology and Health, 2013.

B. Chakraborty, E. B. Laber, and Y. Zhao, Inference about the expected performance of a data-driven dynamic treatment regime, Clinical Trials, vol.11, issue.4, pp.408-417, 2014.
DOI : 10.1017/CBO9780511802843

A. Chambaz and M. J. van der Laan, Inference in targeted group-sequential covariate-adjusted randomized clinical trials, Scand. J. Stat., vol.41, issue.1, pp.104-140, 2014.

A. Chambaz, M. J. van der Laan, and W. Zheng, Targeted covariate-adjusted response-adaptive LASSO-based randomized controlled trials, Modern Adaptive Randomized Clinical Trials: Statistical, Operational, and Regulatory Aspects, pp.345-368, 2015.
URL : https://hal.archives-ouvertes.fr/hal-00990761

V. H. de la Peña and E. Giné, Decoupling: From dependence to independence, Randomly stopped processes. U-statistics and processes. Martingales and beyond, Probability and its Applications, 1999.

J. Friedman, T. Hastie, and R. Tibshirani, Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, vol.33, issue.1, pp.1-22, 2010.
DOI : 10.18637/jss.v033.i01

A. Garivier and E. Kaufmann, Optimal best arm identification with fixed confidence
URL : https://hal.archives-ouvertes.fr/hal-01273838

Y. Goldberg, R. Song, D. Zeng, and M. R. Kosorok, Comment on "Dynamic treatment regimes: Technical challenges and applications", Electronic Journal of Statistics, vol.8, issue.1, pp.1290-1300, 2014.
DOI : 10.1214/14-EJS905

E. Kaufmann, Analyse de stratégies bayésiennes et fréquentistes pour l'allocation séquentielle de ressources, PhD thesis, Télécom ParisTech, 2014.

E. B. Laber, D. J. Lizotte, M. Qian, W. E. Pelham, and S. A. Murphy, Dynamic treatment regimes: Technical challenges and applications, Electronic Journal of Statistics, vol.8, issue.1, pp.1225-1272, 2014.
DOI : 10.1214/14-EJS920

E. B. Laber, D. J. Lizotte, M. Qian, W. E. Pelham, and S. A. Murphy, Rejoinder of "Dynamic treatment regimes: Technical challenges and applications", Electronic Journal of Statistics, vol.8, issue.1, pp.1312-1321, 2014.
DOI : 10.1214/14-EJS920REJ

A. R. Luedtke and M. J. van der Laan, Targeted learning of the mean outcome under an optimal dynamic treatment rule, Journal of Causal Inference, vol.3, issue.1, pp.61-95, 2015.

A. R. Luedtke and M. J. van der Laan, Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy, The Annals of Statistics, vol.44, issue.2, 2016.
DOI : 10.1214/15-AOS1384SUPP

A. R. Luedtke and M. J. van der Laan, Super-learning of an optimal dynamic treatment rule, The International Journal of Biostatistics, vol.12, issue.1, 2016.
DOI : 10.1515/ijb-2015-0052

E. Mammen and A. B. Tsybakov, Smooth discrimination analysis, The Annals of Statistics, vol.27, issue.6, pp.1808-1829, 1999.

J. Pearl, Causality: Models, Reasoning and Inference, 2000.
DOI : 10.1017/CBO9780511803161

M. Qian and S. A. Murphy, Performance guarantees for individualized treatment rules, The Annals of Statistics, vol.39, issue.2, pp.1180-1210, 2011.
DOI : 10.1214/10-AOS864SUPP

J. M. Robins, Optimal Structural Nested Models for Optimal Sequential Decisions, Proceedings of the Second Seattle Symposium in Biostatistics, pp.189-326, 2004.
DOI : 10.1007/978-1-4419-9076-1_11

J. M. Robins and A. Rotnitzky, Discussion of "Dynamic treatment regimes: Technical challenges and applications", Electronic Journal of Statistics, vol.8, issue.1, pp.1273-1289, 2014.
DOI : 10.1214/14-EJS908

D. B. Rubin and M. J. van der Laan, Statistical Issues and Limitations in Personalized Medicine Research with Clinical Trials, The International Journal of Biostatistics, vol.8, issue.1, 2012.
DOI : 10.1515/1557-4679.1423

P. K. Sen and J. M. Singer, Large sample methods in statistics, 1993.
DOI : 10.1007/978-1-4899-4491-7

M. J. van der Laan and S. Rose, Targeted learning: Causal inference for observational and experimental data, Springer Series in Statistics, 2011.

M. J. van der Laan and D. Rubin, Targeted Maximum Likelihood Learning, The International Journal of Biostatistics, vol.2, issue.1, 2006.
DOI : 10.2202/1557-4679.1043

A. W. van der Vaart, Asymptotic statistics, volume 3 of Cambridge Series in Statistical and Probabilistic Mathematics, 1998.

A. W. van der Vaart and J. A. Wellner, Weak Convergence, 1996.
DOI : 10.1007/978-1-4757-2545-2_3

B. Zhang, A. Tsiatis, M. Davidian, M. Zhang, and E. Laber, A Robust Method for Estimating Optimal Treatment Regimes, Biometrics, vol.68, issue.4, pp.1010-1018, 2012.
DOI : 10.1111/j.1541-0420.2012.01763.x

B. Zhang, A. Tsiatis, M. Davidian, M. Zhang, and E. Laber, Estimating optimal treatment regimes from a classification perspective, Stat, vol.1, issue.1, pp.103-114, 2012.

Y. Zhao, D. Zeng, A. J. Rush, and M. R. Kosorok, Estimating Individualized Treatment Rules Using Outcome Weighted Learning, Journal of the American Statistical Association, vol.107, issue.499, pp.1106-1118, 2012.
DOI : 10.1080/01621459.2012.695674

Y. Zhao, D. Zeng, E. B. Laber, and M. R. Kosorok, New Statistical Learning Methods for Estimating Optimal Dynamic Treatment Regimes, Journal of the American Statistical Association, vol.110, issue.510, pp.583-598, 2015.

W. Zheng, A. Chambaz, and M. J. van der Laan, Drawing valid targeted inference when covariate-adjusted response-adaptive RCT meets data-adaptive loss-based estimation, with an application to the LASSO, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01180719