A. Bapna, G. Tur, D. Hakkani-tur, and L. Heck, Towards zero-shot frame semantic parsing for domain scaling, 2017.

J. Bian, B. Gao, and T. Liu, Knowledge-powered deep learning for word embedding, ECML, 2014.

I. Casanueva, P. Budzianowski, P. Su, N. Mrkšić, T. Wen et al., A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management, 2017.

T. Chaminade, An experimental approach to study the physiology of natural social interactions, Interaction Studies, vol.18, issue.2, pp.254-276, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01585223

L. Daubigney, M. Geist, S. Chandramohan, and O. Pietquin, A comprehensive reinforcement learning framework for dialogue management optimization, IEEE Journal of Selected Topics in Signal Processing, vol.6, issue.8, pp.891-902, 2012.
DOI : 10.1109/jstsp.2012.2229257

Y. Dauphin, G. Tur, D. Hakkani-tur, and L. Heck, Zero-shot learning and clustering for semantic utterance classification, 2014.

A. Deoras and R. Sarikaya, Deep belief network based semantic taggers for spoken language understanding, INTERSPEECH, 2013.

B. Dhingra, L. Li, X. Li, J. Gao, Y. Chen et al., Towards end-to-end reinforcement learning of dialogue agents for information access, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, vol.1, pp.484-495, 2017.

E. Ferreira, B. Jabaian, and F. Lefèvre, Online adaptative zero-shot learning spoken language understanding using word-embedding, ICASSP, 2015.
DOI : 10.1109/icassp.2015.7178987
URL : https://hal.archives-ouvertes.fr/hal-02042298

E. Ferreira and F. Lefèvre, Expert-based reward shaping and exploration scheme for boosting policy learning of dialogue management, ASRU, 2013.

E. Ferreira and F. Lefèvre, Reinforcement-learning based dialogue system for human-robot interactions with socially-inspired rewards, Computer Speech & Language, vol.34, issue.1, pp.256-274, 2015.
DOI : 10.1016/j.csl.2015.03.007

E. Ferreira, A. Reiffers-masson, B. Jabaian, and F. Lefèvre, Adversarial bandit for online interactive active learning of zero-shot spoken language understanding, Proceedings of ICASSP, 2016.
URL : https://hal.archives-ouvertes.fr/hal-02041621

M. Gašić, C. Breslin, M. Henderson, D. Kim, M. Szummer et al., On-line policy optimisation of Bayesian spoken dialogue systems via human interaction, IEEE ICASSP, pp.8367-8371, 2013.

M. Geist and O. Pietquin, Kalman temporal differences, Journal of Artificial Intelligence Research, vol.39, issue.1, pp.483-532, 2010.
DOI : 10.1613/jair.3077
URL : https://hal.archives-ouvertes.fr/hal-00351297

M. Geist and O. Pietquin, Managing uncertainty within value function approximation in reinforcement learning, Active Learning and Experimental Design workshop (collocated with AISTATS 2010), vol.92, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00554398

S. Hahn, M. Dinarelli, C. Raymond, F. Lefèvre, P. Lehnen et al., Comparing stochastic approaches to spoken language understanding in multiple languages, IEEE TASLP, vol.19, issue.6, pp.1569-1583, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00746965

X. Li, Y. Chen, L. Li, J. Gao, and A. Celikyilmaz, End-to-end task-completion neural dialogue systems, Proceedings of the Eighth International Joint Conference on Natural Language Processing, vol.1, pp.733-743, 2017.

T. Mikolov, K. Chen, G. Corrado, and J. Dean, Efficient estimation of word representations in vector space, 2013.

A. Ng, D. Harada, and S. Russell, Policy invariance under reward transformations: Theory and application to reward shaping, ICML, 1999.

M. Riou, B. Jabaian, S. Huet, and F. Lefèvre, Online adaptation of an attention-based neural network for natural language generation, Proceedings of INTERSPEECH, 2017.
URL : https://hal.archives-ouvertes.fr/hal-02021901

M. Riou, B. Jabaian, S. Huet, and F. Lefèvre, Joint On-line Learning of a Zero-shot Spoken Semantic Parser and a Reinforcement Learning Dialogue Manager, IEEE ICASSP, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02024691

P. Shah, D. Hakkani-tur, B. Liu, and G. Tur, Bootstrapping a neural conversational agent with dialogue self-play, crowdsourcing and on-line reinforcement learning, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol.3, pp.41-51, 2018.

S. Upadhyay, M. Faruqui, G. Tür, D. Hakkani-Tür, and L. Heck, (Almost) zero-shot cross-lingual spoken language understanding, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.6034-6038, 2018.
DOI : 10.1109/icassp.2018.8461905

T. Wen, D. Vandyke, N. Mrkšić, M. Gašić, L. M. Rojas-barahona et al., A network-based end-to-end trainable task-oriented dialogue system, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.1, pp.438-449, 2017.
DOI : 10.18653/v1/e17-1042
URL : https://doi.org/10.18653/v1/e17-1042

S. Young, M. Gašić, S. Keizer, F. Mairesse, J. Schatzmann et al., The hidden information state model: A practical framework for POMDP-based spoken dialogue management, Computer Speech and Language, vol.24, issue.2, pp.150-174, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00598186

T. Zhao and M. Eskenazi, Zero-shot dialog generation with cross-domain latent actions, 2018.