A. Bordes, Y. Boureau, and J. Weston, Learning End-to-End Goal-Oriented Dialog, 5th International Conference on Learning Representations, 2017.

D. Cai, Y. Wang, W. Bi, Z. Tu, X. Liu et al., Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp.1866-1875, 2019.

S. Chandar, S. Ahn, H. Larochelle, P. Vincent, G. Tesauro et al., Hierarchical Memory Networks, CoRR, 2016.

D. Chen, A. Fisch, J. Weston, and A. Bordes, Reading Wikipedia to Answer Open-Domain Questions, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol.1, pp.1870-1879, 2017.

W. Chen, D. Grangier, and M. Auli, Strategies for Training Large Vocabulary Neural Language Models, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol.1, pp.1975-1985, 2016.

E. Dinan, S. Roller, K. Shuster, A. Fan, M. Auli et al., Wizard of Wikipedia: Knowledge-Powered Conversational Agents, International Conference on Learning Representations, 2018.

A. Fan, C. Gardent, C. Braud, and A. Bordes, Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp.4177-4187, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02277063

A. Fan, D. Grangier, and M. Auli, Controllable Abstractive Summarization, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pp.45-54, 2018.

A. Fan, M. Lewis, and Y. Dauphin, Hierarchical Neural Story Generation, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol.1, pp.889-898, 2018.

E. Grave, A. Joulin, M. Cissé, D. Grangier, and H. Jégou, Efficient Softmax Approximation for GPUs, Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.1302-1310, 2017.

E. Grave, A. Joulin, and N. Usunier, Improving Neural Language Models with a Continuous Cache, 5th International Conference on Learning Representations, 2017.

A. Graves, G. Wayne, and I. Danihelka, Neural Turing Machines, arXiv preprint arXiv:1410.5401, 2014.

K. Guu, K. Lee, Z. Tung, P. Pasupat, and M. Chang, Retrieval Augmented Language Model Pre-Training, Proceedings of the International Conference on Machine Learning, pp.5695-5704, 2020.

S. Humeau, K. Shuster, M. Lachaux, and J. Weston, Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring, International Conference on Learning Representations, 2019.

J. Johnson, M. Douze, and H. Jégou, Billion-scale similarity search with GPUs, IEEE Transactions on Big Data, pp.1-1, 2019.

A. Joulin and T. Mikolov, Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets, Advances in Neural Information Processing Systems 28, pp.190-198, 2015.

U. Khandelwal, O. Levy, D. Jurafsky, L. Zettlemoyer, and M. Lewis, Generalization through Memorization: Nearest Neighbor Language Models, International Conference on Learning Representations, 2020.

D. P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, 3rd International Conference on Learning Representations, 2015.

G. Lample, M. Ott, A. Conneau, L. Denoyer, and M. Ranzato, Phrase-Based & Neural Unsupervised Machine Translation, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp.8548-8559, 2018.

M. Li, J. Weston, and S. Roller, ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons, arXiv preprint arXiv:1909.03087, 2019.

R. Lian, M. Xie, F. Wang, J. Peng, and H. Wu, Learning to Select Knowledge for Response Generation in Dialog Systems, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, pp.5081-5087, 2019.

D. Mahajan, R. Girshick, V. Ramanathan, K. He, M. Paluri et al., Exploring the Limits of Weakly Supervised Pretraining, Computer Vision – ECCV 2018, pp.185-201, 2018.

B. McCann, J. Bradbury, C. Xiong, and R. Socher, Learned in Translation: Contextualized Word Vectors, Advances in Neural Information Processing Systems, pp.6294-6305, 2017.

A. Miller, W. Feng, D. Batra, A. Bordes, A. Fisch et al., ParlAI: A Dialog Research Software Platform, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp.79-84, 2017.

A. Mnih, Y. Zhang, and G. Hinton, Improving a statistical language model through non-linear prediction, Neurocomputing, vol.72, issue.7-9, pp.1414-1418, 2009.

F. Petroni, T. Rocktäschel, S. Riedel, P. Lewis, A. Bakhtin et al., Language Models as Knowledge Bases?, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp.2463-2473, 2019.

T. Plötz and S. Roth, Neural Nearest Neighbors Networks, Advances in Neural Information Processing Systems, pp.1087-1098, 2018.

L. Qin, M. Galley, C. Brockett, X. Liu, X. Gao et al., Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.5427-5436, 2019.

J. Rae, J. J. Hunt, I. Danihelka, T. Harley, A. W. Senior et al., Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes, Advances in Neural Information Processing Systems 29, pp.3621-3629, 2016.

R. Sennrich, B. Haddow, and A. Birch, Neural Machine Translation of Rare Words with Subword Units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol.1, pp.1715-1725, 2016.

M. Seo, J. Lee, T. Kwiatkowski, A. Parikh, A. Farhadi et al., Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.4430-4441, 2019.

I. V. Serban, R. Lowe, L. Charlin, and J. Pineau, Generative deep neural networks for dialogue: A short review, 2016.

I. V. Serban, A. Sordoni, Y. Bengio, A. Courville, and J. Pineau, Building End-to-End Dialogue Systems using Generative Hierarchical Neural Network Models, Thirtieth AAAI Conference on Artificial Intelligence, 2016.

K. Shuster, S. Humeau, A. Bordes, and J. Weston, Image-Chat: Engaging Grounded Conversations, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp.2414-2429, 2020.

K. Shuster, S. Humeau, H. Hu, A. Bordes, and J. Weston, Engaging Image Captioning via Personality, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.12516-12526, 2019.

H. Song, Y. Wang, W. Zhang, X. Liu, and T. Liu, Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020.

Y. Song, R. Yan, X. Li, D. Zhao, and M. Zhang, Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems, arXiv preprint arXiv:1610.07149, 2016.

S. Sukhbaatar, E. Grave, P. Bojanowski, and A. Joulin, Adaptive Attention Span in Transformers, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019.

S. Sukhbaatar, J. Weston, and R. Fergus, End-to-End Memory Networks, Advances in Neural Information Processing Systems, pp.2440-2448, 2015.

B. Thomee, D. A. Shamma, G. Friedland, B. Elizalde, K. Ni et al., YFCC100M, Communications of the ACM, vol.59, issue.2, pp.64-73, 2016.

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones et al., Attention is All You Need, Advances in Neural Information Processing Systems, pp.5998-6008, 2017.

J. Weston, S. Chopra, and A. Bordes, Memory Networks, 3rd International Conference on Learning Representations, 2015.

J. Weston, E. Dinan, and A. Miller, Retrieve and Refine: Improved Sequence Generation Models For Dialogue, Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI, pp.87-92, 2018.

S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, Aggregated Residual Transformations for Deep Neural Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1492-1500, 2017.

S. Zhang, E. Dinan, J. Urbanek, A. Szlam, D. Kiela et al., Personalizing Dialogue Agents: I have a dog, do you have pets too?, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol.1, pp.2204-2213, 2018.

Y. Zhu, Z. Dou, J. Nie, and J. Wen, ReBoost: a retrieval-boosted sequence-to-sequence model for neural response generation, Information Retrieval Journal, vol.23, issue.1, pp.27-48, 2019.