B. M. Lake, T. D. Ullman, J. B. Tenenbaum, and S. J. Gershman, Building machines that learn and think like people, Behavioral and Brain Sciences, vol.40, p.253, 2017.

H. Siegelmann and E. Sontag, On the computational power of neural nets, Journal of Computer and System Sciences, vol.50, issue.1, pp.132-150, 1995.

A. Graves, G. Wayne, and I. Danihelka, Neural Turing machines, CoRR, 2014.

M. Collier and J. Beel, Implementing neural Turing machines, Artificial Neural Networks and Machine Learning - ICANN 2018, pp.94-104, 2018.

Y. Hoshen and S. Peleg, Visual learning of arithmetic operations, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, ser. AAAI'16, pp.3733-3739, 2016.

I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, MIT Press, 2016.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, pp.1097-1105, 2012.

S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural computation, vol.9, issue.8, pp.1735-1780, 1997.

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, p.529, 2015.

G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, vol.2, issue.4, pp.303-314, 1989.

T. Mikolov, K. Chen, G. Corrado, and J. Dean, Efficient estimation of word representations in vector space, CoRR, 2013.

J. Pennington, R. Socher, and C. Manning, GloVe: Global vectors for word representation, Empirical Methods in Natural Language Processing, pp.1532-1543, 2014.

S. Sukhbaatar, J. Weston, and R. Fergus, End-to-end memory networks, Advances in Neural Information Processing Systems, vol.28, pp.2440-2448, 2015.

I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, Advances in Neural Information Processing Systems, vol.27, pp.3104-3112, 2014.

K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares et al., Learning phrase representations using RNN encoder-decoder for statistical machine translation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.1724-1734, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01433235

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones et al., Attention is all you need, Advances in Neural Information Processing Systems, vol.30, pp.5998-6008, 2017.

A. Trask, F. Hill, S. E. Reed, J. Rae, C. Dyer et al., Neural arithmetic logic units, Advances in Neural Information Processing Systems, vol.31, pp.8035-8044, 2018.

A. Madsen and A. R. Johansen, Neural arithmetic units, International Conference on Learning Representations, 2020.

K. Chen, Y. Dong, X. Qiu, and Z. Chen, Neural arithmetic expression calculator, CoRR, 2018.

B. Settles, Active learning literature survey, 2009.

Y. Burda, H. Edwards, D. Pathak, A. Storkey, T. Darrell et al., Large-scale study of curiosity-driven learning, 2018.

Y. Bengio, J. Louradour, R. Collobert, and J. Weston, Curriculum learning, Proceedings of the 26th annual international conference on machine learning, pp.41-48, 2009.

J. Gottlieb, P. Oudeyer, M. Lopes, and A. Baranes, Information-seeking, curiosity, and attention: computational and neural mechanisms, Trends in Cognitive Sciences, vol.17, pp.585-593, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00913646

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 3rd International Conference on Learning Representations (ICLR), arXiv:1412.6980, 2014.