J. Cassell, H. Vilhjalmsson, and M. Steedman, BEAT, Proceedings of the 28th annual conference on Computer graphics and interactive techniques, SIGGRAPH '01, pp.477-486, 2001.
DOI : 10.1145/383259.383315

J. Lee and S. C. Marsella, Nonverbal Behavior Generator for Embodied Conversational Agents, International Conference on Intelligent Virtual Agents (IVA), pp.243-255, 2006.
DOI : 10.1007/11821830_20

K. Thórisson, Natural turn-taking needs no manual: Computational theory and model, from perception to action, Multimodality in language and speech systems, pp.173-207, 2002.

K. Otsuka, H. Sawada, and J. Yamato, Automatic Inference of Cross-modal Nonverbal Interactions in Multiparty Conversations from Gaze, Head Gestures, and Utterances: Who Responds to Whom, When, and How?, International Conference on Multimodal Interfaces (ICMI), pp.255-262, 2007.

A. Mihoub, G. Bailly, and C. Wolf, Learning multimodal behavioral models for face-to-face social interaction, Journal on Multimodal User Interfaces, vol.10, issue.8, pp.195-210, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01170991

A. Mihoub, G. Bailly, C. Wolf, and F. Elisei, Graphical models for social behavior modeling in face-to-face interaction, Pattern Recognition Letters, vol.74, pp.82-89, 2016.
DOI : 10.1016/j.patrec.2016.02.005
URL : https://hal.archives-ouvertes.fr/hal-01279427

M. Vrigkas, C. Nikou, and I. A. Kakadiaris, A review of human activity recognition methods, Frontiers in Robotics and AI, 2015.

A. Liu, N. Xu, W. Nie, Y. Su, Y. Wong et al., Benchmarking a Multimodal and Multiview and Interactive Dataset for Human Action Recognition, IEEE Transactions on Cybernetics, vol.47, issue.7, 2016.
DOI : 10.1109/TCYB.2016.2582918

K. Noda, H. Arie, Y. Suga, and T. Ogata, Multimodal integration learning of robot behavior using deep neural networks, Robotics and Autonomous Systems, vol.62, issue.6, pp.721-736, 2014.
DOI : 10.1016/j.robot.2014.03.003

D. Vogt, H. B. Amor, E. Berger, and B. Jung, Learning two-person interaction models for responsive synthetic humanoids, Journal of Virtual Reality and Broadcasting, vol.11, issue.1, 2014.

W. De Mulder, S. Bethard, and M. Moens, A survey on the application of recurrent neural networks to statistical language modeling, Computer Speech & Language, vol.30, issue.1, pp.30-61, 2015.
DOI : 10.1016/j.csl.2014.09.005

I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, Proceedings of the International Conference on Neural Information Processing Systems (NIPS), pp.3104-3112, 2014.

A. Karpathy and L. Fei-Fei, Deep visual-semantic alignments for generating image descriptions, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3128-3137, 2015.

F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, Learning precise timing with LSTM recurrent networks, Journal of Machine Learning Research, vol.3, pp.115-143, 2002.

F. J. Ordóñez and D. Roggen, Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition, Sensors, vol.16, issue.1, p.115, 2016.

E. Tsironi, P. Barros, and S. Wermter, An analysis of Convolutional Long Short-Term Memory Recurrent Neural Networks for gesture recognition, Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pp.213-218, 2016.
DOI : 10.1016/j.neucom.2016.12.088

L. Tian, J. D. Moore, and C. Lai, Emotion recognition in spontaneous and acted dialogues, 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), pp.698-704, 2015.
DOI : 10.1109/ACII.2015.7344645

A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei et al., Human trajectory prediction in crowded spaces, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.961-971, 2016.

H. C. Ravichandar, A. Kumar, A. Dani, and K. R. Pattipati, Learning and Predicting Sequential Tasks Using Recurrent Neural Networks and Multiple Model Filtering, AAAI Fall Symposium Series, p.14105, 2016.

P. Wittenburg, H. Brugman, A. Russel, A. Klassmann, and H. Sloetjes, Elan: a professional framework for multimodality research, Proceedings of the International Conference on Language Resources and Evaluation (LREC), pp.1556-1559, 2006.

P. P. Boersma, Praat, a system for doing phonetics by computer, Glot International, pp.341-345, 2002.

M. Dunham and K. Murphy, PMTK3: Probabilistic modeling toolkit for Matlab/Octave, 2012.

G. F. Cooper and E. Herskovits, A Bayesian method for the induction of probabilistic networks from data, Machine Learning, vol.9, issue.4, pp.309-347, 1992.
DOI : 10.1007/978-1-4613-2283-2

S. Liang, S. Fuhrman, and R. Somogyi, Reveal, a general reverse engineering algorithm for inference of genetic network architectures, Pacific Symposium on Biocomputing, pp.18-29, 1998.

A. Graves, N. Jaitly, and A. Mohamed, Hybrid speech recognition with Deep Bidirectional LSTM, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.273-278, 2013.
DOI : 10.1109/ASRU.2013.6707742

R. Brueckner and B. Schuller, Social signal classification using deep BLSTM recurrent neural networks, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4823-4827, 2014.
DOI : 10.1109/ICASSP.2014.6854518

M. Zhang and Z. Zhou, A Review on Multi-Label Learning Algorithms, IEEE Transactions on Knowledge and Data Engineering, vol.26, issue.8, pp.1819-1837, 2014.
DOI : 10.1109/TKDE.2013.39

L. Yujian and L. Bo, A Normalized Levenshtein Distance Metric, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.6, pp.1091-1095, 2007.
DOI : 10.1109/TPAMI.2007.1078

D.-C. Nguyen, G. Bailly, and F. Elisei, Conducting neuropsychological tests with a humanoid robot: Design and evaluation, 2016 7th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), pp.337-342, 2016.
DOI : 10.1109/CogInfoCom.2016.7804572
URL : https://hal.archives-ouvertes.fr/hal-01385666

D. C. Richardson, R. Dale, and K. Shockley, Synchrony and swing in conversation: coordination, temporal dynamics, and communication, Embodied Communication, pp.75-93, 2008.
DOI : 10.1093/acprof:oso/9780199231751.003.0004