. Agrawal, Learning to See by Moving, 2015 IEEE International Conference on Computer Vision (ICCV), pp.37-45, 2015.
DOI : 10.1109/ICCV.2015.13

. Ammirato, A dataset for developing and benchmarking active vision A survey of robot learning from demonstration, Int. Conf. on Robotics and Automation (ICRA), pp.1378-1385469, 2009.

. Azagra, A multimodal dataset for object model learning from natural human-robot interaction, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.6134-6141, 2017.
DOI : 10.1109/IROS.2017.8206514

URL : https://hal.archives-ouvertes.fr/hal-01567236

. Bailly, Gérard Bailly, Chrsitian Wolf Alaeddine Mihoub, and Frédéric Elisei. Gaze and face-to-face interaction: from multimodal data to behavioral models, 2018.

, Advances in Interaction Studies. Eye-tracking in interaction. Studies on the role of eye gaze in dialogue, John Benjamins, 2018.

. Bajcsy, Revisiting active perception Tadas Baltru?aitis, Chaitanya Ahuja, and Louis-Philippe Morency Multimodal machine learning: A survey and taxonomy The mahnob mimicry database: A database of naturalistic human interactions. Pattern recognition letters Interactive perception: Leveraging action in perception and perception in action, Autonomous Robots IEEE Transactions on Pattern Analysis and Machine Intelligence IEEE Transactions on Robotics, vol.42, issue.336, pp.177-19652, 2015.

. Cambuzat, Immersive teleoperation of the eye gaze of social robots, Int. Symposium on Robotics (ISR), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01779633

. Castellano, Inter-ACT, Proceedings of the international conference on Multimedia, MM '10, pp.1031-1034, 2010.
DOI : 10.1145/1873951.1874142

[. Kok, Learning and evaluating response prediction models using parallel listener consensus, Int. Conf. on Multimodal Interfaces and Workshop on Machine Learning for Multimodal Interaction (ICMI-MLMI), pp.1-3, 2010.

. Ding, Speech-driven head motion synthesis using neural networks, In Interspeech, pp.2303-2307, 2014.

. Goodrich, Teleoperation and Beyond for Assistive Humanoid Robots, Interspeech Int. conf. on Human-robot interaction (HRI), pp.175-226, 2004.
DOI : 10.1145/954339.954342

]. Kita, Pointing: Where language, culture, and cognition meet, 2003.

. Kopp, Towards a Common Framework for Multimodal Generation: The Behavior Markup Language, Int. workshop on intelligent virtual agents (IVA), pp.205-217, 2006.
DOI : 10.1007/11821830_17

P. B. Krenn, H. Krenn, and . Pirkerlee, Defining the gesticon: Language and gesture coordination for interacting embodied agents Developmental learning for autonomous robots The semaine database: Annotated multimodal records of emotionally colored conversations between a person and a limited agent, AISB Symposium on Language, Speech and Gesture for Expressive Characters, pp.107-115750, 2004.

. Nguyen, An evaluation framework to assess and correct the multimodal behavior of a humanoid robot in human-robot interaction Learning off-line vs. on-line models of interactive multimodal behaviors with recurrent neural networks Beaming into the rat world: enabling real-time interaction between rat and human each at their own scale, GEstures and SPeech in INteraction (GESPIN), pp.56-6229, 2012.

. Novoa, Robustness over time-varying channels in dnn-hmm asr based human-robot interaction Who will get the grant?: A multimodal corpus for the analysis of conversational behaviours in group interviews Action recognition: From static datasets to moving robots, Interspeech Workshop on Understanding and Modeling Multiparty, Multimodal Interactions Int. Conf. on Robotics and Automation (ICRA), pp.839-843, 2014.

D. Laurel and . Riek, Wizard of oz studies in hri: a systematic review and new reporting guidelines, Journal of Human-Robot Interaction, vol.1, issue.1, pp.119-136, 2012.

. Ruede, Enhancing Backchannel Prediction Using Word Embeddings, Interspeech 2017, pp.879-883, 2017.
DOI : 10.21437/Interspeech.2017-1606

. Scherer, Perception Markup Language: Towards a Standardized Representation of Perceived Nonverbal Behaviors, Int. Conf. on Intelligent Virtual Agents (IVA), pp.455-463, 2012.
DOI : 10.1007/978-3-642-33197-8_47

L. Sim, D. Y. , Y. Sim, and C. Loo, Extensive assessment and evaluation methodologies on assistive social robots for modelling human???robot interaction ??? A review, Information Sciences, vol.301, pp.305-344, 2015.
DOI : 10.1016/j.ins.2014.12.017

. Wang, Style tokens: Unsupervised style modeling, control and transfer in end-to-end speech synthesis. arXiv preprint, 2018.