Developmental Learning of Audio-Visual Integration From Facial Gestures Of a Social Robot

Oriane Dermy (1), Sofiane Boucenna (2, 3), Alexandre Pitti (3), Arnaud Blanchard (3)
(1) LARSEN - Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
(2) Neurocybernétique
ETIS - Equipes Traitement de l'Information et Systèmes
Abstract : We present a robot head with facial gestures, audio, and vision capabilities, aimed at the emergence of infant-like social features. To this end, we propose a neural architecture that integrates these three modalities through a developmental stage of social interaction with a caregiver. During dyadic interaction with the experimenter, the robot learns to categorize the audio-speech gestures of the vowels /a/, /i/, /o/ as a baby would, by linking someone else's facial expressions to its own movements. We show that multimodal integration in the neural network is more robust than unimodal learning, so that it compensates for erroneous or noisy information coming from each modality. Facial mimicry with a partner can therefore be reproduced using either redundant audiovisual signals or noisy information from one modality only. Statistical experiments with 24 naive participants show the robustness of our algorithm during human-robot interactions in a public environment where many people move and talk all the time. We then discuss our model in the light of human-robot communication and the development of social skills and language in infants.
Document type :
Preprints, Working Papers, ...

Cited literature [33 references]

https://hal.archives-ouvertes.fr/hal-02185423
Contributor : Alexandre Pitti
Submitted on : Friday, July 26, 2019 - 4:22:37 PM
Last modification on : Monday, August 12, 2019 - 3:39:57 PM

Identifiers

  • HAL Id : hal-02185423, version 1

Citation

Oriane Dermy, Sofiane Boucenna, Alexandre Pitti, Arnaud Blanchard. Developmental Learning of Audio-Visual Integration From Facial Gestures Of a Social Robot. 2019. ⟨hal-02185423⟩
