
Audio-visual emotion recognition: A dynamic, multimodal approach

Abstract: Designing systems able to interact with students in a natural manner is a complex and far from solved problem. A key aspect of natural interaction is the ability to understand and appropriately respond to human emotions. This paper details our response to the continuous Audio/Visual Emotion Challenge (AVEC'12), whose goal is to predict four affective signals describing human emotions. The proposed method uses Fourier spectra to extract multi-scale dynamic descriptions of signals characterizing face appearance, head movements and voice. We perform kernel regression with very few representative samples, selected via supervised weighted-distance-based clustering, which leads to high generalization power. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments demonstrate the effectiveness of each key component of the proposed method, and our results on the challenge data were the highest among 10 international research teams.
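As a rough illustration of two ideas named in the abstract, the sketch below computes log-magnitude Fourier descriptors over windows of several lengths and then applies a Gaussian kernel regression over a small set of prototype samples. It is not the authors' implementation: the function names, window sizes, number of retained coefficients, the Nadaraya-Watson estimator and the bandwidth `sigma` are all assumptions made for the example, and the prototype selection by supervised clustering and the fusion step are not shown.

```python
import numpy as np

def multiscale_spectra(signal, window_sizes=(16, 32), n_coeffs=4):
    """Log-magnitude Fourier descriptors of the latest window at each scale.

    window_sizes and n_coeffs are hypothetical values chosen for illustration.
    """
    features = []
    for w in window_sizes:
        window = signal[-w:]                    # most recent w samples
        mags = np.abs(np.fft.rfft(window))      # magnitude spectrum of the window
        features.append(np.log1p(mags[:n_coeffs]))  # keep the lowest frequencies
    return np.concatenate(features)

def kernel_regression(x, prototypes, labels, sigma=1.0):
    """Nadaraya-Watson estimate using a Gaussian kernel over a few prototypes."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)      # squared distances to prototypes
    weights = np.exp(-d2 / (2.0 * sigma ** 2))      # Gaussian kernel weights
    return float(np.dot(weights, labels) / np.sum(weights))

# Toy usage: describe a synthetic signal, then regress from two prototypes.
descriptor = multiscale_spectra(np.sin(0.3 * np.arange(64)))
prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([1.0, -1.0])
prediction = kernel_regression(np.array([0.0, 0.0]), prototypes, labels)
```

With only a handful of prototypes, each prediction costs a few distance computations, which is one plausible reading of why a small representative set keeps the regressor fast.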
Document type : Poster communications
Contributor : Ihm14 Ihm14
Submitted on : Tuesday, December 2, 2014 - 9:26:24 AM
Last modification on : Wednesday, May 19, 2021 - 12:12:52 PM
Long-term archiving on : Tuesday, March 3, 2015 - 10:35:57 AM




  • HAL Id : hal-01089628, version 1


Jérémie Nicolle, Vincent Rapp, Kevin Bailly, Lionel Prevost, Mohamed Chetouani. Audio-visual emotion recognition: A dynamic, multimodal approach. IHM'14, 26e conférence francophone sur l'Interaction Homme-Machine, Oct 2014, Lille, France. pp.44-51, 2014. ⟨hal-01089628⟩


