Poster communications

Audio-visual emotion recognition: A dynamic, multimodal approach

Abstract : Designing systems able to interact with students in a natural manner is a complex and far-from-solved problem. A key aspect of natural interaction is the ability to understand and appropriately respond to human emotions. This paper details our response to the continuous Audio/Visual Emotion Challenge (AVEC'12), whose goal is to predict four affective signals describing human emotions. The proposed method uses Fourier spectra to extract multi-scale dynamic descriptions of signals characterizing face appearance, head movements and voice. We perform a kernel regression with very few representative samples, selected via a supervised weighted-distance-based clustering, which leads to high generalization power. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments demonstrate the effectiveness of each key component of the proposed method, and our results on the challenge data were the highest among 10 international research teams.
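
As a rough illustration of two ideas named in the abstract, the sketch below computes log-magnitude Fourier spectra over trailing windows of several lengths and then applies a kernel regression over a small set of reference samples. It is not the authors' implementation: the function names, window sizes, number of coefficients, Gaussian kernel and Nadaraya-Watson form are all assumptions chosen for illustration, and the supervised clustering and fusion steps are not shown.

```python
import numpy as np

def multiscale_spectra(signal, window_sizes=(16, 32), n_coeffs=4):
    """Log-magnitude Fourier descriptors of the trailing window at each time step.

    window_sizes and n_coeffs are illustrative values, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    features = []
    for t in range(len(signal)):
        row = []
        for w in window_sizes:
            start = max(0, t - w + 1)
            window = signal[start:t + 1]
            # Zero-pad on the left so early windows also have length w.
            window = np.pad(window, (w - len(window), 0))
            spectrum = np.abs(np.fft.rfft(window))[:n_coeffs]
            row.extend(np.log1p(spectrum))
        features.append(row)
    return np.array(features)

def kernel_regression(x_query, x_ref, y_ref, bandwidth=1.0):
    """Nadaraya-Watson estimate with a Gaussian kernel (an assumed kernel form)."""
    # Squared Euclidean distance between every query and every reference sample.
    d2 = ((x_query[:, None, :] - x_ref[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row minimum for numerical stability; the weight ratio is unchanged.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    weights = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return weights @ y_ref / weights.sum(axis=1)
```

In this sketch the prediction for each query is a weighted average of the reference labels, so keeping only a few reference samples makes prediction cheap; how those samples are chosen is left out here.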

Cited literature : 29 references

https://hal.archives-ouvertes.fr/hal-01089628
Contributor : Ihm14 Ihm14
Submitted on : Tuesday, December 2, 2014 - 9:26:24 AM
Last modification on : Thursday, January 23, 2020 - 12:04:08 AM
Document(s) archived on : Tuesday, March 3, 2015 - 10:35:57 AM

File

p44-nicole.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01089628, version 1

Citation

Jeremie Nicole, Vincent Rapp, Kevin Bailly, Lionel Prevost, Mohamed Chetouani. Audio-visual emotion recognition: A dynamic, multimodal approach. IHM'14, 26e conférence francophone sur l'Interaction Homme-Machine, Oct 2014, Lille, France. pp.44-51, 2014. ⟨hal-01089628⟩
