Conference papers

Multi-corpus Experiment on Continuous Speech Emotion Recognition: Convolution or Recurrence?

Abstract : Extracting semantic information such as emotions from real-life speech is a challenging task that has grown in popularity over the last few years. Recently, emotion processing in speech has moved from discrete emotional categories to continuous affective dimensions. This trend helps in the design of systems that predict the dynamic evolution of affect in speech. However, no standard annotation guidelines exist for these dimensions, which makes cross-corpus studies hard to achieve. Deep neural networks now predominate in emotion recognition. Almost all systems use recurrent architectures, but convolutional networks were recently reassessed because they are faster to train and have fewer parameters than recurrent ones. This paper investigates the pros and cons of these two architectures using cross-corpus experiments to highlight the issue of corpus variability. We also explore the most suitable acoustic representation for continuous emotion, together with loss functions. We conclude that recurrent networks are robust to corpus variability, and we confirm the power of cepstral features for continuous Speech Emotion Recognition (SER), especially for satisfaction prediction. A final post-processing step applied to the predictions yields a strong result (CCC = 0.719) on AlloSat and achieves a new state of the art.
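The abstract reports its final score as a CCC value. Assuming this denotes the concordance correlation coefficient, which is commonly used to score continuous emotion predictions against annotations, a minimal sketch of the metric could look as follows. The function name `concordance_cc` and the toy sequences are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
def concordance_cc(predictions, references):
    """Concordance correlation coefficient of two equal-length sequences.

    CCC = 2 * cov(p, r) / (var(p) + var(r) + (mean(p) - mean(r))**2),
    using population (divide-by-n) variance and covariance.
    """
    n = len(predictions)
    if n == 0 or n != len(references):
        raise ValueError("sequences must be non-empty and of equal length")

    mean_p = sum(predictions) / n
    mean_r = sum(references) / n

    var_p = sum((p - mean_p) ** 2 for p in predictions) / n
    var_r = sum((r - mean_r) ** 2 for r in references) / n
    cov = sum((p - mean_p) * (r - mean_r)
              for p, r in zip(predictions, references)) / n

    denom = var_p + var_r + (mean_p - mean_r) ** 2
    if denom == 0:
        # Both sequences are constant and equal; treating this as perfect
        # agreement is a convention chosen for this sketch.
        return 1.0
    return 2 * cov / denom


# Toy example (made-up values): a constant offset lowers the CCC even
# though the two sequences are perfectly linearly correlated.
identical = concordance_cc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = concordance_cc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Unlike Pearson correlation, the squared mean difference in the denominator penalises systematic bias, so a prediction that follows the annotation's shape but sits at the wrong level scores below 1.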
Contributor : Marie Tahon
Submitted on : Monday, December 7, 2020 - 10:06:59 AM
Last modification on : Wednesday, September 22, 2021 - 11:26:05 AM
Long-term archiving on : Monday, March 8, 2021 - 6:25:37 PM


Files produced by the author(s)


  • HAL Id : hal-02945644, version 1



Manon Macary, Martin Lebourdais, Marie Tahon, Yannick Estève, Anthony Rousseau. Multi-corpus Experiment on Continuous Speech Emotion Recognition: Convolution or Recurrence? 22nd International Conference on Speech and Computer, SPECOM 2020, Oct 2020, St Petersburg, Russia. ⟨hal-02945644⟩


