Conference paper, Year: 2020

Multi-corpus Experiment on Continuous Speech Emotion Recognition: Convolution or Recurrence?

Abstract

Extracting semantic information such as emotions from real-life speech is a challenging task that has grown in popularity over the last few years. Recently, emotion processing in speech has moved from discrete emotional categories to continuous affective dimensions. This trend helps in the design of systems that predict the dynamic evolution of affect in speech. However, no standard annotation guidelines exist for these dimensions, which makes cross-corpus studies hard to carry out. Deep neural networks now dominate the task of emotion recognition. Almost all systems use recurrent architectures, but convolutional networks were recently reassessed because they are faster to train and have fewer parameters than recurrent ones. This paper investigates the pros and cons of these two architectures using cross-corpus experiments to highlight the issue of corpus variability. We also explore the most suitable acoustic representation for continuous emotion, together with loss functions. We conclude that recurrent networks are robust to corpus variability, and we confirm the power of cepstral features for continuous Speech Emotion Recognition (SER), especially for satisfaction prediction. A final post-processing step applied to the predictions yields a strong result (CCC = 0.719) on AlloSat, achieving a new state of the art.
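The abstract reports its headline result as a CCC of 0.719 without defining the metric. Assuming this refers to Lin's concordance correlation coefficient, which is commonly used to score continuous emotion predictions against a reference annotation, a minimal sketch of the computation could look like the following. The function name and the toy values are illustrative and do not come from the paper.

```python
def concordance_cc(reference, prediction):
    """Concordance correlation coefficient between two equal-length sequences.

    Assumed formula (Lin's CCC, population statistics):
        2 * cov(r, p) / (var(r) + var(p) + (mean(r) - mean(p)) ** 2)
    """
    n = len(reference)
    if n == 0 or n != len(prediction):
        raise ValueError("sequences must be non-empty and of equal length")

    mean_r = sum(reference) / n
    mean_p = sum(prediction) / n

    # Population (divide-by-n) variances and covariance.
    var_r = sum((r - mean_r) ** 2 for r in reference) / n
    var_p = sum((p - mean_p) ** 2 for p in prediction) / n
    cov = sum((r - mean_r) * (p - mean_p)
              for r, p in zip(reference, prediction)) / n

    denominator = var_r + var_p + (mean_r - mean_p) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denominator


# Toy annotation trace (hypothetical values, not taken from any corpus).
reference = [0.0, 1.0, 2.0]
same_shape_shifted = [1.0, 2.0, 3.0]

perfect = concordance_cc(reference, reference)
shifted = concordance_cc(reference, same_shape_shifted)
```

Under this definition, a prediction that tracks the reference perfectly but with a constant offset scores below 1, whereas Pearson correlation would still give 1. This sensitivity to bias and scale is one reason the measure is a common choice for continuous affect prediction.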
Main file
SPECOM(1).pdf (424.61 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02945644, version 1 (07-12-2020)

Identifiers

  • HAL Id: hal-02945644, version 1

Cite

Manon Macary, Martin Lebourdais, Marie Tahon, Yannick Estève, Anthony Rousseau. Multi-corpus Experiment on Continuous Speech Emotion Recognition: Convolution or Recurrence? 22nd International Conference on Speech and Computer (SPECOM 2020), Oct 2020, St Petersburg, Russia. ⟨hal-02945644⟩
204 views
330 downloads
