Reconstruction-error-based learning for continuous emotion recognition in speech

Abstract: To improve the performance of continuous emotion recognition from speech, we introduce a reconstruction-error-based (RE-based) learning framework with memory-enhanced Recurrent Neural Networks (RNNs). The framework uses two successive RNN models: the first acts as an autoencoder that reconstructs the original features, and the second performs emotion prediction. The RE of the original features serves as a complementary descriptor, which is merged with the original features and fed to the second model. The underlying assumption is that the system can learn its own 'drawback', which is expressed by the RE. Experimental results on the RECOLA database show that the proposed framework significantly outperforms baseline systems that use no RE information in terms of Concordance Correlation Coefficient (.729 vs. .710 for arousal, .360 vs. .237 for valence), and also significantly outperforms other state-of-the-art methods.
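
The two quantities the abstract relies on, the reconstruction error (RE) merged with the original features and the Concordance Correlation Coefficient used for evaluation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the RE is assumed here to be a per-dimension squared error (the abstract does not specify its exact form), and the autoencoder output is simply passed in as a list of reconstructed frames.

```python
from statistics import fmean


def concordance_cc(pred, gold):
    """Concordance Correlation Coefficient between two equal-length sequences."""
    if len(pred) != len(gold) or not pred:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_p, mean_g = fmean(pred), fmean(gold)
    # Population (biased) variances and covariance.
    var_p = fmean((p - mean_p) ** 2 for p in pred)
    var_g = fmean((g - mean_g) ** 2 for g in gold)
    cov = fmean((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    # Assumption: return 0.0 when both sequences are constant and identical.
    return 2.0 * cov / denom if denom else 0.0


def augment_with_reconstruction_error(frames, reconstructed):
    """Append the per-dimension squared reconstruction error to each frame.

    Assumption: squared error stands in for the RE descriptor; the
    reconstructed frames would come from the first (autoencoder) model.
    """
    augmented = []
    for x, x_hat in zip(frames, reconstructed):
        errors = [(a - b) ** 2 for a, b in zip(x, x_hat)]
        augmented.append(list(x) + errors)
    return augmented
```

In this sketch, the output of `augment_with_reconstruction_error` would be the input to the second (prediction) model, and `concordance_cc` would compare that model's arousal or valence predictions against the gold-standard annotations.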

Cited literature: 25 references

https://hal.archives-ouvertes.fr/hal-01494058
Contributor: Fabien Ringeval
Submitted on: Wednesday, March 22, 2017 - 4:02:02 PM
Last modification on: Tuesday, February 12, 2019 - 1:31:21 AM
Document(s) archived on: Friday, June 23, 2017 - 1:51:55 PM

File

Han17-RLF.pdf
Files produced by the author(s)

Licence


Public Domain

Identifiers

  • HAL Id: hal-01494058, version 1

Citation

Jing Han, Zixing Zhang, Fabien Ringeval, Björn Schuller. Reconstruction-error-based learning for continuous emotion recognition in speech. Proceedings of the 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2017, New Orleans (LA), United States. ⟨hal-01494058⟩

Metrics

Record views: 486

File downloads: 332