Spectral and Cepstral Audio Noise Reduction Techniques in Speech Emotion Recognition

Jouni Pohjalainen; Fabien Ringeval; Zixing Zhang; Björn Schuller

doi:10.1145/2964284.2967306

Conference paper. Year: 2016



Jouni Pohjalainen

Role: Author

Chair of Complex and Intelligent Systems

Fabien Ringeval

Role: Author
PersonId : 13134
IdHAL : fabien-ringeval
ORCID : 0000-0002-9213-4529
IdRef : 154573078

Chair of Complex and Intelligent Systems

Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole

Zixing Zhang

Role: Author

Chair of Complex and Intelligent Systems

Björn Schuller

Role: Author

Chair of Complex and Intelligent Systems

Department of Computing [London]

Abstract

Signal noise reduction can improve the performance of machine learning systems dealing with time signals such as audio. Real-life applicability of these recognition technologies requires the system to uphold its performance level in variable, challenging conditions such as noisy environments. In this contribution, we investigate audio signal denoising methods in cepstral and log-spectral domains and compare them with common implementations of standard techniques. The different approaches are first compared generally using averaged acoustic distance metrics. They are then applied to automatic recognition of spontaneous and natural emotions under simulated smartphone-recorded noisy conditions. Emotion recognition is implemented as support vector regression for continuous-valued prediction of arousal and valence on a realistic multimodal database. In the experiments, the proposed methods are found to generally outperform standard noise reduction algorithms.
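For orientation, the kind of standard baseline the abstract refers to can be illustrated with basic magnitude spectral subtraction. The sketch below is not the authors' proposed cepstral or log-spectral method, which this record does not describe in detail. The function names, frame length, hop size, spectral floor, and the assumption that the first few frames contain only noise are all illustrative choices.

```python
import numpy as np


def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def spectral_subtraction(x, noise_frames=10, frame_len=256, hop=128, floor=0.01):
    """Basic magnitude spectral subtraction (illustrative baseline only).

    Assumption: the first `noise_frames` frames contain noise only, so their
    mean magnitude spectrum serves as the noise estimate.
    """
    window = np.hanning(frame_len)
    frames = frame_signal(x, frame_len, hop) * window
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the noise estimate; the small spectral floor keeps every
    # magnitude non-negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Resynthesize with the noisy phase, then weighted overlap-add.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    for i, frame in enumerate(clean_frames):
        start = i * hop
        out[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2
    # Samples near the signal edges are covered by a single tapered frame
    # (or by none) and are therefore less reliable than the interior.
    return out / np.maximum(norm, 1e-8)
```

In use, one would pass a noisy waveform as a NumPy array and compare features extracted from the output against those from the unprocessed signal. A practical system would replace the fixed noise-only assumption with a proper noise tracker.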

Keywords

Noise reduction; Denoising; Speech emotion recognition

Domains

Signal and image processing [eess.SP]; Information retrieval [cs.IR]

Main file

tmp.pdf (367.69 KB)

Origin: Files produced by the author(s)

Contributor: Fabien Ringeval

https://hal.science/hal-01494062

Submitted on: Wednesday, 22 March 2017, 16:11:01

Last modified on: Thursday, 4 April 2024, 20:57:58

Long-term archiving on: Friday, 23 June 2017, 13:49:15

Dates and versions

hal-01494062 , version 1 (22-03-2017)

License

Public domain

Identifiers

HAL Id : hal-01494062 , version 1
DOI : 10.1145/2964284.2967306

Cite

Jouni Pohjalainen, Fabien Ringeval, Zixing Zhang, Björn Schuller. Spectral and Cepstral Audio Noise Reduction Techniques in Speech Emotion Recognition. Proceedings of the 24th ACM International Conference on Multimedia (ACM MM), 2016, Amsterdam, Netherlands. pp.670 - 674, ⟨10.1145/2964284.2967306⟩. ⟨hal-01494062⟩


Collections

UGA CNRS LIG LIG_TDCGE_GETALP LIG_SIDCH

254 views

2279 downloads
