Journal articles
Training universal background models with restricted data for speech emotion recognition

Abstract: Speech emotion recognition (SER) is an important research topic that relies heavily on emotional data. Although SER has seen recent advancements, the Universal Background Model (UBM), a standard reference concept from the neighbouring field of speaker recognition, remains the base module for newly developed methods such as Joint Factor Analysis. In principle, a UBM is a Gaussian model trained on an extensive and representative set of speech samples recorded from different target classes in order to capture general feature characteristics. Obtaining a large amount of emotional data to train a UBM is challenging, further complicated by the cost of annotation and the ambiguity of the resulting labels. In addition, the resulting model depends on the training data. In this paper, we present a preliminary exploration of a new approach: training UBMs, called restricted UBMs, on a small amount of speech that may even differ from the training data. Experiments show that this approach yields a domain-independent UBM capable of producing an acoustic model transferable to different datasets. Four standard benchmark speech databases in different languages were used for the experimental evaluation. The results show that our proposed model outperforms existing state-of-the-art baselines. Moreover, we applied this approach to emotional speaker recognition.
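To make the UBM concept concrete, the sketch below fits a Gaussian mixture model on pooled acoustic frames, in the spirit of the standard UBM pipeline the abstract describes. This is a minimal illustration, not the authors' implementation: the random arrays stand in for real acoustic features (e.g. MFCC frames), and all names and parameters (component count, feature dimension) are assumptions chosen for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical stand-in for acoustic features (e.g. MFCC frames).
# Under the "restricted UBM" idea, this pool could be small and
# drawn from data different from the target training set.
rng = np.random.default_rng(0)
features = rng.standard_normal((2000, 13))  # 2000 frames x 13 coefficients

# A UBM is a GMM fit on frames pooled across all classes/speakers,
# capturing general feature characteristics rather than any one class.
ubm = GaussianMixture(n_components=16, covariance_type="diag",
                      max_iter=100, random_state=0)
ubm.fit(features)

# Frame-level statistics of a new utterance against the UBM; such
# statistics feed downstream adaptation methods (e.g. MAP adaptation
# or Joint Factor Analysis).
utterance = rng.standard_normal((150, 13))
posteriors = ubm.predict_proba(utterance)  # responsibilities per frame
zeroth = posteriors.sum(axis=0)            # zeroth-order statistics
```

In a real system the UBM would then be adapted per class or per speaker; the point of the restricted-UBM approach is that the pooled training frames can be few and out-of-domain.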
https://hal.archives-ouvertes.fr/hal-03194458
Contributor: Filipo Studzinski Perotto
Submitted on: Friday, April 9, 2021 - 3:21:16 PM
Last modification on: Monday, July 4, 2022 - 10:21:32 AM

Citation

Imen Trabelsi, Filipo Studzinski Perotto, Usman Malik. Training universal background models with restricted data for speech emotion recognition. Journal of Ambient Intelligence and Humanized Computing, Springer, 2021, ⟨10.1007/s12652-021-03200-1⟩. ⟨hal-03194458⟩
