Enhancement of Emotion Detection in Spoken Dialogue Systems by Combining Several Information Sources

Ramón López-Cózar; Jan Silovsky; Martin Kroul

doi:10.1016/j.specom.2011.01.006

Journal article, Speech Communication, Year: 2011

Ramón López-Cózar

Role: Corresponding author
PersonId: 873384

Faculty of Computer Science

Jan Silovsky

Role: Author

Faculty of Mechatronics

Martin Kroul

Role: Author

Faculty of Mechatronics

Abstract

This paper proposes a technique to enhance emotion detection in spoken dialogue systems by means of two modules that combine different information sources. The first module, called Fusion-0, combines emotion predictions generated by a set of classifiers that deal with different kinds of information about each sentence uttered by the user. To do this, the module employs several information fusion methods that produce further predictions about the emotional state of the user. These predictions are the input to the second information fusion module, called Fusion-1, where they are combined to deduce the emotional state of the user. Fusion-0 represents a method employed in previous studies to enhance classification rates, whereas Fusion-1 represents the novelty of the technique, which is the combination of the emotion predictions generated by Fusion-0. One advantage of the technique is that it can be applied as a post-processing stage to any other method that combines information from different sources at the decision level. This is possible because the technique works on the predictions (outputs) of those methods, without interfering with the procedure used to obtain them. Another advantage is that the technique can be implemented as a modular architecture, which facilitates its setup within a spoken dialogue system as well as the deduction of the user's emotional state in real time. Experiments were carried out with classifiers that deal with prosodic, acoustic, lexical and dialogue-act information, and with three methods to combine information: multiplication of probabilities, average of probabilities and unweighted vote. The results show that the technique improves on the classification rates of the standard fusion by 2.27% and 3.38% absolute in experiments considering two and three emotion categories, respectively.
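
The two-stage arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The three combination rules (multiplication of probabilities, average of probabilities, unweighted vote) are the ones named in the abstract, but the abstract does not say how Fusion-1 combines the Fusion-0 outputs, so the majority vote used in `fusion_1` is an assumption made only for illustration. The function names, the class labels "neutral" and "negative", and all probability values are likewise hypothetical.

```python
from collections import Counter
from math import prod


def product_rule(dists):
    """Multiply per-class probabilities across classifiers; return the best class."""
    scores = {c: prod(d[c] for d in dists) for c in dists[0]}
    return max(scores, key=scores.get)


def average_rule(dists):
    """Average per-class probabilities across classifiers; return the best class."""
    scores = {c: sum(d[c] for d in dists) / len(dists) for c in dists[0]}
    return max(scores, key=scores.get)


def unweighted_vote(dists):
    """Each classifier votes for its most probable class; return the most voted class."""
    votes = Counter(max(d, key=d.get) for d in dists)
    return votes.most_common(1)[0][0]


def fusion_0(dists):
    """Apply each combination rule to the classifier outputs: one prediction per rule."""
    return [rule(dists) for rule in (product_rule, average_rule, unweighted_vote)]


def fusion_1(predictions):
    """ASSUMPTION: combine the Fusion-0 predictions by majority vote."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical class probabilities for one user utterance, one dict per classifier.
classifier_outputs = [
    {"neutral": 0.60, "negative": 0.40},  # prosodic
    {"neutral": 0.55, "negative": 0.45},  # acoustic
    {"neutral": 0.10, "negative": 0.90},  # lexical
    {"neutral": 0.60, "negative": 0.40},  # dialogue acts
]

stage_0 = fusion_0(classifier_outputs)
final_emotion = fusion_1(stage_0)
print(stage_0, final_emotion)
```

In this made-up case the rules disagree: the product and the average favour "negative" because the lexical classifier is very confident, while the unweighted vote favours "neutral". A second stage that works only on these outputs, as the abstract describes for Fusion-1, can resolve such disagreements without changing how the first-stage predictions were obtained.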

Keywords

Adaptive spoken dialogue systems; combination of classifiers; information fusion; emotion detection; human-computer interaction

Domains

Linguistics

Main file

PEER_stage2_10.1016%2Fj.specom.2011.01.006.pdf (939.13 KB)

Origin: Files produced by the author(s)

Contributor: Hal Peer

https://hal.science/hal-00779291

Submitted on: Tuesday, 22 January 2013, 03:51:32

Last modified on: Monday, 12 December 2022, 11:50:05

Long-term archiving on: Saturday, 1 April 2017, 08:06:16

Dates and versions

hal-00779291 , version 1 (22-01-2013)

Identifiers

HAL Id: hal-00779291, version 1
DOI: 10.1016/j.specom.2011.01.006

Cite

Ramón López-Cózar, Jan Silovsky, Martin Kroul. Enhancement of Emotion Detection in Spoken Dialogue Systems by Combining Several Information Sources. Speech Communication, 2011, 53 (9-10), pp.1210. ⟨10.1016/j.specom.2011.01.006⟩. ⟨hal-00779291⟩

Collections

PEER

58 views

138 downloads