Author-topic based representation of call-center conversations

Abstract : The performance of Automatic Speech Recognition (ASR) systems drops dramatically when transcribing conversations recorded in noisy conditions, and speech analytics suffer from the resulting poor transcription quality. One way to tackle this difficulty is to map transcriptions into a space of hidden topics. This abstract representation helps to mitigate the drawbacks of the ASR process. The most widely used such representation is the topic-based one obtained from Latent Dirichlet Allocation (LDA). Several studies have demonstrated the effectiveness and reliability of this high-level representation. During LDA training, the distribution of words within each topic is estimated automatically. However, in the context of a classification task, the targeted classes are not taken into account. If the targeted application is to identify the main theme of a dialogue, this information should be considered. In this paper, we compare a classical topic-based representation of a dialogue with a new one based not only on the dialogue content itself (words) but also on the theme related to the dialogue. This original representation is based on the author-topic (AT) model. The effectiveness of the proposed representation is evaluated on a classification task using automatic transcriptions of dialogues between an agent and a customer of the Paris Transportation Company. Experiments confirm that the author-topic approach clearly outperforms the classical topic representation, with a gain of more than 7% in correctly labeled conversations.
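To make the idea of treating each theme as an "author" concrete, here is a minimal sketch of how a dialogue could be scored against each theme under an author-topic model. It is an illustration only: the abstract does not describe the paper's actual training or classification procedure, and the theme names, the two topics, the probability values, and the helper names `log_likelihood` and `predict_theme` are all hypothetical stand-ins for a trained model.

```python
import math

# Hypothetical, hand-set parameters standing in for a trained author-topic model
# in which each dialogue theme plays the role of an "author".
# theta[theme][k] = P(topic k | theme); phi[k][word] = P(word | topic k).
theta = {
    "itinerary": [0.9, 0.1],
    "lost_item": [0.2, 0.8],
}
phi = [
    {"bus": 0.4, "line": 0.4, "bag": 0.1, "lost": 0.1},
    {"bus": 0.1, "line": 0.1, "bag": 0.4, "lost": 0.4},
]

def log_likelihood(words, theme, floor=1e-6):
    """Log P(words | theme) = sum over words of log sum_k theta[theme][k] * phi[k][word]."""
    total = 0.0
    for w in words:
        p = sum(t * topic.get(w, 0.0) for t, topic in zip(theta[theme], phi))
        total += math.log(max(p, floor))  # floor guards against words unseen in phi
    return total

def predict_theme(words):
    """Return the theme under which the transcribed words are most likely."""
    return max(theta, key=lambda theme: log_likelihood(words, theme))

print(predict_theme(["lost", "bag", "bus"]))
```

In this toy setting the class information enters through `theta`, which is indexed by theme rather than by document; that is the contrast with a plain LDA representation that the abstract draws.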
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01318662
Contributor : Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on : Thursday, May 19, 2016 - 5:34:31 PM
Last modification on : Saturday, March 23, 2019 - 1:22:39 AM

Citation

Mohamed Morchid, Richard Dufour, Mohamed Bouallegue, Georges Linarès. Author-topic based representation of call-center conversations. IEEE Spoken Language Technology Workshop (SLT), Dec 2014, South Lake Tahoe, United States. ⟨10.1109/SLT.2014.7078577⟩. ⟨hal-01318662⟩
