Conference paper. Year: 2014

Author-topic based representation of call-center conversations

Mohamed Morchid
Richard Dufour
Mohamed Bouallegue
Georges Linarès

Abstract

The performance of Automatic Speech Recognition (ASR) systems drops dramatically when transcribing conversations recorded in noisy conditions, and speech analytics suffer from the resulting poor transcription quality. One way to tackle this difficulty is to map transcriptions into a space of hidden topics. This abstract representation makes it possible to work around the drawbacks of the ASR process. The best-known and most commonly used such representation is the topic-based one obtained from Latent Dirichlet Allocation (LDA). Several studies demonstrate the effectiveness and reliability of this high-level representation. During the LDA learning process, the distribution of words within each topic is estimated automatically. In the context of a classification task, however, the targeted classes are not taken into account. Thus, if the targeted application is to find the main theme of a dialogue, this information should be taken into consideration. In this paper, we propose to compare a classical topic-based representation of a dialogue with a new one based not only on the dialogue content itself (words) but also on the theme related to the dialogue. This original representation is based on the author-topic (AT) model. The effectiveness of the proposed representation is evaluated on a classification task using automatic transcriptions of dialogues between an agent and a customer of the Paris Transportation Company. Experiments confirm that the author-topic approach clearly outperforms the classical topic representation, with a substantial gain of more than 7% in terms of correctly labeled conversations.
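
The idea of treating each dialogue theme as an "author" with its own distribution over hidden topics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two topics, the two themes, the vocabulary, and all probability values are hypothetical toy numbers, not parameters learned in the paper, and the scoring rule (a uniform prior over themes and a small floor probability for unknown words) is one simple choice rather than the authors' exact procedure.

```python
import math

# Hypothetical toy parameters, chosen by hand for illustration only.
# P(word | topic) for two hidden topics.
WORD_GIVEN_TOPIC = {
    "fares":   {"ticket": 0.50, "price": 0.40, "bus": 0.05, "late": 0.05},
    "traffic": {"ticket": 0.05, "price": 0.05, "bus": 0.50, "late": 0.40},
}

# P(topic | theme): each theme plays the role of an "author".
TOPIC_GIVEN_THEME = {
    "pricing":  {"fares": 0.9, "traffic": 0.1},
    "schedule": {"fares": 0.2, "traffic": 0.8},
}

UNKNOWN_WORD_PROB = 1e-6  # assumed floor for words outside the toy vocabulary


def log_likelihood(words, theme):
    """Log P(words | theme), mixing over hidden topics for each word."""
    total = 0.0
    for word in words:
        p_word = sum(
            TOPIC_GIVEN_THEME[theme][topic]
            * WORD_GIVEN_TOPIC[topic].get(word, UNKNOWN_WORD_PROB)
            for topic in WORD_GIVEN_TOPIC
        )
        total += math.log(p_word)
    return total


def theme_posterior(words):
    """P(theme | words) under a uniform prior over themes."""
    scores = {theme: log_likelihood(words, theme) for theme in TOPIC_GIVEN_THEME}
    best = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    unnormalized = {theme: math.exp(s - best) for theme, s in scores.items()}
    norm = sum(unnormalized.values())
    return {theme: value / norm for theme, value in unnormalized.items()}


def classify(words):
    """Return the theme with the highest posterior probability."""
    posterior = theme_posterior(words)
    return max(posterior, key=posterior.get)
```

In this sketch, the vector returned by `theme_posterior` is a representation of the dialogue in the space of themes rather than in the space of raw topics, which is the contrast the abstract draws with a plain LDA representation.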
File not deposited

Dates and versions

hal-01318662, version 1 (19-05-2016)

Cite

Mohamed Morchid, Richard Dufour, Mohamed Bouallegue, Georges Linarès. Author-topic based representation of call-center conversations. IEEE Spoken Language Technology Workshop (SLT), Dec 2014, South Lake Tahoe, United States. ⟨10.1109/SLT.2014.7078577⟩. ⟨hal-01318662⟩

Collections

UNIV-AVIGNON LIA