Conference paper, Year: 2020

Modeling ASR Ambiguity for Neural Dialogue State Tracking

Abstract

Spoken dialogue systems typically use one or several (top-N) ASR sequence(s) for inferring the semantic meaning and tracking the state of the dialogue. However, ASR graphs, such as confusion networks (confnets), provide a compact representation of a richer hypothesis space than a top-N ASR list. In this paper, we study the benefits of using confusion networks with a neural dialogue state tracker (DST). We encode the 2-dimensional confnet into a 1-dimensional sequence of embeddings using a confusion network encoder that can be used with any DST system. Our confnet encoder is plugged into the 'Global-Locally Self-Attentive Dialogue State Tracker' (GLAD) model for DST and obtains significant improvements in both accuracy and inference time compared to using top-N ASR hypotheses.
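As a rough illustration of what encoding a 2-dimensional confnet into a 1-dimensional sequence of embeddings can mean, the Python sketch below collapses each set of competing arcs into a single vector by posterior-weighted averaging. The pooling rule, the function name `encode_confnet`, and the toy embeddings are assumptions made for illustration only; the abstract does not specify how the paper's encoder combines arcs, and the actual model may differ.

```python
from typing import Dict, List, Tuple

# A confusion network is modeled here as a sequence of bins; each bin holds
# the competing (word, posterior) arcs for one time slot.
Confnet = List[List[Tuple[str, float]]]


def encode_confnet(confnet: Confnet,
                   embeddings: Dict[str, List[float]]) -> List[List[float]]:
    """Collapse each bin into one vector (hypothetical pooling rule):
    the posterior-weighted mean of the embeddings of its arcs."""
    dim = len(next(iter(embeddings.values())))
    sequence = []
    for arcs in confnet:
        total = sum(p for _, p in arcs)
        if total <= 0:
            raise ValueError("each bin needs positive total posterior mass")
        pooled = [0.0] * dim
        for word, p in arcs:
            vec = embeddings[word]
            for i in range(dim):
                # Renormalize so the weights inside a bin sum to 1.
                pooled[i] += (p / total) * vec[i]
        sequence.append(pooled)
    return sequence


# Toy 2-dimensional embeddings (made-up values, for illustration only).
toy_embeddings = {
    "cheap": [1.0, 0.0],
    "chip": [0.0, 1.0],
    "food": [1.0, 1.0],
}

# Two bins: an ambiguous first word, then an unambiguous second word.
toy_confnet = [
    [("cheap", 0.75), ("chip", 0.25)],
    [("food", 1.0)],
]

encoded = encode_confnet(toy_confnet, toy_embeddings)
print(encoded)
```

The output has one vector per bin, so a downstream tracker that expects a plain sequence of word embeddings can consume it without modification, whereas a top-N list would require encoding N separate sequences.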
Main file
1783anav.pdf (308.2 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02962177, version 1 (09-10-2020)

Identifiers

  • HAL Id: hal-02962177, version 1

Cite

Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier. Modeling ASR Ambiguity for Neural Dialogue State Tracking. Interspeech 2020, Oct 2020, Shanghai (Virtual Conf), China. ⟨hal-02962177⟩
