M2H-GAN: A GAN-based Mapping from Machine to Human Transcripts for Speech Understanding

Abstract : Deep learning is at the core of recent spoken language understanding (SLU) tasks. More precisely, deep neural networks (DNNs) have drastically increased the performance of SLU systems, and numerous architectures have been proposed. In the real-life context of theme identification of telephone conversations, it is common to hold both a human, manual (TRS) and an automatically transcribed (ASR) version of the conversations. Nonetheless, due to production constraints, only the ASR transcripts are considered to build automatic classifiers. TRS transcripts are only used to measure the performance of ASR systems. Moreover, the classification accuracy recently obtained by DNN-based systems is close to the performance reached by humans, and it becomes difficult to increase performance further by considering only the ASR transcripts. This paper proposes to distill the TRS knowledge available during the training phase into the ASR representation, by using a new generative adversarial network called M2H-GAN to generate a TRS-like version of an ASR document, in order to improve theme identification performance.
Document type :
Conference papers


https://hal.archives-ouvertes.fr/hal-02107674
Contributor : Titouan Parcollet
Submitted on : Monday, June 17, 2019 - 6:13:36 PM
Last modification on : Friday, June 21, 2019 - 3:48:46 PM

File

INTERSPEECH_2019___GAN_for_SLU...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02107674, version 2

Citation

Titouan Parcollet, Mohamed Morchid, Xavier Bost, Georges Linarès. M2H-GAN: A GAN-based Mapping from Machine to Human Transcripts for Speech Understanding. INTERSPEECH 2019, Sep 2019, Graz, Austria. ⟨hal-02107674v2⟩
