Conference papers

End2End Acoustic to Semantic Transduction

Abstract: In this paper, we propose a novel end-to-end sequence-to-sequence spoken language understanding model using an attention mechanism. It reliably selects contextual acoustic features in order to hypothesize semantic content. An initial architecture capable of extracting all pronounced words and concepts from acoustic spans is designed and tested. With a shallow fusion language model, this system reaches a concept error rate (CER) of 13.6 and a concept value error rate (CVER) of 18.5 on the French MEDIA corpus, an absolute reduction of 2.8 points compared to the state of the art. Then, an original model is proposed for hypothesizing concepts and their values. This transduction reaches a CER of 15.4 and a CVER of 21.6 without any new type of context.
Contributor: Valentin Pelloin
Submitted on: Tuesday, February 2, 2021 - 9:36:59 AM
Last modification on: Tuesday, May 18, 2021 - 4:49:23 PM





Valentin Pelloin, Nathalie Camelin, Antoine Laurent, Renato de Mori, Antoine Caubrière, et al. End2End Acoustic to Semantic Transduction. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun 2021, Toronto, ON, Canada. ⟨10.1109/ICASSP39728.2021.9413581⟩. ⟨hal-03128163⟩


