Conference papers

Towards End-to-End spoken intent recognition in smart home

Abstract: Voice-based interaction in a smart home has become a feature of many industrial products. These systems react to voice commands, whether for answering a question, providing music or turning on the lights. To be efficient, these systems must be able to extract the intent of the user from the voice command. Intent recognition from voice is typically performed as a pipeline: automatic speech recognition (ASR) followed by intent classification on the transcriptions. However, the errors accumulated at the ASR stage might severely impact the intent classifier. In this paper, we propose an End-to-End (E2E) model to perform intent classification directly from the raw speech input. The E2E approach is thus optimized for this specific task and avoids error propagation. Furthermore, prosodic aspects of the speech signal can be exploited by the E2E model for intent classification (e.g., question vs. imperative voice). Experiments on a corpus of voice commands acquired in a real smart home reveal that the state-of-the-art pipeline baseline is still superior to the E2E approach. However, using artificial data generation techniques, we show that the E2E model can be significantly improved to reach competitive performance. This opens the way to further research on E2E Spoken Language Understanding.

Cited literature: 33 references
Contributor: Michel Vacher
Submitted on: Thursday, October 17, 2019 - 9:50:07 AM
Last modification on: Tuesday, December 8, 2020 - 10:31:13 AM
Long-term archiving on: Saturday, January 18, 2020 - 12:30:37 PM


2019_SPED_Desot_final (2).pdf
Files produced by the author(s)


  • HAL Id: hal-02316743, version 1



Thierry Desot, François Portet, Michel Vacher. Towards End-to-End spoken intent recognition in smart home. SpeD 2019 – The 10th Conference on Speech Technology and Human Computer Dialogue, Oct 2019, Timisoara, Romania. pp.1-8. ⟨hal-02316743⟩


