Conference paper. Year: 2014

Robust Articulatory Speech Synthesis using Deep Neural Networks for BCI Applications

Abstract

Brain-Computer Interfaces (BCIs) usually offer typing strategies to restore communication for paralyzed and aphasic people. A more natural approach would be a speech BCI that directly controls a speech synthesizer. Toward this goal, a prerequisite is the development of a synthesizer that should i) produce intelligible speech, ii) run in real time, iii) depend on as few parameters as possible, and iv) be robust to error fluctuations on the control parameters. In this context, we describe here an articulatory-to-acoustic mapping approach based on a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded synchronously with produced speech sounds. On this corpus, the DNN-based model provided a speech synthesis quality (as assessed by automatic speech recognition and behavioral testing) comparable to a state-of-the-art Gaussian mixture model (GMM), yet showed higher robustness when noise was added to the EMA coordinates. Moreover, with a view to BCI applications, this robustness was also assessed when the space covered by the 12 original articulatory parameters was reduced to 7 parameters using deep auto-encoders (DAE). Given that this method can be implemented in real time, DNN-based articulatory speech synthesis seems a good candidate for speech BCI applications.
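As a rough illustration of the kind of articulatory-to-acoustic mapping and noise test the abstract describes, the sketch below trains a small feed-forward network in NumPy on synthetic data, then measures its error as Gaussian noise is added to the inputs. Only the 12 articulatory parameters come from the abstract. The hidden size, the 25 output dimensions, the synthetic data, the learning rate, and every function name are assumptions made for illustration; this is not the authors' architecture, corpus, or evaluation protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# 12 articulatory inputs as in the abstract; 32 hidden units and
# 25 acoustic outputs are illustrative assumptions.
N_ART, N_HID, N_AC = 12, 32, 25

# Synthetic stand-in for EMA frames and acoustic targets (not real data).
X = rng.standard_normal((500, N_ART))
W_true = rng.standard_normal((N_ART, N_AC)) / np.sqrt(N_ART)
Y = np.tanh(X @ W_true)


def init_params(rng):
    """Small random weights for a one-hidden-layer network."""
    return {
        "W1": rng.standard_normal((N_ART, N_HID)) * 0.1,
        "b1": np.zeros(N_HID),
        "W2": rng.standard_normal((N_HID, N_AC)) * 0.1,
        "b2": np.zeros(N_AC),
    }


def forward(p, x):
    """Return the hidden activations and the predicted acoustic frame."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h, h @ p["W2"] + p["b2"]


def mse(p, x, y):
    """Mean squared error over all frames and output dimensions."""
    return float(np.mean((forward(p, x)[1] - y) ** 2))


def train(p, x, y, lr=0.2, epochs=500):
    """Full-batch gradient descent on the mean squared error."""
    n = x.shape[0]
    for _ in range(epochs):
        h, out = forward(p, x)
        d_out = 2.0 * (out - y) / (n * y.shape[1])  # dLoss/d(out)
        d_h = (d_out @ p["W2"].T) * (1.0 - h ** 2)  # backprop through tanh
        p["W2"] -= lr * (h.T @ d_out)
        p["b2"] -= lr * d_out.sum(axis=0)
        p["W1"] -= lr * (x.T @ d_h)
        p["b1"] -= lr * d_h.sum(axis=0)
    return p


def noise_sweep(p, x, y, sigmas, rng):
    """MSE when zero-mean Gaussian noise of each std is added to the inputs."""
    return [mse(p, x + s * rng.standard_normal(x.shape), y) for s in sigmas]


params = init_params(rng)
initial_loss = mse(params, X, Y)
params = train(params, X, Y)
final_loss = mse(params, X, Y)
errors = noise_sweep(params, X, Y, sigmas=[0.0, 0.1, 0.5], rng=rng)
print(initial_loss, final_loss, errors)
```

Comparing the entries of `errors` across noise levels gives a simple robustness curve; the same sweep could be run on any alternative mapping model to compare how quickly each degrades.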
Main file
Bocquelet_Hueber_Badin_Girin_Yvert_SpeechSynthesisFromArticulationForBCI_Interpeech_2014.pdf (213.21 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01228891, version 1 (18-11-2015)

Identifiers

  • HAL Id: hal-01228891, version 1

Cite

Florent Bocquelet, Thomas Hueber, Laurent Girin, Pierre Badin, Blaise Yvert. Robust Articulatory Speech Synthesis using Deep Neural Networks for BCI Applications. Interspeech 2014 - 15th Annual Conference of the International Speech Communication Association, Sep 2014, Singapore, Singapore. ⟨hal-01228891⟩
