End-to-End Automatic Speech Translation of Audiobooks

Alexandre Bérard 1, 2, 3 Laurent Besacier 2, 3, 1 Ali Can Kocabiyikoglu 1, 2 Olivier Pietquin 4, 5
5 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract : We investigate end-to-end speech-to-text translation on a corpus of audiobooks specifically augmented for this task. Previous work investigated the extreme case where source language transcription is available neither during training nor during decoding, but we also study a midway case where source language transcription is available at training time only. In this case, a single model is trained to decode source speech into target text in a single pass. Experimental results show that it is possible to train compact and efficient end-to-end speech translation models in this setup. We also distribute the corpus and hope that our speech translation baseline on this corpus will be challenged in the future.
Document type :
Conference papers

Contributor : Laurent Besacier
Submitted on : Thursday, February 15, 2018 - 10:20:33 AM
Last modification on : Thursday, April 4, 2019 - 10:18:05 AM
Document(s) archived on : Sunday, May 6, 2018 - 4:30:10 AM


Files produced by the author(s)


  • HAL Id : hal-01709586, version 1


Alexandre Bérard, Laurent Besacier, Ali Can Kocabiyikoglu, Olivier Pietquin. End-to-End Automatic Speech Translation of Audiobooks. ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Alberta, Canada. ⟨hal-01709586⟩


