Listen and translate: A proof of concept for end-toend speech-to-text translation, NIPS Workshop on End-to-end Learning for Speech and Audio Processing, 2016. ,
Sequence-to-sequence models can directly transcribe foreign speech, 2017. ,
End-to-End Automatic Speech Translation of Audiobooks, 2018. ,
, -IEEE International Conference on Acoustics, Speech and Signal Processing, 2018.
Pre-training on high-resource speech recognition improves low-resource speech-totext translation, CoRR, 2018. ,
Towards unsupervised speech-to-text translation, CoRR, 2018. ,
Direct speechto-speech translation with a sequence-to-sequence model, CoRR, 2019. ,
Leveraging weakly supervised data to improve end-to-end speechto-text translation, CoRR, 2018. ,
Attention-passing models for robust and dataefficient end-to-end speech translation, CoRR, 2019. ,
How2: a largescale dataset for multimodal language understanding, 2018. ,
Must-c: a multilingual speech translation corpus, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol.1, pp.2012-2017, 2019. ,
Lium spkdiarization: An open source toolkit for diarization, CMU SPUD Workshop, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-01433518
The kaldi speech recognition toolkit, IEEE 2011 workshop on automatic speech recognition and understanding, 2011. ,
Ted-lium 3: twice as much data and corpus repartition for experiments on speaker adaptation, International Conference on Speech and Computer, pp.198-208, 2018. ,
Espnet: End-to-end speech processing toolkit, 2018. ,
A pitch extraction algorithm tuned for automatic speech recognition, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2494-2498, 2014. ,
Audio augmentation for speech recognition, Sixteenth Annual Conference of the International Speech Communication Association, 2015. ,
Very deep convolutional networks for large-scale image recognition, 2014. ,
Neural Machine Translation by Jointly Learning to Align and Translate, ICLR 2015, pp.3104-3112, 2015. ,
The kaldi speech recognition toolkit, IEEE 2011 workshop, 2011. ,
Srilm -an extensible language modeling toolkit, PROCEEDINGS OF THE 7TH IN-TERNATIONAL CONFERENCE ON SPOKEN LAN-GUAGE PROCESSING, pp.901-904, 2002. ,
Attention is all you need, Advances in Neural Information Processing Systems, vol.30, pp.5998-6008, 2017. ,
fairseq: A fast, extensible toolkit for sequence modeling, Proceedings of NAACL-HLT 2019: Demonstrations, 2019. ,
Neural machine translation of rare words with subword units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol.1, pp.1715-1725, 2016. ,