L. Duong, A. Anastasopoulos, D. Chiang, S. Bird, and T. Cohn, An Attentional Model for Speech Translation Without Transcription, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016.
DOI : 10.18653/v1/N16-1109

A. Bérard, O. Pietquin, C. Servan, and L. Besacier, Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation, NIPS 2016 End-to-end Learning for Speech and Audio Processing Workshop, 2016.

R. J. Weiss, J. Chorowski, N. Jaitly, Y. Wu, and Z. Chen, Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech, Interspeech, 2017.

V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, Librispeech: An ASR corpus based on public domain audio books, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
DOI : 10.1109/ICASSP.2015.7178964

URL : http://www.clsp.jhu.edu/%7Eguoguo/papers/icassp2015_librispeech.pdf

A. C. Kocabiyikoglu, L. Besacier, and O. Kraif, Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation, LREC, 2018.

G. Adda, S. Stüker, M. Adda-Decker, O. Ambouroue, L. Besacier et al., Breaking the Unwritten Language Barrier: The BULB Project, Proceedings of SLTU (Spoken Language Technologies for Under-Resourced Languages), 2016.
DOI : 10.1016/j.procs.2016.04.023

URL : https://hal.archives-ouvertes.fr/halshs-01428027

A. Anastasopoulos and D. Chiang, A case study on using speech-to-translation alignments for language documentation, Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages, 2017.
DOI : 10.18653/v1/W17-0123

M. Post, G. Kumar, A. Lopez, D. Karakos, C. Callison-Burch et al., Improved Speech-to-Text Translation with the Fisher and Callhome Spanish-English Speech Translation Corpus, IWSLT, 2013.

D. Bahdanau, K. Cho, and Y. Bengio, Neural Machine Translation by Jointly Learning to Align and Translate, ICLR, 2015.

J. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, Attention-Based Models for Speech Recognition, NIPS, 2015.

R. Sennrich, O. Firat, K. Cho, A. Birch, B. Haddow et al., Nematus: a Toolkit for Neural Machine Translation, Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017.
DOI : 10.18653/v1/E17-3017

B. Mathieu, S. Essid, T. Fillon, J. Prado, and G. Richard, YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software, ISMIR (International Society of Music Information Retrieval), 2010.

R. Sennrich, B. Haddow, and A. Birch, Neural Machine Translation of Rare Words with Subword Units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016.
DOI : 10.18653/v1/P16-1162

D. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, ICLR, 2015.

D. P. Kingma, T. Salimans, and M. Welling, Variational dropout and the local reparameterization trick, NIPS, 2015.

W. Zaremba, I. Sutskever, and O. Vinyals, Recurrent Neural Network Regularization, ICLR, 2014.