Deep Speech 2: End-to-end speech recognition in English and Mandarin, 2015.
The PyTorch-Kaldi speech recognition toolkit, 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.6465-6469, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02107617
ESPnet: End-to-end speech processing toolkit, 2018.
Jasper: An end-to-end convolutional neural acoustic model, 2019.
The Kaldi speech recognition toolkit, IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, p.11, 2011.
Improving speech recognition by revising gated recurrent units, Proc. Interspeech, pp.1308-1312, 2017.
Towards end-to-end speech recognition with recurrent neural networks, International Conference on Machine Learning, pp.1764-1772, 2014.
Towards end-to-end speech recognition with deep convolutional neural networks, 2017.
Joint CTC-attention based end-to-end speech recognition using multi-task learning, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp.4835-4839, 2017.
End-to-end phoneme sequence recognition using convolutional neural networks, 2013.
Acoustic modeling with deep neural networks using raw time signal for LVCSR, Fifteenth Annual Conference of the International Speech Communication Association, pp.890-894, 2014.
Speech acoustic modeling from raw multichannel waveforms, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4624-4628, 2015.
End-to-end speech recognition from the raw waveform, pp.781-785, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01888739
Speaker recognition from raw waveform with SincNet, Proc. of Spoken Language Technology Workshop (SLT), pp.1021-1028, 2018.
Speech and speaker recognition from raw waveform with SincNet, 2018.
On learning interpretable CNNs with parametric modulated kernel-based filters, Proc. of Interspeech, ISCA, pp.3480-3484, 2019.
Theory of communication, Journal of the Institution of Electrical Engineers, vol.93, pp.429-457, 1946.
Improving speech recognition with drop-in replacements for f-bank features, Proc. of SLSP, pp.210-222, 2019.
Learning filterbanks from raw speech for phoneme recognition, Proc. of ICASSP, pp.5509-5513, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01888737
Deep complex networks, Proc. of ICLR 2018, 2018.
Phase-aware speech enhancement with deep complex U-Net, Proc. of ICLR 2019, 2019.
A Wavelet Tour of Signal Processing: The Sparse Way, 2008.
Geometric Computing with Clifford Algebras, 2001.
DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1, NASA STI/Recon technical report, vol.93, 1993.
Rectified linear units improve restricted Boltzmann machines, Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp.807-814, 2010.
Regularization of context-dependent deep neural networks with context-independent multi-task training, Proc. of ICASSP, pp.4290-4294, 2015.