A. Asadi, R. Schwartz, and J. Makhoul, Automatic detection of new words in a large vocabulary continuous speech recognition system, Proc. of International Conference on Acoustics, Speech and Signal Processing, 1990.

S. R. Young, Recognition confidence measures: Detection of misrecognitions and out-of-vocabulary words, Proc. of International Conference on Acoustics, Speech and Signal Processing, pp.21-24, 1994.
DOI : 10.21236/ADA281254

B. Lecouteux, G. Linarès, and B. Favre, Combined low level and high level features for out-of-vocabulary word detection, 2009.
URL : https://hal.archives-ouvertes.fr/hal-01194283

M. Negri, M. Turchi, J. G. C. de Souza, and D. Falavigna, Quality estimation for automatic speech recognition, COLING, pp.1813-1823, 2014.

S. Jalalvand, M. Negri, M. Turchi, J. G. C. de Souza, D. Falavigna et al., TranscRater: a Tool for Automatic Speech Recognition Quality Estimation, Proceedings of ACL-2016 System Demonstrations, pp.43-48, 2016.
DOI : 10.18653/v1/P16-4008

K. J. Piczak, Environmental sound classification with convolutional neural networks, Machine Learning for Signal Processing (MLSP), 2015.

T. N. Sainath, R. J. Weiss, A. Senior, K. W. Wilson, and O. Vinyals, Learning the speech front-end with raw waveform CLDNNs, Sixteenth Annual Conference of the International Speech Communication Association, 2015.

M. Jin, Y. Song, I. McLoughlin, L. Dai, and Z. Ye, LID-senone Extraction via Deep Neural Networks for End-to-End Language Identification, Odyssey 2016, 2016.
DOI : 10.21437/Odyssey.2016-30

D. Palaz, M. Magimai-Doss, and R. Collobert, Convolutional Neural Networks-based continuous speech recognition using raw speech signal, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4295-4299, 2015.
DOI : 10.1109/ICASSP.2015.7178781
URL : http://ronan.collobert.com/pub/matos/2015_rawspeech_icassp.pdf

W. Dai, C. Dai, S. Qu, J. Li, and S. Das, Very deep convolutional neural networks for raw waveforms, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
DOI : 10.1109/ICASSP.2017.7952190
URL : http://arxiv.org/pdf/1610.00087

G. Gravier, G. Adda, N. Paulson, M. Carré, A. Giraudel et al., The ETAPE corpus for the evaluation of speech-based TV content processing in the French language, LREC: Eighth International Conference on Language Resources and Evaluation, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00712591

S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J. Bonastre et al., The ESTER Phase II evaluation campaign for the rich transcription of French broadcast news, Interspeech, pp.1149-1152, 2005.

J. Kahn, O. Galibert, L. Quintard, M. Carré, A. Giraudel et al., A presentation of the REPERE challenge, 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI), pp.1-6, 2012.
DOI : 10.1109/CBMI.2012.6269851

D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek et al., The Kaldi speech recognition toolkit, IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011.

A. Stolcke, SRILM - an extensible language modeling toolkit, Interspeech, 2002.

M. de Calmès and G. Pérennou, BDLEX: a lexicon for spoken and written French, Proceedings of the 1st International Conference on Language Resources & Evaluation, pp.1129-1136, 1998.

O. Galibert, Methodologies for the evaluation of speaker diarization and automatic speech recognition in the presence of overlapping speech, Interspeech, pp.1131-1134, 2013.

H. Schmid, TreeTagger - a language independent part-of-speech tagger, p.28, 1995.

F. Eyben, M. Wöllmer, and B. Schuller, openSMILE, Proceedings of the international conference on Multimedia, MM '10, pp.1459-1462, 2010.
DOI : 10.1145/1873951.1874246

F. Chollet, Keras, 2015.

Y. Kim, Convolutional Neural Networks for Sentence Classification, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
DOI : 10.3115/v1/D14-1181
URL : http://arxiv.org/pdf/1408.5882

T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, Distributed representations of words and phrases and their compositionality, NIPS, 2013.

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu et al., Natural language processing (almost) from scratch, Journal of Machine Learning Research, vol.12, pp.2493-2537, 2011.

B. McFee, C. Raffel, D. Liang, D. P. W. Ellis, M. McVicar et al., librosa: Audio and music signal analysis in Python, 2015.

M. D. Zeiler, ADADELTA: an adaptive learning rate method, 2012.