L. Bahl, P. Brown, P. de Souza, and R. Mercer, A tree-based statistical language model for natural language speech recognition, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.37, issue.7, pp.1001-1008, 1989.
DOI : 10.1109/29.32278

R. Casey and E. Lecolinet, A survey of methods and strategies in character segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.18, issue.7, pp.690-706, 1996.
DOI : 10.1109/34.506792

D. Chen, J. Odobez, and H. Bourlard, Text detection and recognition in images and video frames, Pattern Recognition, vol.37, issue.3, pp.595-608, 2004.
DOI : 10.1016/j.patcog.2003.06.001

T. Chen, D. Ghosh, and S. Ranganath, Video-text extraction and recognition, IEEE Region 10 Conference, pp.319-322, 2005.

A. Coates, B. Carpenter, C. Case, S. Satheesh, B. Suresh et al., Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, 2011 International Conference on Document Analysis and Recognition, pp.440-445, 2011.
DOI : 10.1109/ICDAR.2011.95

M. Delakis and C. Garcia, Text detection with convolutional neural networks, International Conference on Computer Vision Theory and Applications, pp.290-294, 2008.

A. Dempster, N. Laird, and D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977.

C. Dorai, H. Aradhye, and J. C. Shim, End-to-end video text recognition for multimedia content analysis, International Conference on Multimedia and Expo, pp.601-604, 2001.

H. El Abed and V. Märgner, Comparison of different preprocessing and feature extraction methods for offline recognition of handwritten Arabic words, International Conference on Document Analysis and Recognition, pp.974-978, 2007.

K. Elagouni, C. Garcia, F. Mamalet, and P. Sébillot, Combining Multi-scale Character Recognition and Linguistic Knowledge for Natural Scene Text OCR, 2012 10th IAPR International Workshop on Document Analysis Systems, pp.120-124, 2012.
DOI : 10.1109/DAS.2012.26

URL : https://hal.archives-ouvertes.fr/hal-00753908

K. Elagouni, C. Garcia, and P. Sébillot, A comprehensive neural-based approach for text recognition in videos using natural language processing, Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR '11, 2011.
DOI : 10.1145/1991996.1992019

URL : https://hal.archives-ouvertes.fr/hal-00645219

C. Garcia and M. Delakis, Convolutional face finder: a neural architecture for fast and robust face detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.11, pp.1408-1423, 2004.
DOI : 10.1109/TPAMI.2004.97

M. Halima, H. Karray, and A. Alimi, A Comprehensive Method for Arabic Video Text Detection, Localization, Extraction and Recognition, Advances in Multimedia Information Processing, pp.648-659, 2011.
DOI : 10.1007/978-3-642-15696-0_60

M. Hamdani, H. Abed, M. Kherallah, and A. Alimi, Combining Multiple HMMs Using On-line and Off-line Features for Off-line Arabic Handwriting Recognition, 2009 10th International Conference on Document Analysis and Recognition, pp.201-205, 2009.
DOI : 10.1109/ICDAR.2009.40

X. Hua, P. Yin, and H. Zhang, Efficient video text recognition using multiple frame integration, International Conference on Image Processing, pp.397-400, 2002.

K. Jung, K. I. Kim, and A. K. Jain, Text information extraction in images and video: a survey, Pattern Recognition, vol.37, issue.5, pp.977-997, 2004.
DOI : 10.1016/j.patcog.2003.10.012

S. Kopf, T. Haenselmann, and W. Effelsberg, Robust character recognition in low-resolution images and videos, 2005.

Y. Kusachi, A. Suzuki, N. Ito, and K. Arakawa, Kanji recognition in scene images without detection of text fields - robust against variation of viewpoint, contrast, and background texture, International Conference on Pattern Recognition, pp.457-460, 2004.

Y. Lecun and Y. Bengio, Convolutional networks for images, speech, and time series, The Handbook of Brain Theory and Neural Networks, pp.255-258, 1995.

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

H. Li, D. Doermann, and O. Kia, Automatic text detection and tracking in digital video, IEEE Transactions on Image Processing, vol.9, issue.1, pp.147-156, 2000.

M. Li, M. Bai, C. Wang, and B. Xiao, Conditional random field for text segmentation from images with complex background, Pattern Recognition Letters, vol.31, issue.14, pp.2295-2308, 2010.
DOI : 10.1016/j.patrec.2010.05.031

R. Lienhart and F. Stuber, Automatic text recognition in digital videos, Proc. of SPIE Image and Video Processing IV, pp.180-188, 1996.

J. Lim, J. Park, and G. Medioni, Text segmentation in color images using tensor voting, Image and Vision Computing, vol.25, issue.5, pp.671-685, 2007.
DOI : 10.1016/j.imavis.2006.05.011

C. Mancas-Thillou and B. Gosselin, Character Segmentation-by-Recognition Using Log-Gabor Filters, 18th International Conference on Pattern Recognition (ICPR'06)
DOI : 10.1109/ICPR.2006.362

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.73.1173

G. Miao, G. Zhu, S. Jiang, Q. Huang, C. Xu et al., A Real-Time Score Detection and Recognition Approach for Broadcast Basketball Video, Multimedia and Expo, 2007 IEEE International Conference on, pp.1691-1694, 2007.
DOI : 10.1109/ICME.2007.4284994

A. Mishra, K. Alahari, and C. Jawahar, An MRF Model for Binarization of Natural Scene Text, 2011 International Conference on Document Analysis and Recognition, pp.11-16, 2011.
DOI : 10.1109/ICDAR.2011.12

URL : https://hal.archives-ouvertes.fr/hal-00817972

K. Negishi, M. Iwamura, S. Omachi, and H. Aso, Isolated character recognition by searching features in scene images, International Workshop on Camera-Based Document Analysis and Recognition, pp.140-147, 2005.

K. Ntirogiannis, B. Gatos, and I. Pratikakis, Binarization of Textual Content in Video Frames, 2011 International Conference on Document Analysis and Recognition, pp.673-677, 2011.
DOI : 10.1109/ICDAR.2011.141

T. Phan, P. Shivakumara, B. Su, and C. Tan, A Gradient Vector Flow-Based Method for Video Character Segmentation, 2011 International Conference on Document Analysis and Recognition, pp.1024-1028, 2011.
DOI : 10.1109/ICDAR.2011.207

Z. Saïdane and C. Garcia, Automatic scene text recognition using a convolutional neural network, Conference on Computer Vision and Pattern Recognition, pp.100-106, 2007.

Z. Saïdane and C. Garcia, Robust binarization for video text recognition, International Conference on Document Analysis and Recognition, pp.874-879, 2007.

Z. Saïdane, C. Garcia, and J. Dugelay, The image text recognition graph (iTRG), International Conference on Multimedia and Expo, pp.266-269, 2009.

T. Sato, T. Kanade, E. Hughes, M. Smith, and S. Satoh, Video OCR: indexing digital news libraries by recognition of superimposed captions, Multimedia Systems, vol.7, issue.5, pp.385-395, 1999.
DOI : 10.1007/s005300050140

N. Sharma, U. Pal, and M. Blumenstein, Recent Advances in Video Based Document Processing: A Review, 2012 10th IAPR International Workshop on Document Analysis Systems, pp.63-68, 2012.
DOI : 10.1109/DAS.2012.72

P. Shivakumara, S. Bhowmick, B. Su, C. Tan, and U. Pal, A New Gradient Based Character Segmentation Method for Video Text Recognition, 2011 International Conference on Document Analysis and Recognition, pp.126-130, 2011.
DOI : 10.1109/ICDAR.2011.34

P. Y. Simard, D. Steinkraus, and J. C. Platt, Best practices for convolutional neural networks applied to visual document analysis, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings., pp.958-963, 2003.
DOI : 10.1109/ICDAR.2003.1227801

T. Som, D. Can, and M. Saraclar, HMM-based sliding video text recognition for Turkish broadcast news, 2009 24th International Symposium on Computer and Information Sciences, pp.475-479, 2009.
DOI : 10.1109/ISCIS.2009.5291877

A. Stolcke, SRILM - an extensible language modeling toolkit, International Conference on Spoken Language Processing, pp.901-904, 2002.

T. Wakahara and K. Kita, Binarization of Color Character Strings in Scene Images Using K-Means Clustering and Support Vector Machines, 2011 International Conference on Document Analysis and Recognition, pp.274-278, 2011.
DOI : 10.1109/ICDAR.2011.63

K. Wang and S. Belongie, Word Spotting in the Wild, European Conference on Computer Vision, pp.591-604, 2010.
DOI : 10.1007/978-3-642-15549-9_43

J. Weinman, E. Learned-Miller, and A. Hanson, Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.10, pp.1733-1746, 2009.
DOI : 10.1109/TPAMI.2009.38

R. Yager, Connectives and quantifiers in fuzzy sets, Fuzzy Sets and Systems, vol.40, issue.1, pp.39-75, 1991.
DOI : 10.1016/0165-0114(91)90046-S

T. Yamazoe, M. Etoh, T. Yoshimura, and K. Tsujino, Hypothesis Preservation Approach to Scene Text Recognition with Weighted Finite-State Transducer, 2011 International Conference on Document Analysis and Recognition, pp.359-363, 2011.
DOI : 10.1109/ICDAR.2011.80

Q. Ye, Q. Huang, W. Gao, and D. Zhao, Fast and robust text detection in images and video frames, Image and Vision Computing, vol.23, issue.6, pp.565-576, 2005.
DOI : 10.1016/j.imavis.2005.01.004

J. Yi, Y. Peng, and J. Xiao, Using Multiple Frame Integration for the Text Recognition of Video, 2009 10th International Conference on Document Analysis and Recognition, pp.71-75, 2009.
DOI : 10.1109/ICDAR.2009.58

M. Yokobayashi and T. Wakahara, Segmentation and recognition of characters in scene images using selective binarization in color space and GAT correlation, Eighth International Conference on Document Analysis and Recognition (ICDAR'05), pp.167-171, 2005.
DOI : 10.1109/ICDAR.2005.208

D. Zhang and S. Chang, A Bayesian framework for fusing multiple word knowledge models in videotext recognition, Conference on Computer Vision and Pattern Recognition, pp.528-533, 2003.

Z. Zhou, L. Li, and C. Tan, Edge based binarization for video text images, International Conference on Pattern Recognition, pp.133-136, 2010.