T. Choudhury, B. Clarkson, T. Jebara, and A. Pentland, Multimodal person recognition using unconstrained audio and video, AVBPA, pp.291-300, 1999.

A. Albiol, L. Torres, and E. J. Delp, The indexing of persons in news sequences using audio-visual data, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), pp.6-10, 2003.
DOI : 10.1109/ICASSP.2003.1199126

R. Houghton, Named Faces: putting names to faces, IEEE Intelligent Systems, vol.14, issue.5, pp.45-50, 1999.
DOI : 10.1109/5254.796089

M. Chen and A. Hauptmann, Searching for a specific person in broadcast news video, ICASSP, pp.1036-1039, 2004.

S. Satoh and T. Kanade, Name-It: association of face and name in video, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997.
DOI : 10.1109/CVPR.1997.609351

V. B. Le, C. Barras, and M. Ferras, On the use of GSV-SVM for speaker diarization and tracking, Odyssey, pp.146-161, 2010.

R. Lienhart and W. Effelsberg, Automatic text segmentation and text recognition for video indexing, Multimedia Systems, vol.8, issue.1, pp.69-81, 1998.
DOI : 10.1007/s005300050006

R. Minetto, N. Thome, M. Cord, J. Fabrizio, and B. Marcotegui, Snoopertext: A multiresolution system for text detection in complex visual scenes, 2010 IEEE International Conference on Image Processing, pp.3861-3864, 2010.
DOI : 10.1109/ICIP.2010.5651761
URL : https://hal.archives-ouvertes.fr/hal-00834466

C. Wolf, J. Jolion, and F. Chassaing, Text localization, enhancement and binarization in multimedia documents, International Conference on Pattern Recognition (ICPR), pp.1037-1040, 2002.
DOI : 10.1109/ICPR.2002.1048482

X. Hua, X. R. Chen, L. Wenyin, and H. Zhang, Automatic location of text in video frames, Proceedings of the 2001 ACM workshops on Multimedia: multimedia information retrieval, MULTIMEDIA '01, pp.24-27, 2001.
DOI : 10.1145/500933.500941

M. Cai, J. Song, and M. R. Lyu, A new approach for video text detection, Image Proc, pp.117-120, 2002.

Q. Ye and Q. Huang, A New Text Detection Algorithm in Images/Video Frames, Advances in Multimedia Information Proc. -PCM, pp.858-865, 2005.
DOI : 10.1007/978-3-540-30542-2_106

C. Jung, Q. Liu, and J. Kim, A stroke filter and its application to text localization, Pattern Recognition Letters, vol.30, issue.2, pp.114-122, 2009.
DOI : 10.1016/j.patrec.2008.05.014

M. Anthimopoulos, B. Gatos, and I. Pratikakis, A two-stage scheme for text detection in video images, Image and Vision Computing, vol.28, issue.9, pp.1413-1426, 2010.
DOI : 10.1016/j.imavis.2010.03.004

F. Einsele, R. Ingold, and J. Hennebert, A HMM-Based Approach to Recognize Ultra Low Resolution Anti-Aliased Words, PReMI'07, pp.511-518, 2007.
DOI : 10.1007/978-3-540-77046-6_63

. Zhang, A video text detection and recognition system, p.222, 2001.

J. Sauvola and M. Pietikäinen, Adaptive document image binarization, Pattern Recognition, vol.33, issue.2, pp.225-236, 2000.
DOI : 10.1016/S0031-3203(99)00055-2
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.98.880

G. Quénot, D. Moraru, and L. Besacier, CLIPS at TRECvid: Shot boundary detection and feature detection, Proceedings of TRECVID, pp.35-40, 2003.

B. Gatos, K. Ntirogiannis, and I. Pratikakis, DIBCO 2009: document image binarization contest, International Journal on Document Analysis and Recognition (IJDAR), vol.14, issue.1, pp.35-44, 2011.
DOI : 10.1007/s10032-010-0115-7