P. Clément, T. Bazillon, and C. Fredouille, Speaker diarization of heterogeneous web video files: A preliminary study, Acoustics, Speech and Signal Processing (ICASSP), pp.4432-4435, 2011.

G. Friedland, H. Hung, and C. Yeo, Multi-modal speaker diarization of real-world meetings using compressed-domain video features, Acoustics, Speech and Signal Processing, pp.4069-4072, 2009.

J. Carletta, S. Ashby, S. Bourban, M. Flynn, M. Guillemot et al., The AMI meeting corpus: A pre-announcement, Proceedings of the Second International Conference on Machine Learning for Multimodal Interaction, pp.28-39, 2006.

V. Tran, V. Le, C. Barras, and L. Lamel, Comparing multi-stage approaches for cross-show speaker diarization, INTERSPEECH, pp.1053-1056, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01690265

M. Bendris, B. Favre, D. Charlet, G. Damnati, G. Senay et al., Unsupervised face identification in TV content using audio-visual sources, Content-Based Multimedia Indexing (CBMI), pp.243-249, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00812334

H. Bredin, Segmentation of TV shows into scenes using speaker diarization and speech recognition, Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp.2377-2380, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01987818

I. Koprinska and S. Carrato, Temporal video segmentation: A survey, Signal processing: Image communication, vol.16, issue.5, pp.477-500, 2001.

N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, Front-end factor analysis for speaker verification, Audio, Speech, and Language Processing, vol.19, pp.788-798, 2011.

P. Bousquet, D. Matrouf, and J. Bonastre, Intersession compensation and scoring methods in the i-vectors space for speaker recognition, INTERSPEECH, pp.485-488, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01313266

P. J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, Journal of computational and applied mathematics, vol.20, issue.3, pp.53-65, 1987.

J. S. Boreczky and L. A. Rowe, Comparison of video shot boundary detection techniques, Journal of Electronic Imaging, vol.5, issue.2, pp.122-128, 1996.

M. Rouvier, G. Dupuy, P. Gay, E. Khoury, T. Merlin et al., An open-source state-of-the-art toolbox for broadcast news diarization, INTERSPEECH, vol.5, p.6, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01433449

S. Bozonnet, N. W. D. Evans, and C. Fredouille, The LIA-EURECOM RT'09 speaker diarization system: enhancements in speaker modelling and cluster purification, Acoustics Speech and Signal Processing (ICASSP), pp.4958-4961, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00601383

S. Meignier and T. Merlin, LIUM SpkDiarization: an open source toolkit for diarization, CMU SPUD Workshop, vol.2010, p.6, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01433518