F. Bechet, M. Bendris, D. Charlet, G. Damnati, B. Favre et al., Multimodal Understanding for Person Recognition in Video Broadcasts, INTERSPEECH, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01194244

M. Bendris, B. Favre, D. Charlet, G. Damnati, R. Auguste et al., Unsupervised face identification in TV content using audio-visual sources, 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI), 2013.
DOI : 10.1109/CBMI.2013.6576591
URL : https://hal.archives-ouvertes.fr/hal-00812334

G. Bernard, O. Galibert, and J. Kahn, The First Official REPERE Evaluation, SLAM-INTERSPEECH, 2013.

H. Bredin, A. Laurent, A. Sarkar, V. Le, S. Rosset et al., Person Instance Graphs for Named Speaker Identification in TV Broadcast, 2014.

H. Bredin and J. Poignant, Integer Linear Programming for Speaker Diarization and Cross-Modal Identification in TV Broadcast, INTERSPEECH, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00953095

H. Bredin, J. Poignant, G. Fortier, M. Tapaswi, V. Le et al., QCompere at REPERE 2013, SLAM-INTERSPEECH, 2013.

H. Bredin, A. Roy, V. Le, and C. Barras, Person instance graphs for mono-, cross- and multi-modal person recognition in multimedia data: application to speaker identification in TV broadcast, IJMIR, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01690350

L. Canseco, L. Lamel, and J. Gauvain, A comparative study using manual and automatic transcriptions for diarization, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
DOI : 10.1109/ASRU.2005.1566507
URL : https://www.lrde.epita.fr/~reda/cours/speech/speakerDiarization/1566507.pdf

L. Canseco-Rodriguez, L. Lamel, and J. Gauvain, Speaker diarization from speech transcripts, INTERSPEECH, 2004.

S. Chen and P. Gopalakrishnan, Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion, DARPA Broadcast News Transcription and Understanding Workshop, 1998.

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
DOI : 10.1109/CVPR.2005.177
URL : https://hal.archives-ouvertes.fr/inria-00548512

M. Dinarelli and S. Rosset, Models Cascade for Tree-Structured Named Entity Detection, IJCNLP, 2011.

Y. Estève, S. Meignier, P. Deléglise, and J. Mauclair, Extracting true speaker identities from transcriptions, INTERSPEECH, 2007.

B. Favre, G. Damnati, F. Béchet, M. Bendris, D. Charlet et al., PERCOLI: a person identification system for the 2013 REPERE challenge, SLAM-INTERSPEECH, 2013.

P. Gay, G. Dupuy, C. Lailler, J. Odobez, S. Meignier et al., Comparison of two methods for unsupervised person identification in TV shows, 2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI), 2014.
DOI : 10.1109/CBMI.2014.6849828
URL : https://hal.archives-ouvertes.fr/hal-01433260

A. Giraudel, M. Carré, V. Mapelli, J. Kahn, O. Galibert et al., The REPERE Corpus: a Multimodal Corpus for Person Recognition, LREC, 2012.

M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid, Face Recognition from Caption-Based Supervision, International Journal of Computer Vision, vol.96, issue.1, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00585834

R. Houghton, Named Faces: putting names to faces, IEEE Intelligent Systems, vol.14, issue.5, 1999.
DOI : 10.1109/5254.796089

V. Jousse, S. Petit-Renaud, S. Meignier, Y. Estève, and C. Jacquin, Automatic named identification of speakers using diarization and ASR systems, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009.
DOI : 10.1109/ICASSP.2009.4960644
URL : https://hal.archives-ouvertes.fr/hal-00412431

J. Kahn, O. Galibert, L. Quintard, M. Carré, A. Giraudel et al., A presentation of the REPERE challenge, 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI), 2012.
DOI : 10.1109/CBMI.2012.6269851

J. Mauclair, S. Meignier, and Y. Estève, Speaker Diarization: About whom the Speaker is Talking?, 2006 IEEE Odyssey, The Speaker and Language Recognition Workshop, 2006.
DOI : 10.1109/ODYSSEY.2006.248114
URL : https://hal.archives-ouvertes.fr/hal-01434121

J. Poignant, L. Besacier, and G. Quénot, Unsupervised Speaker Identification in TV Broadcast Based on Written Names, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.1, 2015.
DOI : 10.1109/TASLP.2014.2367822
URL : https://hal.archives-ouvertes.fr/hal-01060827

J. Poignant, L. Besacier, G. Quénot, and F. Thollard, From Text Detection in Videos to Person Identification, 2012 IEEE International Conference on Multimedia and Expo, 2012.
DOI : 10.1109/ICME.2012.119
URL : https://hal.archives-ouvertes.fr/hal-00767383

J. Poignant, H. Bredin, L. Besacier, G. Quénot, and C. Barras, Towards a better integration of written names for unsupervised speakers identification in videos, SLAM-INTERSPEECH, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00953089

J. Poignant, H. Bredin, V. Le, L. Besacier, C. Barras et al., Unsupervised speaker identification using overlaid texts in TV broadcast, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00767427

J. Poignant, G. Fortier, L. Besacier, and G. Quénot, Naming multi-modal clusters to identify persons in TV broadcast, Multimedia Tools and Applications, vol.6, issue.3, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01230628

M. Rouvier, B. Favre, M. Bendris, D. Charlet, and G. Damnati, Scene understanding for identifying persons in TV shows: Beyond face authentication, 2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI), 2014.
DOI : 10.1109/CBMI.2014.6849829
URL : https://hal.archives-ouvertes.fr/hal-01194242

S. Satoh, Y. Nakamura, and T. Kanade, Name-It: naming and detecting faces in news videos, IEEE Multimedia, vol.6, issue.1, 1999.
DOI : 10.1109/93.752960
URL : http://www.ri.cmu.edu/pub_files/pub2/satoh_s_1999_1/satoh_s_1999_1.pdf

M. Uřičář, V. Franc, and V. Hlaváč, Facial Landmarks Detector Learned by the Structured Output SVM, VISAPP, 2012.
DOI : 10.1007/978-3-642-38241-3_26

J. Yang and A. G. Hauptmann, Naming every individual in news video monologues, Proceedings of the 12th annual ACM international conference on Multimedia, MULTIMEDIA '04, 2004.
DOI : 10.1145/1027527.1027666
URL : http://www.cs.cmu.edu/~juny/Prof/papers/acmmm04a-jyang.pdf

J. Yang, R. Yan, and A. G. Hauptmann, Multiple instance learning for labeling faces in broadcasting news video, Proceedings of the 13th annual ACM international conference on Multimedia, MULTIMEDIA '05, 2005.
DOI : 10.1145/1101149.1101155