A. F. Smeaton, P. Over, and W. Kraaij, Evaluation campaigns and TRECVid, Proceedings of the 8th ACM international workshop on Multimedia information retrieval, MIR '06, pp.321-330, 2006.
DOI : 10.1145/1178677.1178722

G. Awad, A. Butt, J. Fiscus, D. Joy, A. Delgado et al., TRECVID 2017: Evaluating ad-hoc and instance video search, events detection, video captioning and hyperlinking, Proceedings of TRECVID 2017, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01854790

G. Awad, W. Kraaij, P. Over, and S. Satoh, Instance search retrospective with focus on TRECVID, International Journal of Multimedia Information Retrieval, vol.17, issue.8, pp.1-29, 2017.
DOI : 10.1145/1132956.1132959

H. Bredin and G. Gelly, Improving Speaker Diarization of TV Series using Talking-Face Detection and Clustering, Proceedings of the 2016 ACM on Multimedia Conference, MM '16, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01836453

Y. Yusoff, W. Christmas, and J. Kittler, A study on automatic shot change detection, Multimedia Applications, Services and Techniques, pp.177-189, 1998.
DOI : 10.1007/3-540-64594-2_94

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.886-893, 2005.
DOI : 10.1109/CVPR.2005.177
URL : https://hal.archives-ouvertes.fr/inria-00548512

M. Danelljan, G. Häger, F. Shahbaz Khan, and M. Felsberg, Accurate Scale Estimation for Robust Visual Tracking, Proceedings of the British Machine Vision Conference 2014, 2014.
DOI : 10.5244/C.28.65

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.90

D. E. King, Dlib-ml: A Machine Learning Toolkit, Journal of Machine Learning Research, vol.10, pp.1755-1758, 2009.

H. Ng and S. Winkler, A data-driven approach to cleaning large face datasets, 2014 IEEE International Conference on Image Processing (ICIP), pp.343-347, 2014.
DOI : 10.1109/ICIP.2014.7025068

O. M. Parkhi, A. Vedaldi, and A. Zisserman, Deep Face Recognition, Proceedings of the British Machine Vision Conference 2015, 2015.
DOI : 10.5244/C.29.41

H. Bredin, pyannote-video: Face Detection, Tracking and Clustering in Videos.

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, pp.511-518, 2001.
DOI : 10.1109/CVPR.2001.990517

P. D. Vo, A. L. Ginsca, H. Le Borgne, and A. Popescu, Harnessing noisy Web images for deep representation, Computer Vision and Image Understanding, vol.164, 2017.
DOI : 10.1016/j.cviu.2017.01.009
URL : https://hal.archives-ouvertes.fr/cea-01756775

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv:1409.1556, 2014.

Y. Tamaazousti, H. Le Borgne, A. Popescu, E. Gadeski, A. L. Ginsca et al., Vision-language integration using constrained local semantic features, Computer Vision and Image Understanding, 2017.
URL : https://hal.archives-ouvertes.fr/cea-01803830

B. , IRIM at TRECVID 2016: Instance search, Proceedings of TRECVID 2016, 2016.

K. Mikolajczyk and C. Schmid, Scale & Affine Invariant Interest Point Detectors, International Journal of Computer Vision, vol.60, issue.1, pp.63-86, 2004.
DOI : 10.1023/B:VISI.0000027790.02288.f2
URL : https://hal.archives-ouvertes.fr/inria-00548554

R. Arandjelović and A. Zisserman, Three things everyone should know to improve object retrieval, IEEE Conference on Computer Vision and Pattern Recognition, 2012.

J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383172

G. Salton and M. J. McGill, Introduction to modern information retrieval, 1986.

C. Zhu, H. Jegou, and S. I. Satoh, Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.214
URL : https://hal.archives-ouvertes.fr/hal-00872957

X. Zhou, C. Zhu, Q. Zhu, S. Satoh, and Y. Guo, A practical spatial re-ranking method for instance search from videos, 2014 IEEE International Conference on Image Processing (ICIP), pp.3008-3012, 2014.

B. Zhou, A. Lapedriza, A. Khosla, A. Oliva, and A. Torralba, Places: A 10 Million Image Database for Scene Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.40, issue.6, 2017.
DOI : 10.1109/TPAMI.2017.2723009

B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, Learning Deep Features for Scene Recognition using Places Database, Advances in Neural Information Processing Systems, pp.487-495, 2014.

M. Tapaswi, M. Bauml, and R. Stiefelhagen, StoryGraphs: Visualizing Character Interactions as a Timeline, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.827-834, 2014.
DOI : 10.1109/CVPR.2014.111