Multistage speaker diarization of broadcast news, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.5, pp.1505-1512, 2006. ,
DOI : 10.1109/TASL.2006.878261
URL : https://hal.archives-ouvertes.fr/hal-01434241
Semisupervised Learning with Constraints for Person Identification in Multimedia Data, International Conference on Computer Vision and Pattern Recognition (CVPR), 2013. ,
Random Search for Hyper- Parameter Optimization, J. Mach. Learn. Res, vol.13, pp.281-305, 2012. ,
Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment, vol.2008, issue.10, 2008. ,
DOI : 10.1088/1742-5468/2008/10/P10008
URL : https://hal.archives-ouvertes.fr/hal-01146070
Audio-Visual Speech Synchrony Measure: Application to, Special Issue on Knowledge-Assisted Media Analysis for Interactive Multimedia Applications, 2007. ,
DOI : 10.1155/2007/70186
URL : https://doi.org/10.1155/2007/70186
Integer Linear Programming for Speaker Diarization and Cross-Modal Identification in TV Broadcast, Interspeech 2013, 14th Annual Conference of the International Speech Communication Association, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00953095
A comparative study using manual and automatic transcriptions for diarization, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005., pp.415-419, 2005. ,
DOI : 10.1109/ASRU.2005.1566507
URL : https://www.lrde.epita.fr/~reda/cours/speech/speakerDiarization/1566507.pdf
Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion. In: DARPA Broadcast News Transcription and Understanding Workshop, 1998. ,
Talking pictures: Temporal grouping and dialog-supervised person recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010. ,
DOI : 10.1109/CVPR.2010.5540106
URL : http://www.seas.upenn.edu/%7Etimothee/papers/cvpr_2010.pdf
Applications of video-content analysis and retrieval, IEEE Multimedia, vol.9, issue.3, pp.42-55, 2002. ,
DOI : 10.1109/MMUL.2002.1022858
Models Cascade for Tree- Structured Named Entity Detection Asian Federation of Natural Language Processing, Proceedings of 5th International Joint Conference on Natural Language Processing, pp.1269-1278, 2011. ,
i- Vectors and ILP Clustering Adapted to Cross-Show Speaker Diarization, Interspeech 2012, 13th Annual Conference of the International Speech Communication Association, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-01450711
Extracting true speaker identities from transcriptions, Proceedings of Interspeech, pp.2601-2604, 2007. ,
Enforcing transitivity in coreference resolution, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies Short Papers, HLT '08, 2008. ,
DOI : 10.3115/1557690.1557703
URL : http://nlp.stanford.edu/cmanning/papers/acl08_coref_ilp_final.pdf
Results of the Fall 2004 STT and MDE Evaluation, Rich Transcription Workshop, 2004. ,
Partitioning and Transcription of Broadcast News Data, Proceedings of International Conference on Spoken Language Processing (ICSLP 98), pp.1335-1338, 1998. ,
The LIMSI Broadcast News transcription system, Speech Communication, vol.37, issue.1-2, pp.89-109, 2002. ,
DOI : 10.1016/S0167-6393(01)00061-9
URL : https://hal.archives-ouvertes.fr/hal-01434493
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.291-298, 1994. ,
DOI : 10.1109/89.279278
The REPERE Corpus: a Multimodal Corpus for Person Recognition, International Conference on Language Resources and Evaluation (LREC), 2012. ,
The ETAPE Corpus for the Evaluation of Speech-based TV Content processing in the French language, International Conference on Language Resources , Evaluation and Corpora. Turkey, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00712591
Gurobi Optimizer Reference Manual, 2012. ,
Perceptual linear predictive (PLP) analysis of speech, The Journal of the Acoustical Society of America, vol.87, issue.4, pp.1738-1752, 1990. ,
DOI : 10.1121/1.399423
Data clustering: a review, ACM Computing Surveys, vol.31, issue.3, pp.264-323, 1999. ,
DOI : 10.1145/331499.331504
Automatic named identification of speakers using diarization and ASR systems, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009. ,
DOI : 10.1109/ICASSP.2009.4960644
URL : https://hal.archives-ouvertes.fr/hal-00412431
A Scalable Video Search Engine Based on Audio Content Indexing and Topic Segmentation, Networked and Electronic Media (NEM) Summit : Implementing Future Media Internet, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00645228
On the use of GSV-SVM for Speaker Diarization and Tracking, Proceedings of Odyssey 2010 -The Speaker and Language Recognition Workshop, pp.146-150, 2010. ,
Speaker Diarization: About whom the Speaker is Talking ?, 2006 IEEE Odyssey, The Speaker and Language Recognition Workshop, 2006. ,
DOI : 10.1109/ODYSSEY.2006.248114
URL : https://hal.archives-ouvertes.fr/hal-01434121
On a Strategy for Spectral Clustering with Parallel Computation . High Performance Computing for Computational Science?VECPAR, pp.408-420, 2010. ,
Modularity and community structure in networks, Proceedings of the National Academy of Sciences, vol.68, issue.6804, pp.8577-8582, 2006. ,
DOI : 10.1073/pnas.021544898
URL : http://www.pnas.org/content/103/23/8577.full.pdf
MMSS: Multi-modal Story-oriented Video Summarization, Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM), 2004. ,
Automatic multimedia cross-modal correlation discovery, Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '04, 2004. ,
DOI : 10.1145/1014052.1014135
URL : http://www.cs.bilkent.edu.tr/%7Eduygulu/papers/KDD2004.pdf
Feature Warping for Robust Speaker Verification, Proceedings of Odyssey 2001 - The Speaker Recognition Workshop, pp.213-218, 2001. ,
X-means: Extending K-means with Efficient Estimation of the Number of Clusters, Proceedings of the Seventeenth International Conference on Machine Learning, ICML '00, pp.727-734 ,
Unsupervised Naming of Speakers in Broadcast TV: using Written Names, Pronounced Names or Both? In: Interspeech 2013, 14th Annual Conference of the International Speech Communication Association, 2013. ,
From Text Detection in Videos to Person Identification, 2012 IEEE International Conference on Multimedia and Expo, 2012. ,
DOI : 10.1109/ICME.2012.119
URL : https://hal.archives-ouvertes.fr/hal-00767383
Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast, Interspeech 2012, 13th Annual Conference of the International Speech Communication Association. Portland, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00767427
Speaker Verification Using Adapted Gaussian Mixture Models, Digital Signal Processing, vol.10, issue.1-3, pp.1-3, 2000. ,
DOI : 10.1006/dspr.1999.0361
URL : http://www.cse.ohio-state.edu/~dwang/teaching/cse788/papers/Reynolds-dsp00.pdf
Content-based image retrieval at the end of the early years, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.12, pp.1349-1380, 2000. ,
DOI : 10.1109/34.895972
An Overview of the Tesseract OCR Engine, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2, pp.629-633, 2007. ,
DOI : 10.1109/ICDAR.2007.4376991
Who Really Spoke When? Finding Speaker Turns and Identities in Broadcast News Audio, 2006 IEEE International Conference on Acoustics Speed and Signal Processing Proceedings, pp.1013-1016, 2006. ,
DOI : 10.1109/ICASSP.2006.1660195
URL : http://mi.eng.cam.ac.uk/reports/svr-ftp/tranter_icassp06.pdf
Multimedia content analysis-using both audio and visual clues, IEEE Signal Processing Magazine, vol.17, issue.6, pp.12-36, 2000. ,
DOI : 10.1109/79.888862