An overview of automatic speaker diarization systems, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.5, pp.1557-1565, 2006. ,
DOI : 10.1109/TASL.2006.878256
Nuts and Flakes: a Study of Data Characteristics in Speaker Diarization, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2006. ,
DOI : 10.1109/ICASSP.2006.1660196
Robust Speaker Diarization for Meetings, 2006. ,
Computationally Efficient and Robust BIC-Based Speaker Segmentation, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.5, 2008. ,
DOI : 10.1109/TASL.2008.925152
URL : http://spiral.imperial.ac.uk/bitstream/10044/1/11710/2/IEEE_TRANS_ASLP_2008_Margarita_Kotti.pdf
Multi-stage Speaker Diarization for Conference and Lecture Meetings, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT Revised Selected Papers, pp.533-542, 2007. ,
Speaker diarization using autoassociative neural networks, Engineering Applications of Artificial Intelligence, vol.22, issue.4-5, 2009. ,
DOI : 10.1016/j.engappai.2009.01.012
Robust Speaker Diarization for Meetings: ICSI RT06S Meetings Evaluation System, Proc. ICSLP, 2006. ,
DOI : 10.1007/11965152_31
The ICSI RT07s Speaker Diarization System, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR Revised Selected Papers, pp.509-519, 2007. ,
Fast Incremental Clustering of Gaussian Mixture Speaker Models for Scaling up Retrieval In On-Line Broadcast, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2006. ,
DOI : 10.1109/ICASSP.2006.1661327
URL : https://hal.archives-ouvertes.fr/hal-00448172
Speaker clustering of speech utterances using a voice characteristic reference space, Proc. ICSLP, 2004. ,
T-test distance and clustering criterion for speaker diarization, Proc. Interspeech, 2008. ,
The IIR-NTU Speaker Diarization Systems for RT, RT'09, NIST Rich Transcription Workshop, 2009. ,
E-HMM approach for learning and adapting sound models for speaker indexing, Proc. Odyssey Speaker and Language Recognition Workshop, pp.175-180, 2001. ,
URL : https://hal.archives-ouvertes.fr/hal-01434656
The LIA RT'07 speaker diarization system, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR Revised Selected Papers, pp.520-532, 2007. ,
The LIA-EURECOM RT'09 Speaker Diarization System, RT'09, NIST Rich Transcription Workshop, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00601383
The LIA-EURECOM RT'09 speaker diarization system: Enhancements in speaker modelling and cluster purification, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010. ,
DOI : 10.1109/ICASSP.2010.5495088
URL : https://hal.archives-ouvertes.fr/hal-00601383
Agglomerative information bottleneck for speaker diarization of meetings data, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pp.250-255, 2007. ,
DOI : 10.1109/ASRU.2007.4430119
Estimating normal means with a conjugate style Dirichlet process prior, Communications in Statistics: Simulation and Computation, pp.727-741, 1994. ,
DOI : 10.1017/CBO9780511526237
Keeping the neural networks simple by minimizing the description length of the weights, Proceedings of the sixth annual conference on Computational learning theory, COLT '93, pp.5-13, 1993. ,
DOI : 10.1145/168304.168306
Variational inference in graphical models: The view from the marginal polytope, Forty-first Annual Allerton Conference on Communication, Control, and Computing, 2003. ,
Variational Bayesian Methods for Audio Indexing, 2005. ,
DOI : 10.1007/11677482_27
A study of new approaches to speaker diarization, Proc. Interspeech. ISCA, 2009. ,
Bayesian analysis of speaker diarization with eigenvoice priors, CRIM, 2008. ,
A novel speaker binary key derived from anchor models, Proc. Interspeech, 2010. ,
A fast-match approach for robust, faster than real-time speaker diarization, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pp.693-698, 2007. ,
DOI : 10.1109/ASRU.2007.4430196
Parallelizing Speaker-Attributed Speech Recognition for Meeting Browsing, 2010 IEEE International Symposium on Multimedia, 2010. ,
DOI : 10.1109/ISM.2010.26
Friends and enemies: A novel initialization for speaker diarization, Proc. ICSLP, 2006. ,
A robust speaker clustering algorithm, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721), pp.411-416, 2003. ,
DOI : 10.1109/ASRU.2003.1318476
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.169.6147
Purity Algorithms for Speaker Diarization of Meetings Data, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2006. ,
DOI : 10.1109/ICASSP.2006.1660198
Speaker, environment and channel change detection and clustering via the Bayesian information criterion, Proc. of DARPA Broadcast News Transcription and Understanding Workshop, pp.127-132, 1998. ,
Text-independent speaker identification, IEEE Signal Processing Magazine, pp.18-32, 1994. ,
DOI : 10.1109/79.317924
The ICSI meeting project: Resources and research, Proc. ICASSP Meeting Recognition Workshop, 2004. ,
The AMI meeting corpus, Proc. Measuring Behavior, 2005. ,
The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms, Language Resources and Evaluation, vol.41, issue.3-4, 2007. ,
DOI : 10.1007/s10579-007-9054-4
The NIST 2004 spring Rich Transcription evaluation: Two-axis merging strategy in the context of multiple distant microphone based meeting speaker segmentation, NIST 2004 Spring Rich Transcription Evaluation Workshop, 2004. ,
Speaker segmentation and clustering in meetings, Proc. ICSLP, 2004. ,
NIST RT05S evaluation: Pre-processing techniques and speaker diarization on multiple microphone meetings, NIST 2005 Spring Rich Transcription Evaluation Workshop, 2005. ,
DOI : 10.1007/11677482_36
URL : https://hal.archives-ouvertes.fr/hal-01434285
Robust Speaker Segmentation for Meetings: The ICSI-SRI Spring 2005 Diarization System, Proc. NIST MLMI Meeting Recognition Workshop, 2005. ,
DOI : 10.1007/11677482_34
Acoustic Beamforming for Speaker Diarization of Meetings, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.7, pp.2011-2023, 2007. ,
DOI : 10.1109/TASL.2007.902460
BeamformIt (the fast and robust acoustic beamformer) ,
Extrapolation, Interpolation, and Smoothing of Stationary Time Series, 1949. ,
Qualcomm-ICSI-OGI features for ASR, Proc. ICSLP, pp.4-7, 2002. ,
Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition, IEEE Transactions on Speech and Audio Processing, vol.12, issue.5, pp.489-498, 2004. ,
DOI : 10.1109/TSA.2004.832988
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.615.5654
An alternative approach to linearly constrained adaptive beamforming, IEEE Transactions on Antennas and Propagation, vol.30, issue.1, pp.27-34, 1982. ,
DOI : 10.1109/TAP.1982.1142739
Distant Speech Recognition, 2009. ,
Towards robust speaker segmentation: The ICSI-SRI fall 2004 diarization system, Rich Transcription Workshop, 2004. ,
Voice Activity Detection. Fundamentals and Speech Recognition System Robustness, Robust Speech Recognition and Understanding, p.460, 2007. ,
DOI : 10.5772/4740
Technical Improvements of the E-HMM Based Speaker Diarization System for Meeting Records, Proc. MLMI Third International Workshop, pp.359-370, 2006. ,
DOI : 10.1007/11965152_32
URL : https://hal.archives-ouvertes.fr/hal-01317165
Progress in the AMIDA Speaker Diarization System for Meeting Data, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT, pp.475-483, 2007. ,
The 2006 Athens Information Technology Speech Activity Detection and Speaker Diarization Systems, Machine Learning for Multimodal Interaction: Third International Workshop, pp.385-395, 2006. ,
DOI : 10.1007/11965152_34
Hybrid Speech/non-speech detector applied to Speaker Diarization of Meetings, 2006 IEEE Odyssey, The Speaker and Language Recognition Workshop, 2006. ,
DOI : 10.1109/ODYSSEY.2006.248109
Speaker diarization for meeting room audio, Proc. Interspeech'09, 2009. ,
Speaker diarization in meeting audio, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009. ,
DOI : 10.1109/ICASSP.2009.4960523
Improved speaker diarization system for meetings, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009. ,
DOI : 10.1109/ICASSP.2009.4960529
URL : https://hal.archives-ouvertes.fr/hal-01433912
Content analysis for audio classification and segmentation, IEEE Transactions on Speech and Audio Processing, vol.10, issue.7, pp.504-516, 2002. ,
DOI : 10.1109/TSA.2002.804546
Improving speaker segmentation via speaker identification and text segmentation, Proc. Interspeech, pp.3073-3076, 2009. ,
Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs, Proc. ICSLP, 2004. ,
The AMI Speaker Diarization System for NIST RT06s Meeting Data, Machine Learning for Multimodal Interaction ,
DOI : 10.1007/11965152_33
The COST278 pan-European broadcast news database, Proc. LREC European Language Resources Association (ELRA), pp.873-876, 2004. ,
Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), pp.413-416, 2001. ,
DOI : 10.1109/ICASSP.2001.940855
Robust Speaker Change Detection, IEEE Signal Processing Letters, vol.11, issue.8, pp.649-651, 2004. ,
DOI : 10.1109/LSP.2004.831666
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.139.971
Real-time unsupervised speaker change detection, Object recognition supported by user interaction for service robots, pp.358-361, 2002. ,
DOI : 10.1109/ICPR.2002.1048313
Evolutive speaker segmentation using a repository system, Proc. Interspeech, 2004. ,
Speaker diarization for multi-party meetings using acoustic fusion, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005., pp.426-431, 2005. ,
DOI : 10.1109/ASRU.2005.1566478
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.452.6901
Unsupervised speaker change detection using probabilistic pattern matching, IEEE Signal Processing Letters, vol.13, issue.8, pp.509-512, 2006. ,
DOI : 10.1109/LSP.2006.873656
URL : http://uhra.herts.ac.uk/bitstream/2299/110/1/103570.pdf
Segregation of speakers for speech recognition and speaker identification, Proc. ICASSP 91, 1991. ,
DISTBIC: A speaker-based segmentation for audio data indexing, Speech Communication, vol.32, issue.1-2, pp.111-126, 2000. ,
DOI : 10.1016/S0167-6393(00)00027-3
Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling, Proc. Interspeech'08, pp.20-23, 2008. ,
A novel method for two speaker segmentation, Proc. ICSLP, 2004. ,
Fast speaker change detection for broadcast news transcription and indexing, Proc. EuroSpeech-99, pp.1031-1034, 1999. ,
Automatic segmentation, classification and clustering of broadcast news audio, Proc. DARPA Speech Recognition Workshop, pp.97-99, 1997. ,
Modified DISTBIC algorithm for speaker change detection, Proc. 9th Eur. Conf, pp.3073-3076, 2005. ,
Speaker Diarization: From Broadcast News to Lectures, Proc. MLMI, pp.396-406, 2006. ,
DOI : 10.1007/11965152_35
Novel inter-cluster distance measure combining GLR and ICR for improved agglomerative hierarchical speaker clustering, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4373-4376, 2008. ,
DOI : 10.1109/ICASSP.2008.4518624
Experiments on speaker tracking and segmentation in radio broadcast news, Proc. ICSLP, 2005. ,
Improving speaker diarisation, Proc. DARPA RT04, 2004. ,
Trainable speaker diarization, Proc. Interspeech, pp.1861-1865, 2007. ,
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings, in Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications -M2SFA2, 2008. ,
Live speaker identification in conversations, Proceeding of the 16th ACM international conference on Multimedia, MM '08, pp.1017-1018, 2008. ,
DOI : 10.1145/1459359.1459558
Prosodic and other Long-Term Features for Speaker Diarization, IEEE Transactions on Audio, Speech, and Language Processing, vol.17, issue.5, pp.985-993, 2009. ,
DOI : 10.1109/TASL.2009.2015089
Speaker diarization for conference room: The UPC RT07s evaluation system, in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT Revised Selected Papers, pp.543-553, 2007. ,
Speaker Diarization for Multiple Distant Microphone Meetings: Mixing Acoustic Features And Inter-Channel Time Differences, Proceedings of Interspeech, 2006. ,
Location based speaker segmentation, Proc. ICASSP, pp.176-179, 2003. ,
DOI : 10.1109/icme.2003.1221388
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.8724
Speaker turn detection based on between-channels differences, Proc. ICASSP, 2004. ,
Clustering and segmenting speakers and their locations in meetings, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.605-613, 2004. ,
DOI : 10.1109/ICASSP.2004.1326058
Speaker Diarization for Multiple Distant Microphone Meetings: Mixing Acoustic Features And Inter-Channel Time Differences, Proc. Interspeech, 2006. ,
DOI : 10.1007/11965152_23
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.80.9616
Speaker Diarization For Multiple-Distant-Microphone Meetings Using Several Sources of Information, IEEE Transactions on Computers, vol.56, issue.9, pp.1212-1224, 2007. ,
DOI : 10.1109/TC.2007.1077
Speaker diarization using unsupervised discriminant analysis of inter-channel delay features, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4061-4064, 2009. ,
DOI : 10.1109/ICASSP.2009.4960520
URL : https://hal.archives-ouvertes.fr/hal-01318388
Speaker Identification using Warped MVDR Cepstral Features, Proc. of Interspeech, 2009. ,
Higher-Level Features in Speaker Recognition, in Speaker Classification I, ser. Lecture Notes in Artificial Intelligence, vol.4343, 2007. ,
Tuning-Robust Initialization Methods for Speaker Diarization, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.8, 2010. ,
DOI : 10.1109/TASL.2010.2040796
URL : http://infoscience.epfl.ch/record/153578
Observations on overlap: Findings and implications for automatic processing of multi-party conversations, Proc. Eurospeech, pp.1359-1362, 2001. ,
Speaker overlaps and ASR errors in meetings: Effects before, during, and after the overlap, Proc. ICASSP, pp.357-360, 2006. ,
Overlapped speech detection for improved speaker diarization in multiparty meetings, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4353-4356, 2008. ,
DOI : 10.1109/ICASSP.2008.4518619
Handling overlapped speech in speaker diarization, 2008. ,
Audio Segmentation for Meetings Speech Processing, 2008. ,
Efficient use of overlap information in speaker diarization, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pp.686-692, 2007. ,
DOI : 10.1109/ASRU.2007.4430194
Robust speech recognition using the modulation spectrogram, Speech Communication, vol.25, issue.1-3, pp.117-132, 1998. ,
DOI : 10.1016/S0167-6393(98)00032-6
Speaker Localisation Using Audio-Visual Synchrony: An Empirical Study, Lecture Notes in Computer Science, vol.2728, pp.565-570, 2003. ,
DOI : 10.1007/3-540-45113-7_48
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.68.8423
Boosting-Based Multimodal Speaker Detection for Distributed Meetings, 2006 IEEE Workshop on Multimedia Signal Processing, 2006. ,
DOI : 10.1109/MMSP.2006.285274
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.136.8526
On-line multi-modal speaker diarization, Proceedings of the ninth international conference on Multimodal interfaces, ICMI '07, pp.350-357, 2007. ,
DOI : 10.1145/1322192.1322254
Factorial hidden Markov models, Machine Learning, vol.29, issue.2/3, pp.245-273, 1997. ,
DOI : 10.1023/A:1007425814087
Multimodal speaker diarization, Computer Vision and Image Understanding, 2009. ,
DOI : 10.1109/tpami.2011.47
Multi-Modal Speech Recognition Using Optical-Flow Analysis for Lip Images, Real World Speech Processing, 2004. ,
Cross-modal Prediction in Audio-visual Communication, Proc. ICASSP, pp.2056-2059, 1996. ,
Learning joint statistical models for audio-visual fusion and segregation, Proc. NIPS, pp.772-778, 2000. ,
Speaker Association With Signal-Level Audiovisual Fusion, IEEE Transactions on Multimedia, vol.6, issue.3, pp.406-413, 2004. ,
DOI : 10.1109/TMM.2004.827503
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.3704
Exploiting audio-visual correlation in coding of talking head sequences, International Picture Coding Symposium, 1996. ,
Dynamic Dependency Tests for Audio-Visual Speaker Association, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07, 2007. ,
DOI : 10.1109/ICASSP.2007.366271
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.6887
CUAVE: A new audio-visual database for multimodal human-computer interface research, Proc. ICASSP, pp.2017-2020, 2002. ,
DOI : 10.1109/icassp.2002.5745028
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.91.6375
Audio Segmentation and Speaker Localization in Meeting Videos, 18th International Conference on Pattern Recognition (ICPR'06), pp.1150-1153, 2006. ,
DOI : 10.1109/ICPR.2006.283
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.653.970
Associating audio-visual activity cues in a dominance estimation framework, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008. ,
DOI : 10.1109/CVPRW.2008.4563178
Working with Very Sparse Data to Detect Speaker and Listener Participation in a Meetings Corpus, Workshop Programme, 2006. ,
Multimodal speaker diarization of real-world meetings using compressed-domain video features, Proc. ICASSP, pp.4069-4072, 2009. ,
Visual speaker localization aided by acoustic models, Proceedings of the seventeen ACM international conference on Multimedia, MM '09, pp.195-202, 2009. ,
DOI : 10.1145/1631272.1631301
Step-by-step and integrated approaches in broadcast news speaker diarization, CSL, selected papers from the Speaker and Language Recognition Workshop, pp.303-330, 2006. ,
DOI : 10.1016/j.csl.2005.08.002
URL : https://hal.archives-ouvertes.fr/hal-01318554
Combination of agglomerative and sequential clustering for speaker diarization, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4361-4364, 2008. ,
DOI : 10.1109/ICASSP.2008.4518621
Speaker diarization: combination of the LIUM and IRIT systems, 2008. ,
Combining Gaussianized/Non-Gaussianized Features to Improve Speaker Diarization of Telephone Conversations, Signal Processing Letters, pp.1040-1043, 2007. ,
DOI : 10.1109/LSP.2007.905088
A Bayesian Analysis of Some Nonparametric Problems, The Annals of Statistics, vol.1, issue.2, pp.209-230, 1973. ,
DOI : 10.1214/aos/1176342360
Infinite models for speaker clustering, International Conference on Spoken Language Processing, pp.6-19, 2006. ,
Hierarchical Dirichlet Processes, Journal of the American Statistical Association, vol.101, issue.476, pp.1566-1581, 2006. ,
DOI : 10.1198/016214506000000302
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.122.8637
An HDP-HMM for systems with state persistence, Proceedings of the 25th international conference on Machine learning, ICML '08, 2008. ,
DOI : 10.1145/1390156.1390196
The blame game: performance analysis of speaker diarization system components, Proc. Interspeech, pp.1857-60, 2007. ,