D. Y. Kim, G. Evermann, T. Hain, D. Mrva, S. Tranter et al., Recent advances in broadcast news transcription, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721), pp.105-110, 2003.
DOI : 10.1109/ASRU.2003.1318412
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.2088

L. Nguyen and B. Xiang, Light supervision in acoustic model training, Proceedings of International Conference on Acoustics Speech and Signal Processing, 2004.

M. Siu, R. Rohlicek, and H. Gish, An unsupervised, sequential learning algorithm for segmentation of speech waveforms with multi speakers, Proceedings of International Conference on Acoustics Speech and Signal Processing (ICASSP 92), pp.189-192, 1992.

L. Wilcox, F. Chen, D. Kimber, and V. Balasubramanian, Segmentation of speech using speaker identification, Proceedings of International Conference on Acoustics Speech and Signal Processing (ICASSP 94), pp.161-164, 1994.
DOI : 10.1109/icassp.1994.389330

M. Siegler, U. Jain, B. Raj, and R. Stern, Automatic segmentation and clustering of broadcast news audio, in: The DARPA Speech Recognition Workshop, 1997.

S. Chen and P. Gopalakrishnan, Speaker, environment and channel change detection and clustering via the Bayesian information criterion, in: DARPA Broadcast News Transcription and Understanding Workshop, 1998.

S. Meignier, J. Bonastre, and S. Igounet, E-HMM approach for learning and adapting sound models for speaker indexing, in: 2001: A Speaker Odyssey. The Speaker Recognition Workshop, pp.175-180, 2001.

J. Ajmera and C. Wooters, A robust speaker clustering algorithm, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721), pp.411-416, 2003.
DOI : 10.1109/ASRU.2003.1318476
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.169.6147

D. A. Reynolds, R. B. Dunn, and J. J. McLaughlin, The Lincoln speaker recognition system: NIST EVAL2000, Proceedings of International Conference on Spoken Language Processing, pp.470-473, 2000.

D. Moraru, S. Meignier, C. Fredouille, L. Besacier, and J. Bonastre, The ELISA consortium approaches in broadcast news speaker segmentation during the NIST 2003 rich transcription evaluation, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004.
DOI : 10.1109/ICASSP.2004.1326000

NIST, Spring 2004 (RT-04S) rich transcription meeting recognition evaluation plan, 2004.

G. Quénot, D. Moraru, L. Besacier, and P. Mulhem, CLIPS-IMAG at TREC-11: Experiments in video retrieval, 2002.

G. Quénot, D. Moraru, and L. Besacier, Clips at TRECvid: Shot boundary detection and feature detection, 2003.

I. Magrin-Chagnolleau, G. Gravier, and R. Blouet, Overview of the ELISA consortium research activities, in: 2001: A Speaker Odyssey. The Speaker Recognition Workshop, pp.67-72, 2001.

C. Fredouille, D. Moraru, S. Meignier, L. Besacier, and J. Bonastre, The NIST 2004 spring rich transcription evaluation: two-axis merging strategy in the context of multiple distance microphone based meeting speaker segmentation, RT2004 Spring Meeting Recognition Workshop, p.5, 2004.
URL : https://hal.archives-ouvertes.fr/hal-01434304

D. Moraru, S. Meignier, L. Besacier, J. Bonastre, and I. Magrin-Chagnolleau, The ELISA consortium approaches in speaker segmentation during the NIST 2002 speaker recognition evaluation, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)., pp.89-92, 2003.
DOI : 10.1109/ICASSP.2003.1202301

P. Delacourt and C. J. Wellekens, DISTBIC: A speaker-based segmentation for audio data indexing, Speech Communication, vol.32, issue.1-2, pp.111-126, 2000.
DOI : 10.1016/S0167-6393(00)00027-3
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.66.6609

T. Hain and P. Woodland, Segmentation and classification of broadcast news audio, Proceedings of International Conference on Spoken Language Processing (ICSLP 98), 1998.

P. Woodland, The development of the HTK Broadcast News transcription system: An overview, Speech Communication, vol.37, issue.1-2, 2002.
DOI : 10.1016/S0167-6393(01)00059-0

J. Gauvain, L. Lamel, and G. Adda, The LIMSI Broadcast News transcription system, Speech Communication, vol.37, issue.1-2, 2002.
DOI : 10.1016/S0167-6393(01)00061-9
URL : https://hal.archives-ouvertes.fr/hal-01434493

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

S. Meignier, J. Bonastre, C. Fredouille, and T. Merlin, Evolutive HMM for multi-speaker tracking system, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.1177-1180, 2000.
DOI : 10.1109/ICASSP.2000.859181
URL : https://hal.archives-ouvertes.fr/hal-01451542

D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, Speaker verification using adapted Gaussian mixture models, Digital Signal Processing (DSP), a review journal - Special issue on NIST 1999 speaker recognition, vol.10, issue.1-3, pp.19-41, 2000.
DOI : 10.1006/dspr.1999.0361
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.117.338

L. Wilcox, D. Kimber, and F. Chen, Audio indexing using speaker identification, Proceedings SPIE Conference on Automatic Systems for the Inspection and Identification of Humans, pp.149-157, 1994.
DOI : 10.1117/12.191878
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.5263

J. Gauvain, L. Lamel, and G. Adda, Audio partitioning and transcription for broadcast data indexation, 2001.

A. Adami, S. S. Kajarekar, and H. Hermansky, A new speaker change detection method for two-speaker segmentation, IEEE International Conference on Acoustics Speech and Signal Processing, pp.3908-3911, 2002.
DOI : 10.1109/ICASSP.2002.1004772

J. Gauvain and C. H. Lee, Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.291-298, 1994.
DOI : 10.1109/89.279278

P. Junqua, s speaker diarization, 2003.

Y. Moh, P. Nguyen, and J. Junqua, Towards domain independent speaker clustering, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)., 2003.
DOI : 10.1109/ICASSP.2003.1202300

D. Moraru, L. Besacier, and E. Castelli, Using a priori information for speaker diarization, in: 2004: A Speaker Odyssey. The Speaker Recognition Workshop, pp.355-362, 2004.

Table caption: Missed speaker (MiE), false alarm speaker (FaE), speaker (SE) and diarization (DER) error rates (in %), obtained by each speaker diarization system before applying the re-segmentation step when combined with different levels of acoustic macro-class segmentation. Experiments conducted on ELISA-Dev and ELISA-Eva corpora.