J. Glass, Towards unsupervised speech processing, Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on, pp.1-4, 2012.

M. Dredze, A. Jansen, G. Coppersmith, and K. Church, NLP on spoken documents without ASR, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp.460-470, 2010.

L. Zheng, C. Leung, L. Xie, B. Ma, and H. Li, Acoustic TextTiling for story segmentation of spoken documents, ICASSP, pp.5121-5124, 2012.

J. White, D. Oard, A. Jansen, J. Paik, and R. Sankepally, Using zero-resource spoken term discovery for ranked retrieval, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.588-597, 2015.

L. Lee, J. Glass, H. Lee, and C. Chan, Spoken content retrieval - beyond cascading speech recognition with text retrieval, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.9, pp.1389-1420, 2015.

A. Jansen, D. Garcia-Romero, J. Hernandez-Cordero et al., Unsupervised idiolect discovery for speaker recognition.

A. Jansen and K. Church, Towards unsupervised training of speaker independent acoustic models, Twelfth Annual Conference of the International Speech Communication Association, 2011.

H. Kamper, M. Elsner, A. Jansen, and S. Goldwater, Unsupervised neural network based feature extraction using weak top-down constraints, Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp.5818-5822, 2015.
DOI : 10.1109/icassp.2015.7179087

URL : https://www.pure.ed.ac.uk/ws/files/18703089/Kamper_Elsner_ET_AL_2015_Unsupervised_Neural_Network_Based_Feature_Extraction_Using_Weak_Top_Down_Constraints.pdf

G. Synnaeve, T. Schatz, and E. Dupoux, Phonetics embedding learning with side information, 2014 IEEE Workshop on Spoken Language Technology, vol.12, 2014.
DOI : 10.1109/slt.2014.7078558

J. Johnson, M. Douze, and H. Jégou, Billion-scale similarity search with GPUs, 2017.
DOI : 10.1109/tbdata.2019.2921572

URL : http://arxiv.org/pdf/1702.08734

M. Versteegh, R. Thiolliere, T. Schatz, X. N. Cao, X. Anguera et al., The zero resource speech challenge 2015, Proc. of Interspeech, 2015.

E. Dunbar, X. N. Cao, J. Benjumea, J. Karadayi, M. Bernard et al., The zero resource speech challenge 2017, Automatic Speech Recognition and Understanding Workshop, pp.323-330, 2017.
DOI : 10.1109/asru.2017.8268953

URL : https://hal.archives-ouvertes.fr/hal-01687504

A. S. Park and J. R. Glass, Unsupervised Pattern Discovery in Speech, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, pp.186-197, 2008.
DOI : 10.1109/tasl.2007.909282

A. Muscariello, G. Gravier, and F. Bimbot, Audio keyword extraction by unsupervised word discovery, INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00551769

A. Muscariello, G. Gravier, and F. Bimbot, Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching Combination, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, pp.2031-2044, 2012.
DOI : 10.1109/tasl.2012.2194283

URL : https://hal.archives-ouvertes.fr/hal-00740978

A. Jansen and B. Van Durme, Efficient spoken term discovery using randomized algorithms, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, pp.401-406, 2011.
DOI : 10.1109/asru.2011.6163965

URL : http://www.cs.jhu.edu/%7Evandurme/papers/JansenVanDurmeASRU11.pdf

V. Lyzinski, G. Sell, and A. Jansen, An evaluation of graph clustering methods for unsupervised term discovery, INTERSPEECH, 2015.

B. Oosterveld, R. Veale, and M. Scheutz, A parallelized dynamic programming approach to zero resource spoken term discovery, Acoustics, Speech and Signal Processing, pp.5800-5804, 2017.
DOI : 10.1109/icassp.2017.7953268

H. Kamper, A. Jansen, and S. Goldwater, Fully unsupervised small-vocabulary speech recognition using a segmental Bayesian model, Sixteenth Annual Conference of the International Speech Communication Association, 2015.

H. Kamper, A. Jansen, and S. Goldwater, A segmental framework for fully-unsupervised large-vocabulary speech recognition, Computer Speech & Language, vol.46, pp.154-174, 2017.
DOI : 10.1016/j.csl.2017.04.008

URL : http://arxiv.org/pdf/1606.06950

M. R. Brent, An efficient, probabilistically sound algorithm for segmentation and word discovery, Machine Learning, vol.34, pp.71-105, 1999.

S. J. Goldwater, Nonparametric Bayesian models of lexical acquisition, 2007.

S. Goldwater, T. L. Griffiths, and M. Johnson, A Bayesian framework for word segmentation: Exploring the effects of context, Cognition, vol.112, issue.1, pp.21-54, 2009.

M. Norouzi and D. J. Fleet, Cartesian k-means, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3017-3024, 2013.

H. Hermansky, Perceptual linear predictive (PLP) analysis of speech, J. Acoust. Soc. Am, vol.87, issue.4, pp.1738-1752, 1990.
DOI : 10.1121/1.399423

N. Holzenberger, M. Du, J. Karadayi, R. Riad, and E. Dupoux, Learning word embeddings: unsupervised methods for fixed-size representations of variable-length speech segments, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01888708

H. Kamper, K. Livescu, and S. Goldwater, An embedded segmental k-means model for unsupervised segmentation and clustering of speech, Automatic Speech Recognition and Understanding Workshop, pp.719-726, 2017.

O. Räsänen, G. Doyle, and M. C. Frank, Unsupervised word discovery from speech using automatic segmentation into syllable-like units, 2015.