T. Virtanen, M. D. Plumbley, and D. Ellis, Computational analysis of sound scenes and events, 2018.
DOI : 10.1007/978-3-319-63450-0

G. Parascandolo, H. Huttunen, and T. Virtanen, Recurrent neural networks for polyphonic sound event detection in real life recordings, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.6440-6444, 2016.
DOI : 10.1109/ICASSP.2016.7472917

N. Takahashi, M. Gygli, B. Pfister, and L. Van Gool, Deep convolutional neural networks and data augmentation for acoustic event detection, Proc. INTERSPEECH, 2016.

S. Adavanne, G. Parascandolo, P. Pertilä, T. Heittola, and T. Virtanen, Sound event detection in multichannel audio using spatial and harmonic features, 2017.

E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen, and T. Virtanen, Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.6, pp.1291-1303, 2017.
DOI : 10.1109/TASLP.2017.2690575

Y. Xu, Q. Kong, W. Wang, and M. D. Plumbley, Large-scale weakly supervised audio classification using gated convolutional neural network, Proc. DCASE, 2017.

A. Mesaros, T. Heittola, A. Diment, B. Elizalde, A. Shah et al., DCASE 2017 challenge setup: Tasks, datasets and baseline system, Proc. DCASE, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01627981

J. F. Gemmeke, D. P. Ellis, D. Freedman, A. Jansen, W. Lawrence et al., Audio Set: An ontology and human-labeled dataset for audio events, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
DOI : 10.1109/ICASSP.2017.7952261

J. Salamon and J. P. Bello, Unsupervised feature learning for urban sound classification, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.171-175, 2015.
DOI : 10.1109/ICASSP.2015.7177954

A. Jansen, M. Plakal, R. Pandya, D. Ellis, S. Hershey et al., Unsupervised learning of semantic audio representations, Proc. ICASSP, 2018.

Z. Zhang and B. Schuller, Semi-supervised learning helps in sound event classification, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.333-336, 2012.
DOI : 10.1109/ICASSP.2012.6287884

T. Komatsu, T. Toizumi, R. Kondo, and Y. Senda, Acoustic event detection method using semi-supervised non-negative matrix factorization with a mixture of local dictionaries, Proc. DCASE, pp.45-49, 2016.

B. Elizalde, A. Shah, S. Dalmia, M. H. Lee, R. Badlani et al., An approach for self-training audio event detectors using web data, 2017 25th European Signal Processing Conference (EUSIPCO), pp.1863-1867, 2017.
DOI : 10.23919/EUSIPCO.2017.8081532

I. Jeong, S. Lee, Y. Han, and K. Lee, Audio event detection using multiple-input convolutional neural network, Proc. DCASE, pp.51-54, 2017.

J. Lee, J. Park, S. Kum, Y. Jeong, and J. Nam, Combining multi-scale features using sample-level deep convolutional neural networks for weakly supervised sound event detection, Proc. DCASE, pp.69-73, 2017.

A. Mesaros, T. Heittola, and T. Virtanen, Metrics for Polyphonic Sound Event Detection, Applied Sciences, vol.6, issue.6, p.162, 2016.
DOI : 10.3390/app6060162