W. Jiang, C. Cotton, S. Chang, D. Ellis, and A. Loui, Short-term audio-visual atoms for generic video concept classification, Proceedings of the 17th ACM International Conference on Multimedia, MM '09, 2009.
DOI : 10.1145/1631272.1631277

G. Ye, I. Jhuo, D. Liu, Y. Jiang, D. Lee et al., Joint audio-visual bi-modal codewords for video event detection, Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12, 2012.
DOI : 10.1145/2324796.2324843

M. Mühling, R. Ewerth, J. Zhou, and B. Freisleben, Multimodal Video Concept Detection via Bag of Auditory Words and Multiple Kernel Learning, Proceedings of the International Conference on Advances in Multimedia Modeling, 2012.

P. Natarajan, S. Wu, S. N. Vitaladevuni, X. Zhuang, S. Tsakalidis et al., Multimodal feature fusion for robust event detection in web videos, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247814

Q. Wu, Z. Wang, F. Deng, and D. D. Feng, Realistic Human Action Recognition with Audio Context, 2010 International Conference on Digital Image Computing: Techniques and Applications, 2010.
DOI : 10.1109/DICTA.2010.57

J. Sanchez-Riera, X. Alameda-Pineda, and R. Horaud, Audio-visual robot command recognition, Proceedings of the 14th ACM International Conference on Multimodal Interaction, ICMI '12, 2012.
DOI : 10.1145/2388676.2388760
URL : https://hal.archives-ouvertes.fr/hal-00768761

J. Lopes and S. Singh, Audio and Video Feature Fusion for Activity Recognition in Unconstrained Videos, Intelligent Data Engineering and Automated Learning, 2006.
DOI : 10.1007/11875581_99

I. Laptev, On space-time interest points, International Journal of Computer Vision, vol.64, issue.2, 2005.
DOI : 10.1007/s11263-005-1838-7
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.1419

. Horaud, RAVEL: An annotated corpus for training robots with audio-visual abilities, Journal of Multimodal User Interfaces, 2012.

S. Sonnenburg, G. Rätsch, S. Henschel, C. Widmer, J. Behr et al., The SHOGUN machine learning toolbox, Journal of Machine Learning Research, vol.11, pp.1799-1802, 2010.