Short-term audio-visual atoms for generic video concept classification, Proceedings of the 17th ACM International Conference on Multimedia, MM '09, 2009.
DOI : 10.1145/1631272.1631277
Joint audio-visual bi-modal codewords for video event detection, Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12, 2012.
DOI : 10.1145/2324796.2324843
Multimodal Video Concept Detection via Bag of Auditory Words and Multiple Kernel Learning, Proceedings of the International Conference on Advances in Multimedia Modeling, 2012.
DOI : 10.1109/78.258082
Multimodal feature fusion for robust event detection in web videos, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247814
Realistic Human Action Recognition with Audio Context, 2010 International Conference on Digital Image Computing: Techniques and Applications, 2010.
DOI : 10.1109/DICTA.2010.57
Audio-visual robot command recognition, Proceedings of the 14th ACM International Conference on Multimodal Interaction, ICMI '12, 2012.
DOI : 10.1145/2388676.2388760
URL : https://hal.archives-ouvertes.fr/hal-00768761
Audio and Video Feature Fusion for Activity Recognition in Unconstrained Videos, Intelligent Data Engineering and Automated Learning, 2006.
DOI : 10.1007/11875581_99
On space-time interest points, International Journal of Computer Vision, vol.64, issue.2, 2005.
DOI : 10.1007/s11263-005-1838-7
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.1419
Ravel: An annotated corpus for training robots with audio visual abilities, Journal of Multimodal User Interfaces, 2012.
The SHOGUN machine learning toolbox, Journal of Machine Learning Research, vol.11, pp.1799-1802, 2010.