Conference paper. Year: 2012

Audio-Visual Robot Command Recognition

Abstract

This paper addresses the problem of audio-visual command recognition in the framework of the D-META Grand Challenge. Temporal and non-temporal learning models are trained on visual and auditory descriptors. To set a proper baseline, the methods are tested on the "Robot Gestures" scenario of the publicly available RAVEL data set, following a leave-one-out cross-validation strategy. The classification-level audio-visual fusion strategy compensates for the errors of the unimodal (audio or vision) classifiers. The obtained results (an average audio-visual recognition rate of almost 80%) encourage us to investigate how to further develop and improve the methodology described in this paper.
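To make the idea of classification-level (late) fusion concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which fusion rule is used, so the weighted sum of per-class scores below is an assumption, as are the names `fuse_posteriors`, `leave_one_out_splits`, and `w_audio`, and the toy scores are made up for illustration. The second helper only illustrates the leave-one-out protocol in general; the abstract does not specify what unit is held out.

```python
import numpy as np


def fuse_posteriors(p_audio, p_visual, w_audio=0.5):
    """Weighted-sum late fusion of per-class scores from two unimodal classifiers.

    Assumption: both inputs are non-negative scores over the same ordered
    set of classes. The fusion rule itself is illustrative only.
    """
    p_audio = np.asarray(p_audio, dtype=float)
    p_visual = np.asarray(p_visual, dtype=float)
    if p_audio.shape != p_visual.shape:
        raise ValueError("both classifiers must score the same set of classes")
    if not 0.0 <= w_audio <= 1.0:
        raise ValueError("w_audio must lie in [0, 1]")
    fused = w_audio * p_audio + (1.0 - w_audio) * p_visual
    # Renormalise so the fused scores sum to one over the classes.
    return fused / fused.sum(axis=-1, keepdims=True)


def leave_one_out_splits(n_items):
    """Yield (train_indices, held_out_index) pairs, holding out one item per fold."""
    for held_out in range(n_items):
        train = [i for i in range(n_items) if i != held_out]
        yield train, held_out


# Made-up scores over three hypothetical commands.
p_audio = [0.50, 0.40, 0.10]   # audio alone would pick class 0
p_visual = [0.10, 0.70, 0.20]  # vision alone would pick class 1
fused = fuse_posteriors(p_audio, p_visual)
predicted = int(np.argmax(fused))
```

In this toy case the audio classifier alone would choose class 0, while the fused scores favour class 1. If class 1 were the true command, this would be the kind of unimodal error that fusion at the classification level can compensate for.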
Main file
gcp03-pineda.pdf (259.03 KB)
Origin: files produced by the author(s)

Dates and versions

hal-00768761 , version 1 (23-12-2012)

Cite

Jordi Sanchez-Riera, Xavier Alameda-Pineda, Radu Horaud. Audio-Visual Robot Command Recognition. ICMI 2012 - 14th ACM International Conference on Multimodal Interaction, Oct 2012, Santa-Monica, CA, United States. pp.371-378, ⟨10.1145/2388676.2388760⟩. ⟨hal-00768761⟩