Conference papers

Evaluation of local spatio-temporal features for action recognition

Heng Wang (1, 2), Muhammad Muneeb Ullah (3), Alexander Klaser (1), Ivan Laptev (3), Cordelia Schmid (1)
1 LEAR - Learning and recognition in vision: Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
3 VISTAS - Spatio-Temporal Vision and Learning: IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : Local space-time features have recently become a popular video representation for action recognition. Several methods for feature localization and description have been proposed in the literature and promising recognition results were demonstrated for a number of action classes. The comparison of existing methods, however, is often limited given the different experimental settings used. The purpose of this paper is to evaluate and compare previously proposed space-time features in a common experimental setup. In particular, we consider four different feature detectors and six local feature descriptors and use a standard bag-of-features SVM approach for action recognition. We investigate the performance of these methods on a total of 25 action classes distributed over three datasets with varying difficulty. Among interesting conclusions, we demonstrate that regular sampling of space-time features consistently outperforms all tested space-time interest point detectors for human actions in realistic settings. We also demonstrate a consistent ranking for the majority of methods over different datasets and discuss their advantages and limitations.
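As an illustration of the bag-of-features step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a visual vocabulary is already available (in practice it would typically be learned by clustering training descriptors) and that each video is then represented by a normalized histogram of nearest vocabulary entries, which a classifier such as an SVM could consume. The function names, the toy two-dimensional descriptors, and the Euclidean distance are all hypothetical choices; the abstract does not specify the vocabulary size, the clustering method, or the SVM kernel.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor.

    Euclidean distance is an assumption made for this sketch.
    """
    return min(range(len(vocabulary)),
               key=lambda k: math.dist(descriptor, vocabulary[k]))


def bag_of_features(descriptors, vocabulary):
    """Quantize local descriptors and return an L1-normalized word histogram."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = sum(counts.values())
    if total == 0:
        # No local features were extracted: return an all-zero histogram.
        return [0.0] * len(vocabulary)
    return [counts[k] / total for k in range(len(vocabulary))]


# Toy data (hypothetical): three visual words and four local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
video_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (3.5, 0.2)]

histogram = bag_of_features(video_descriptors, vocabulary)
print(histogram)
```

The resulting fixed-length histogram is what makes videos with different numbers of local features comparable, regardless of which detector or sampling scheme produced the features.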
Document type : Conference papers

Contributor : Thoth Team
Submitted on : Friday, April 8, 2011 - 2:18:13 PM
Last modification on : Thursday, January 20, 2022 - 5:28:03 PM
Long-term archiving on : Thursday, November 8, 2012 - 3:45:41 PM



Heng Wang, Muhammad Muneeb Ullah, Alexander Klaser, Ivan Laptev, Cordelia Schmid. Evaluation of local spatio-temporal features for action recognition. BMVC 2009 - British Machine Vision Conference, Sep 2009, London, United Kingdom. pp.124.1-124.11, ⟨10.5244/C.23.124⟩. ⟨inria-00439769⟩


