Conference paper, Year: 2022

THORN: Temporal Human-Object Relation Network for Action Recognition

Abstract

Most action recognition models treat human activities as unitary events. However, human activities often follow a hierarchy: many are compositional, and most of these actions are human-object interactions. In this paper, we propose to recognize human actions by leveraging the set of interactions that define them. We present THORN, an end-to-end network that leverages important human-object and object-object interactions to predict actions. The model is built on top of a 3D backbone network and has three key components: 1) an object representation filter for modeling objects; 2) an object relation reasoning module to capture object relations; and 3) a classification layer to predict the action labels. To show the robustness of THORN, we evaluate it on EPIC-Kitchens-55 and EGTEA Gaze+, two of the largest and most challenging first-person human-object interaction datasets. THORN achieves state-of-the-art performance on both datasets.
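The abstract names the three components but does not describe their internals, so the following is only a rough sketch of how such a pipeline could be wired together, not the authors' implementation. It assumes the object representation filter has already produced one feature vector per object, and it swaps in plain scaled dot-product attention as a stand-in for the relation reasoning module. The function names (`relate`, `classify`), the toy vectors, and the class weights are all hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def relate(objects):
    """Stand-in relation reasoning (an assumption, not THORN's module):
    each object vector becomes an attention-weighted mix of all object vectors."""
    scale = math.sqrt(len(objects[0]))
    related = []
    for query in objects:
        weights = softmax([dot(query, key) / scale for key in objects])
        mixed = [sum(w * v[d] for w, v in zip(weights, objects))
                 for d in range(len(query))]
        related.append(mixed)
    return related

def classify(objects, class_weights):
    """Mean-pool the related object vectors, then apply a linear layer and softmax."""
    related = relate(objects)
    dim = len(related[0])
    pooled = [sum(v[d] for v in related) / len(related) for d in range(dim)]
    return softmax([dot(pooled, w) for w in class_weights])

# Hypothetical toy inputs: three object vectors (e.g. hand, knife, onion)
# that a real system would derive from the 3D backbone features.
objects = [[1.0, 0.0, 0.5], [0.2, 0.9, 0.1], [0.0, 0.4, 0.8]]
# Two hypothetical action classes, one weight vector each.
class_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
probs = classify(objects, class_weights)
```

Here `probs` holds one probability per hypothetical action class. A real model would learn the attention projections and classifier weights end to end and would operate on spatio-temporal features rather than hand-written vectors.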
Main file
ICPR22_v2.pdf (900.15 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03698623, version 1 (18-06-2022)

Identifiers

  • HAL Id: hal-03698623, version 1

Cite

Mohammed Guermal, Rui Dai, Francois F Bremond. THORN: Temporal Human-Object Relation Network for Action Recognition. ICPR 2022 - International Conference on Pattern Recognition, Aug 2022, Montreal, Canada. ⟨hal-03698623⟩
