Hand Pose Estimation through Weakly-Supervised Learning of a Rich Intermediate Representation

Natalia Neverova; Christian Wolf; Florian Nebout; Graham W. Taylor

Report (Research Report) Year: 2015

Natalia Neverova

Role: Author
PersonId : 4769
IdHAL : neverova-natalia
IdRef : 197507905

Extraction de Caractéristiques et Identification

Christian Wolf

Role: Author
PersonId : 3860
IdHAL : christian-wolf
ORCID : 0000-0001-9766-3211
IdRef : 083311696

Extraction de Caractéristiques et Identification

Florian Nebout

Role: Author

Awabot (company)

Graham W. Taylor

Role: Author
PersonId : 968487

University of Guelph

Abstract

We propose a method for hand pose estimation based on a deep regressor trained on two different kinds of input. Raw depth data is fused with an intermediate representation in the form of a segmentation of the hand into parts. This intermediate representation contains important topological information and provides useful cues for reasoning about joint locations. The mapping from raw depth to segmentation maps is learned in a semi- or weakly-supervised way from two different datasets: (i) a synthetic dataset created through a rendering pipeline, including densely labeled ground truth (pixelwise segmentations); and (ii) a dataset of real images for which ground-truth joint positions are available, but not dense segmentations. The loss for training on real images is generated by a patch-wise restoration process, which aligns tentative segmentation maps with a large dictionary of synthetic poses. The underlying premise is that the domain shift between synthetic and real data is smaller in the intermediate representation, where labels carry geometric and topological meaning, than in the raw input domain. Experiments on the NYU dataset show that the proposed training method reduces joint error by 15.7% compared with direct regression of joints from depth data.
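
The patch-wise restoration idea mentioned in the abstract can be illustrated with a small sketch. This is not the report's implementation: the abstract does not say how tentative segmentations are aligned with the dictionary, so the sketch assumes a nearest-neighbour lookup under Hamming distance over non-overlapping patches. The function names `restore_patch` and `restoration_targets`, the patch size, and the toy dictionary are all hypothetical.

```python
import numpy as np

def restore_patch(tentative, dictionary):
    """Return the dictionary patch closest to `tentative` (assumed: Hamming distance).

    tentative:  (h, w) integer array of predicted part labels.
    dictionary: (n, h, w) integer array of synthetic label patches.
    """
    # Count disagreeing pixels between the tentative patch and every entry.
    mismatches = (dictionary != tentative[None]).reshape(len(dictionary), -1).sum(axis=1)
    return dictionary[np.argmin(mismatches)]

def restoration_targets(seg, dictionary, patch=2):
    """Replace each non-overlapping patch of `seg` by its nearest dictionary entry.

    The result could serve as a training target for real images that lack
    dense labels (an assumption about how the loss is formed).
    """
    h, w = seg.shape
    out = seg.copy()
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            out[y:y + patch, x:x + patch] = restore_patch(
                seg[y:y + patch, x:x + patch], dictionary
            )
    return out

# Toy dictionary with two "synthetic" patches: all part 0 and all part 1.
toy_dictionary = np.stack([np.zeros((2, 2), dtype=int), np.ones((2, 2), dtype=int)])

# Tentative segmentation with two noisy pixels.
noisy = np.array([[0, 0, 1, 1],
                  [0, 1, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 0]])

restored = restoration_targets(noisy, toy_dictionary, patch=2)
```

In this toy case each noisy patch snaps to the dictionary entry it disagrees with least, so the restored map differs from the network's tentative output only where that output is inconsistent with the synthetic patches. A per-pixel classification loss between the tentative and restored maps would then give a training signal on real images, under the assumptions stated above.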

Keywords

hand pose; deep learning; deep regression

Domains

Computer Vision and Pattern Recognition [cs.CV]

https://hal.science/hal-01281943

Submitted on: Thursday, March 3, 2016, 09:40:33

Last modified on: Wednesday, July 5, 2023, 15:28:04

Dates and versions

hal-01281943 , version 1 (03-03-2016)

Identifiers

HAL Id : hal-01281943 , version 1
ARXIV : 1511.06728

Cite

Natalia Neverova, Christian Wolf, Florian Nebout, Graham W. Taylor. Hand Pose Estimation through Weakly-Supervised Learning of a Rich Intermediate Representation. [Research Report] INSA Lyon. 2015. ⟨hal-01281943⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS LARA LABEXIMU INSA-GROUPE UDL
