Modout: Learning to Fuse Face and Gesture Modalities with Stochastic Regularization

Fan Li; Natalia Neverova; Christian Wolf; Graham W. Taylor

Conference paper. Year: 2017

Fan Li

Role: Author
PersonId: 780654
IdRef: 223375799

University of Guelph

Natalia Neverova

Role: Author
PersonId: 4769
IdHAL: neverova-natalia
IdRef: 197507905

Facebook AI Research [Paris]

Christian Wolf

Role: Author
PersonId: 3860
IdHAL: christian-wolf
ORCID: 0000-0001-9766-3211
IdRef: 083311696

Extraction de Caractéristiques et Identification

Graham W. Taylor

Role: Author
PersonId: 968487

University of Guelph

Abstract

Model selection methods based on stochastic regularization, such as Dropout, have been widely used in deep learning because of their simplicity and effectiveness. The standard Dropout method treats all units, visible or hidden, in the same way, and thus ignores any a priori information about grouping or structure. Such structure is present in multi-modal learning applications such as affect analysis and gesture recognition, where subsets of units may correspond to individual modalities. In this paper we describe Modout, a model selection method based on stochastic regularization that is particularly useful in the multi-modal setting. Unlike previous methods, it can learn whether and when to fuse two modalities in a layer, a choice that deep learning researchers and practitioners usually treat as an architectural hyper-parameter. Modout is evaluated on one synthetic and two real multi-modal datasets. The results indicate improved performance compared to other stochastic regularization methods. On the Montalbano dataset, the fusion structure learned by Modout performs on par with a carefully designed state-of-the-art architecture.
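The contrast the abstract draws between unit-wise Dropout and grouping-aware regularization can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so everything below is an assumption made for illustration. The function names `modality_block_mask` and `masked_linear` and the parameter `p_cross` are hypothetical, and the keep probability is fixed here, whereas the abstract says Modout learns whether and when to fuse. The sketch draws one random keep/drop decision per cross-modality block of a layer's weight matrix and always keeps within-modality blocks.

```python
import random


def modality_block_mask(in_sizes, out_sizes, p_cross, rng):
    """Build a 0/1 mask over a weight matrix, one decision per modality block.

    in_sizes[j]  : number of input units belonging to modality j
    out_sizes[i] : number of output units belonging to modality i
    p_cross      : probability of keeping each cross-modality block
                   (hypothetical and fixed here; assumed, not from the paper)
    """
    n_mod = len(in_sizes)
    assert len(out_sizes) == n_mod
    # Within-modality blocks (i == j) are always kept; each cross-modality
    # block is kept or dropped as a whole with a single Bernoulli draw.
    keep = [[1 if i == j or rng.random() < p_cross else 0
             for j in range(n_mod)] for i in range(n_mod)]
    mask = []
    for i, n_out in enumerate(out_sizes):
        row = []
        for j, n_in in enumerate(in_sizes):
            row.extend([keep[i][j]] * n_in)
        mask.extend([list(row) for _ in range(n_out)])
    return mask


def masked_linear(weights, mask, x):
    """Apply a linear layer whose weights are gated element-wise by the mask."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weights, mask)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two modalities: 2 and 3 input units, 1 and 2 output units.
    mask = modality_block_mask([2, 3], [1, 2], p_cross=0.5, rng=rng)
    for row in mask:
        print(row)
```

With `p_cross=0.0` the mask is block-diagonal, so the two modalities stay separate in that layer; with `p_cross=1.0` the layer is fully connected across modalities. Intermediate values mix the two regimes at random during training.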

Keywords

deep learning, gesture recognition

Domains

Computer Vision and Pattern Recognition [cs.CV]

Contributor: Christian Wolf

https://hal.science/hal-01444614

Submitted on: Tuesday, January 24, 2017, 11:27:01

Last modified on: Wednesday, July 5, 2023, 15:28:04

Dates and versions

hal-01444614, version 1 (24-01-2017)

Identifiers

HAL Id: hal-01444614, version 1

Cite

Fan Li, Natalia Neverova, Christian Wolf, Graham W. Taylor. Modout: Learning to Fuse Face and Gesture Modalities with Stochastic Regularization. International Conference on Automatic Face and Gesture Recognition, May 2017, Washington D.C., United States. ⟨hal-01444614⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS LABEXIMU INSA-GROUPE UDL
