CentralNet: a Multilayer Approach for Multimodal Fusion

Valentin Vielzeuf; Alexis Lechervy; Stéphane Pateux; Frédéric Jurie

Conference paper, Year: 2018


Valentin Vielzeuf

Role: Author
PersonId: 1016921

Equipe Image - Laboratoire GREYC - UMR6072

Orange Labs R&D [Rennes]

Alexis Lechervy

Role: Author
PersonId: 7323
IdHAL: alexis-lechervy
ORCID: 0000-0002-9441-0187
IdRef: 16680746X

Equipe Image - Laboratoire GREYC - UMR6072

Stéphane Pateux

Role: Author
PersonId: 1016922

Orange Labs R&D [Rennes]

Frédéric Jurie

Role: Author
PersonId: 3233
IdHAL: frederic-jurie
ORCID: 0000-0002-2686-0020
IdRef: 080485022

Equipe Image - Laboratoire GREYC - UMR6072

Abstract

This paper proposes a novel multimodal fusion approach that aims to produce the best possible decisions by integrating information from multiple media. Most past multimodal approaches either project the features of the different modalities into a common space or coordinate the representations of each modality through constraints; our approach borrows from both views. More specifically, assuming that each modality can be processed by a separate deep convolutional network, so that decisions can be made independently from each modality, we introduce a central network that links the modality-specific networks. This central network not only provides a common feature embedding but also regularizes the modality-specific networks through multi-task learning. The proposed approach is validated on four different computer vision tasks, on which it consistently improves on the accuracy of existing multimodal fusion approaches.
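The idea of a central network linking modality-specific networks can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the version below assumes one: at each layer, the central network receives a weighted sum of its own previous hidden state and the hidden states of the modality networks, and training would minimize the central loss plus a weighted sum of the per-modality losses (the multi-task part). Small dense layers stand in for the deep convolutional networks, the fusion weights `alphas` and loss weights `betas` are fixed rather than learned, and every name here (`forward`, `multitask_loss`, `init_net`, and so on) is hypothetical. No training loop is shown.

```python
import math
import random

def dense(x, layer):
    """Affine layer; `layer` is (weights, biases) with one weight row per output unit."""
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def init_net(rng, sizes):
    """Random dense network; sizes = [n_in, n_hidden, ..., n_out]."""
    net = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        net.append((w, [0.0] * n_out))
    return net

def forward(inputs, modality_nets, central_net, alphas):
    """Run all modality networks and the central network layer by layer.

    alphas[i] = [alpha_central, alpha_modality_0, alpha_modality_1, ...]
    are the (assumed) fusion weights used before central layer i.
    """
    hidden = [list(x) for x in inputs]
    central = [0.0] * len(hidden[0])  # the central network has no state before layer 0
    last = len(central_net) - 1
    for i, central_layer in enumerate(central_net):
        a_c, a_m = alphas[i][0], alphas[i][1:]
        # Assumed fusion rule: weighted sum of central and modality hidden states.
        fused = [a_c * c + sum(a * h[j] for a, h in zip(a_m, hidden))
                 for j, c in enumerate(central)]
        central = dense(fused, central_layer)
        hidden = [dense(h, net[i]) for h, net in zip(hidden, modality_nets)]
        if i < last:  # the final layer outputs raw logits
            central = relu(central)
            hidden = [relu(h) for h in hidden]
    return central, hidden

def multitask_loss(central_logits, modality_logits, label, betas):
    """Cross-entropy of the central prediction plus weighted per-modality cross-entropies."""
    def ce(logits):
        return -math.log(softmax(logits)[label])
    return ce(central_logits) + sum(b * ce(l) for b, l in zip(betas, modality_logits))

rng = random.Random(0)
sizes = [4, 8, 3]  # hypothetical: 4 input features, 8 hidden units, 3 classes
modality_nets = [init_net(rng, sizes) for _ in range(2)]  # two modalities
central_net = init_net(rng, sizes)
alphas = [[1.0, 1.0, 1.0] for _ in central_net]
inputs = [[rng.random() for _ in range(4)] for _ in range(2)]

central_logits, modality_logits = forward(inputs, modality_nets, central_net, alphas)
loss = multitask_loss(central_logits, modality_logits, label=1, betas=[1.0, 1.0])
print(softmax(central_logits), loss)
```

Because each modality network still produces its own logits, it can make decisions on its own, while the shared loss term is what would let the central network act as a regularizer during training. The sketch requires all networks to have matching layer widths so that the hidden states can be summed; that constraint comes from the assumed fusion rule, not from the abstract.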

Keywords

Multimodal Fusion; Representation Learning; Neural Networks; Multi-task Learning

Domains

Artificial Intelligence [cs.AI]; Computer Vision and Pattern Recognition [cs.CV]; Multimedia [cs.MM]

Main file

eccv2018submission.pdf (749.41 KB)

Origin: Files produced by the author(s)

Contributor: Valentin Vielzeuf

https://hal.science/hal-01858560

Submitted on: Tuesday, August 21, 2018, 09:24:00

Last modified on: Wednesday, March 20, 2024, 16:20:04

Long-term archiving on: Thursday, November 22, 2018, 12:49:41

Dates and versions

hal-01858560, version 1 (21-08-2018)

Identifiers

HAL Id: hal-01858560, version 1
arXiv: 1808.07275

Cite

Valentin Vielzeuf, Alexis Lechervy, Stéphane Pateux, Frédéric Jurie. CentralNet: a Multilayer Approach for Multimodal Fusion. European Conference on Computer Vision Workshops: Multimodal Learning and Applications, Sep 2018, Munich, Germany. ⟨hal-01858560⟩

Collections

CNRS GREYC GREYC-IMAGE COMUE-NORMANDIE ENSICAEN UNICAEN

360 views

1724 downloads
