CentralNet: a Multilayer Approach for Multimodal Fusion

Valentin Vielzeuf 1, 2 Alexis Lechervy 1 Stéphane Pateux 2 Frédéric Jurie 1
1 Equipe Image - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract: This paper proposes a novel multimodal fusion approach, aiming to produce the best possible decisions by integrating information coming from multiple media. While most past multimodal approaches work either by projecting the features of the different modalities into the same space, or by coordinating the representations of each modality through the use of constraints, our approach borrows from both visions. More specifically, assuming each modality can be processed by a separate deep convolutional network, allowing decisions to be taken independently from each modality, we introduce a central network linking the modality-specific networks. This central network not only provides a common feature embedding but also regularizes the modality-specific networks through the use of multi-task learning. The proposed approach is validated on four different computer vision tasks, on which it consistently improves the accuracy of existing multimodal fusion approaches.
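The abstract describes modality-specific networks joined by a central stream that fuses their hidden representations at each layer while each modality keeps its own (multi-task) output. The following is a minimal numpy sketch of that idea; the weighted-sum fusion rule, the scalar weights `alpha`, and all layer names are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """One ReLU layer; stands in for a real convolutional block."""
    return np.maximum(x @ w + b, 0.0)

dim, n_layers = 8, 3

# Per-stream layer parameters: two modalities plus the central stream.
params = {
    name: [(rng.standard_normal((dim, dim)) * 0.1, np.zeros(dim))
           for _ in range(n_layers)]
    for name in ("audio", "visual", "central")
}
# Hypothetical learnable scalar fusion weights, one per (layer, stream):
# columns are [central, audio, visual].
alpha = np.full((n_layers, 3), 1.0 / 3.0)

def forward(x_audio, x_visual):
    h_a, h_v = x_audio, x_visual
    h_c = np.zeros_like(x_audio)  # central stream starts empty
    for i in range(n_layers):
        # Weighted sum of central and modality hidden states ...
        fused = alpha[i, 0] * h_c + alpha[i, 1] * h_a + alpha[i, 2] * h_v
        # ... fed through the central layer; the modality streams continue
        # independently, so each can carry its own task loss (the
        # multi-task regularization mentioned in the abstract).
        h_c = dense(fused, *params["central"][i])
        h_a = dense(h_a, *params["audio"][i])
        h_v = dense(h_v, *params["visual"][i])
    return h_c, h_a, h_v

h_c, h_a, h_v = forward(rng.standard_normal(dim), rng.standard_normal(dim))
print(h_c.shape)  # (8,)
```

At test time, the central stream's output would serve as the fused decision, while the per-modality outputs exist mainly to regularize training.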

https://hal.archives-ouvertes.fr/hal-01858560
Contributor: Valentin Vielzeuf
Submitted on: Tuesday, August 21, 2018 - 9:24:00 AM
Last modification on: Thursday, February 7, 2019 - 5:46:42 PM
Document(s) archived on: Thursday, November 22, 2018 - 12:49:41 PM

Files

eccv2018submission.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01858560, version 1
  • ARXIV : 1808.07275

Citation

Valentin Vielzeuf, Alexis Lechervy, Stéphane Pateux, Frédéric Jurie. CentralNet: a Multilayer Approach for Multimodal Fusion. European Conference on Computer Vision Workshops: Multimodal Learning and Applications, Sep 2018, Munich, Germany. ⟨hal-01858560⟩
