MoE-SPNet: A mixture-of-experts scene parsing network

Huan Fu; Mingming Gong; Chaohui Wang; Dacheng Tao

doi:10.1016/j.patcog.2018.07.020

Journal article, Pattern Recognition, Year: 2018


Huan Fu

Role: Author

The University of Sydney

Mingming Gong

Role: Author

University of Pittsburgh

Chaohui Wang

Role: Author
PersonId : 173129
IdHAL : chaohui-wang

Laboratoire d'Informatique Gaspard-Monge

imagine [Marne-la-Vallée]

Dacheng Tao

Role: Author

The University of Sydney

Abstract

Scene parsing is an indispensable component in understanding the semantics within a scene. Traditional methods rely on handcrafted local features and probabilistic graphical models to incorporate local and global cues. Recently, methods based on fully convolutional neural networks have set new records in scene parsing. An important strategy common to these methods is the aggregation of hierarchical features yielded by a deep convolutional neural network. However, typical algorithms usually aggregate hierarchical convolutional features via concatenation or linear combination, which cannot fully exploit the diversity of contextual information in multi-scale features and the spatial inhomogeneity of a scene. In this paper, we propose a mixture-of-experts scene parsing network (MoE-SPNet) that incorporates a convolutional mixture-of-experts layer to assess the importance of features from different levels and at different spatial locations. In addition, we propose a variant of mixture-of-experts called the adaptive hierarchical feature aggregation (AHFA) mechanism, which can be incorporated into existing scene parsing networks that use skip-connections to fuse features layer by layer. In the proposed networks, different levels of features at each spatial location are adaptively re-weighted according to the local structure and surrounding contextual information before aggregation. We demonstrate the effectiveness of the proposed methods on two scene parsing datasets, PASCAL VOC 2012 and SceneParse150, using two baseline models, FCN-8s and DeepLab-ASPP.
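The per-location re-weighting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the gate at each spatial location is a softmax over the feature levels (the abstract does not specify the gating function), and it takes the gating scores as given rather than computing them with a convolutional gating network. The names `softmax`, `moe_aggregate`, `expert_maps`, and `gate_logits` are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_aggregate(expert_maps, gate_logits):
    """Fuse K expert maps using per-location gates (illustrative only).

    expert_maps: K maps, each H x W x C as nested lists, one per feature level.
    gate_logits: H x W x K gating scores. Assumption: in a real network these
                 would come from a learned gating branch; here they are inputs.
    Returns an H x W x C fused map.
    """
    num_experts = len(expert_maps)
    height = len(expert_maps[0])
    width = len(expert_maps[0][0])
    channels = len(expert_maps[0][0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            # Weights sum to 1 over the K levels at this location.
            weights = softmax(gate_logits[y][x])
            row.append([
                sum(weights[k] * expert_maps[k][y][x][c]
                    for k in range(num_experts))
                for c in range(channels)
            ])
        fused.append(row)
    return fused


# Toy input: two feature levels, a 1 x 2 map, 2 channels per location.
shallow = [[[1.0, 0.0], [1.0, 0.0]]]
deep = [[[0.0, 1.0], [0.0, 1.0]]]
# Equal gates at x=0; the shallow level dominates at x=1.
gates = [[[0.0, 0.0], [10.0, -10.0]]]
fused = moe_aggregate([shallow, deep], gates)
```

In this toy input the first location averages the two levels equally, while the second is taken almost entirely from the shallow level. By contrast, concatenation or a fixed linear combination would apply the same mixing at every location.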

Subject areas

Artificial Intelligence [cs.AI]; Computer Vision and Pattern Recognition [cs.CV]

Contributor: Chaohui Wang

https://hal.science/hal-02086416

Submitted on: Monday, 1 April 2019, 12:43:35

Last modified on: Thursday, 28 March 2024, 03:27:33

Dates and versions

hal-02086416 , version 1 (01-04-2019)

Identifiers

HAL Id : hal-02086416 , version 1
ARXIV : 1806.07049
DOI : 10.1016/j.patcog.2018.07.020

Cite

Huan Fu, Mingming Gong, Chaohui Wang, Dacheng Tao. MoE-SPNet: A mixture-of-experts scene parsing network. Pattern Recognition, 2018, ⟨10.1016/j.patcog.2018.07.020⟩. ⟨hal-02086416⟩


Collections

ENPC CNRS LIGM_A3SI PARISTECH LIGM IMAGINE UNIV-EIFFEL JSE2024
