Semantic Bag-of-Words Models for Visual Concept Detection and Annotation

Yu Zhang 1, Stéphane Bres 1, Liming Chen 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: This paper presents a novel method for building textual features based on semantic distance and describes a multi-model approach to Visual Concept Detection and Annotation (VCDA). Tags associated with images are now widely used in the VCDA task because they carry valuable information about image content that low-level visual features can hardly describe. Traditionally, a term-frequency model is used to capture this textual information. Its shortcoming is that it cannot capture the semantic information in the tags. To solve this problem, we propose the semantic bag-of-words (BoW) model, which uses a WordNet-based distance to construct the codebook and assign the tags. The advantages of this approach are two-fold: (1) it captures semantic information in the tags that the term-frequency model can hardly describe; (2) it addresses the high dimensionality of the codebook vocabulary, reducing the size of the tag representation. Furthermore, we employ a strong Multiple Kernel Learning (MKL) classifier to fuse the visual model and the text model. Experimental results on ImageCLEF 2011 show that our approach effectively improves recognition accuracy.
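The idea of a semantic BoW over tags can be illustrated with a minimal sketch. The abstract does not spell out the assignment rule, so everything below is an assumption made for illustration: the hand-written `TOY_SIMILARITY` table stands in for a WordNet-based measure, the function names are hypothetical, and the soft assignment (summing each tag's similarity to every codeword) is only one plausible choice, not necessarily the authors' method.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical toy similarity table standing in for a WordNet-based measure.
# The values are invented for illustration; they do not come from the paper.
TOY_SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset({"dog", "puppy"}): 0.9,
    frozenset({"dog", "cat"}): 0.6,
    frozenset({"puppy", "cat"}): 0.5,
    frozenset({"sea", "beach"}): 0.7,
}


def toy_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: identical words score 1, unlisted pairs score 0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset({a, b}), 0.0)


def semantic_bow(
    tags: List[str],
    codebook: List[str],
    similarity: Callable[[str, str], float] = toy_similarity,
) -> List[float]:
    """Return one value per codeword: the summed similarity of all tags to it.

    This is a soft assignment (an assumption of this sketch): every tag
    contributes to every codeword in proportion to their similarity.
    """
    return [sum(similarity(tag, word) for tag in tags) for word in codebook]


# A two-word codebook is enough to represent three tags it never contains.
codebook = ["dog", "sea"]
features = semantic_bow(["puppy", "beach", "cat"], codebook)
print(features)
```

A plain term-frequency vector over the same two-word vocabulary would be all zeros for these tags, since none of them matches a vocabulary word exactly. The semantic variant still produces a non-zero, low-dimensional representation, which is the behaviour the two stated advantages describe.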
Document type:
Conference paper
8th International Conference on SIGNAL IMAGE TECHNOLOGY & INTERNET BASED SYSTEMS (SITIS 2012), Nov 2012, Sorrento-Naples, Italy. pp.289-295, 2012

https://hal.archives-ouvertes.fr/hal-01353182
Contributor: Équipe Gestionnaire Des Publications Si Liris
Submitted on: Wednesday, 10 August 2016 - 16:25:49
Last modified on: Thursday, 11 August 2016 - 01:04:19

Identifiers

  • HAL Id : hal-01353182, version 1

Citation

Yu Zhang, Stéphane Bres, Liming Chen. Semantic Bag-of-Words Models for Visual Concept Detection and Annotation. 8th International Conference on SIGNAL IMAGE TECHNOLOGY & INTERNET BASED SYSTEMS (SITIS 2012), Nov 2012, Sorrento-Naples, Italy. pp.289-295, 2012. <hal-01353182>
