Conference papers

Semantic Bag-of-Words Models for Visual Concept Detection and Annotation

Yu Zhang 1 Stéphane Bres 1 Liming Chen 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : This paper presents a novel method for building textual features defined on semantic distance and describes a multi-model approach to Visual Concept Detection and Annotation (VCDA). Tags associated with images are now widely used in the VCDA task because they carry valuable information about image content that low-level visual features can hardly describe. Traditionally, the term frequency model is used to capture this textual information. However, the term frequency model cannot capture the semantic information carried by the tags. To solve this problem, we propose the semantic bag-of-words (BoW) model, which uses a WordNet-based distance to construct the codebook and assign the tags. The advantages of this approach are twofold: (1) it captures semantic information in the tags that the term frequency model can hardly describe; (2) it addresses the high dimensionality of codebook vocabulary construction by reducing the size of the tag representation. Furthermore, we employ a strong Multiple Kernel Learning (MKL) classifier to fuse the visual model and the text model. Experimental results on ImageCLEF 2011 show that our approach effectively improves recognition accuracy.
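The idea of assigning tags to a codebook by semantic distance, rather than counting exact term matches, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify the distance measure, the codebook construction, or the assignment scheme. The toy `PARENT` taxonomy is a hypothetical stand-in for WordNet, `semantic_distance` is a simple path length through the lowest common ancestor, and the Gaussian soft assignment with parameter `sigma` is one common choice, not necessarily the one used in the paper.

```python
import math

# Hypothetical toy taxonomy standing in for WordNet (child -> parent).
PARENT = {
    "dog": "animal", "cat": "animal", "animal": "entity",
    "car": "vehicle", "bus": "vehicle", "vehicle": "entity",
}


def path_to_root(word):
    """Return the chain of nodes from `word` up to the taxonomy root."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def semantic_distance(a, b):
    """Count the edges between a and b through their lowest common ancestor."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    depth_in_b = {node: i for i, node in enumerate(path_b)}
    for i, node in enumerate(path_a):
        if node in depth_in_b:
            return i + depth_in_b[node]
    return math.inf  # the two words share no ancestor


def semantic_bow(tags, codebook, sigma=1.0):
    """Build a histogram over codebook words using Gaussian soft assignment.

    Each tag contributes to every codeword, weighted by how close the two
    are in the taxonomy, so tags absent from the codebook still count.
    """
    hist = [0.0] * len(codebook)
    for tag in tags:
        for k, codeword in enumerate(codebook):
            d = semantic_distance(tag, codeword)
            hist[k] += math.exp(-(d * d) / (2 * sigma ** 2))
    return hist


codebook = ["animal", "vehicle"]       # assumed, hand-picked codewords
image_tags = ["dog", "cat", "bus"]     # assumed tags for a single image
histogram = semantic_bow(image_tags, codebook)
```

In this sketch none of the three tags appears in the codebook, so a plain term frequency vector over the codebook would be all zeros, while the semantic histogram still gives more weight to `animal` than to `vehicle`. The feature length equals the codebook size rather than the size of the full tag vocabulary, which is the dimensionality reduction the abstract refers to.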
Document type : Conference papers
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Wednesday, August 10, 2016 - 4:25:49 PM
Last modification on : Wednesday, July 8, 2020 - 12:43:45 PM


  • HAL Id : hal-01353182, version 1


Yu Zhang, Stéphane Bres, Liming Chen. Semantic Bag-of-Words Models for Visual Concept Detection and Annotation. 8th International Conference on SIGNAL IMAGE TECHNOLOGY & INTERNET BASED SYSTEMS (SITIS 2012), Nov 2012, Sorrento-Naples, Italy. pp.289-295. ⟨hal-01353182⟩


