Language modeling for bag-of-visual words image categorization

Pierre Tirilly 1, Vincent Claveau 1, Patrick Gros 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: In this paper, we propose two ways of improving image classification based on the bag-of-words representation. Two shortcomings of this representation are the loss of the spatial information of visual words and the presence of noisy visual words due to the coarseness of the vocabulary-building process. On the one hand, we propose a new representation of images that takes the analogy with textual data further: visual sentences, which allow us to "read" visual words in a certain order, as in the case of text. We can therefore consider simple spatial relations between words. We also present a new image classification scheme that exploits these relations. It is based on language models, a very popular tool in the speech and text analysis communities. On the other hand, we propose new techniques to eliminate useless words, one based on geometric properties of the keypoints, the other on probabilistic Latent Semantic Analysis (pLSA). Experiments show that our techniques can significantly improve image classification compared to a classical Support Vector Machine-based classification.
Document type: Conference paper
Contributor: Pierre Tirilly
Submitted on: Thursday, April 11, 2013 - 12:40:24 PM
Last modification on: Friday, November 16, 2018 - 1:23:22 AM




Pierre Tirilly, Vincent Claveau, Patrick Gros. Language modeling for bag-of-visual words image categorization. International Conference on Image and Video Retrieval, Jul 2008, Niagara Falls, Canada. pp.249-258, ⟨10.1145/1386352.1386388⟩. ⟨hal-00811922⟩


