Conference paper, Year: 2014

From text vocabularies to visual vocabularies: what basis?

Abstract

The popular "bag-of-visual-words" approach for representing and searching visual documents consists in describing images (or video keyframes) using a set of descriptors that correspond to quantized low-level features. Most existing approaches to visual words are inspired by work in text indexing, based on the implicit assumption that visual words can be handled the same way as text words. More specifically, these techniques implicitly rely on the same postulate as text information retrieval, namely that the word distribution of a natural language globally follows Zipf's law: words from a natural language appear in a corpus with a frequency inversely proportional to their rank. However, our study shows that the visual word distribution depends on the choice of low-level features, and especially on the choice of clustering method. We also show that when the visual word distribution is close to that of text words, the results of an image retrieval system improve. To the best of our knowledge, no prior study has compared the distributions of text words and visual words with the objective of establishing the theoretical foundations of visual vocabularies.
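As an illustration of the Zipf's law postulate mentioned in the abstract, the following minimal sketch estimates how closely a sequence of word occurrences follows a rank-frequency power law. It is not the method used in the paper: the function name `zipf_slope`, the log-log least-squares fit, and the synthetic data are all assumptions made for this example. A slope near -1 would correspond to frequency being inversely proportional to rank; a slope near 0 would indicate a flat, non-Zipfian distribution.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Hypothetical helper for illustration only. `tokens` is any sequence of
    hashable word identifiers (text words or visual word indices).
    """
    # Frequencies sorted from most to least frequent; rank 1 is the top word.
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic corpus (assumption): word k occurs 1200 / k times, so that
# frequency is exactly inversely proportional to rank.
zipf_tokens = [k for k in range(1, 7) for _ in range(1200 // k)]

# Synthetic corpus (assumption): every word occurs equally often.
flat_tokens = [k for k in range(1, 7) for _ in range(100)]

zipf_fit = zipf_slope(zipf_tokens)
flat_fit = zipf_slope(flat_tokens)
```

In practice, the tokens would be the visual word indices assigned to the local features of an image collection, so the fitted slope could be compared across feature types and clustering methods.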
No file deposited

Dates and versions

hal-01532717, version 1 (02-06-2017)

Identifiers

  • HAL Id: hal-01532717, version 1

Cite

Jean Martinet. From text vocabularies to visual vocabularies: what basis?. International Conference on Computer Vision Theory and Applications, Jan 2014, Lisbon, Portugal. pp.668-675. ⟨hal-01532717⟩
