Book chapter, Year: 2010

Leveraging image, text and cross-media similarities for diversity-focused multimedia retrieval

Abstract

This chapter summarizes the cross-modal information retrieval techniques that Xerox Research Centre implemented during three years of participation in the ImageCLEF Photo tasks. The main challenge remained constant: how to optimally combine visual and textual similarities when they capture information at different semantic levels and when one of the media (the textual one) gives much better retrieval performance most of the time. Some core components proved very effective throughout those years: the visual similarity metrics based on the Fisher Vector representation of images and the cross-media similarity principle based on relevance models. Other components were introduced to solve additional issues. We tried different query- and document-enrichment methods that exploit auxiliary resources such as Flickr or open-source thesauri, or that apply statistical ‘semantic smoothing’. We also implemented clustering mechanisms to promote diversity in the top results and to provide faster access to relevant information. This chapter describes, analyses and assesses each of these components, namely: the monomodal similarity measures, the different cross-media similarities, the query and document enrichment, and finally the mechanisms that ensure diversity in the results proposed to the user. To conclude, we discuss the lessons we have learnt over the years while trying to solve this very challenging task.
Main file
book4onechapter.pdf (791.04 KB)
Origin: files produced by the author(s)

Dates and versions

hal-01504565, version 1 (10-04-2017)

Identifiers

  • HAL Id: hal-01504565, version 1

Cite

Julien Ah-Pine, Stephane Clinchant, Gabriela Csurka, Florent Perronnin, Jean-Michel Renders. Leveraging image, text and cross-media similarities for diversity-focused multimedia retrieval. ImageCLEF - Experimental Evaluation in Visual Information Retrieval, Springer, pp.315-342, 2010, 978-3-642-15181-1. ⟨hal-01504565⟩
28 views
331 downloads
