Leveraging image, text and cross-media similarities for diversity-focused multimedia retrieval

Julien Ah-Pine; Stephane Clinchant; Gabriela Csurka; Florent Perronnin; Jean-Michel Renders

Book chapter, Year: 2010


Julien Ah-Pine

Role: Author
PersonId : 15506
IdHAL : julien-ah-pine
ORCID : 0000-0001-6898-3961

Stephane Clinchant

Role: Author

Xerox Research Centre Europe [Meylan]

Gabriela Csurka

Role: Author
PersonId : 930610

Xerox Research Centre Europe [Meylan]

Florent Perronnin

Role: Author

Xerox Research Centre Europe [Meylan]

Jean-Michel Renders

Role: Author

Xerox Research Centre Europe [Meylan]

Abstract

This chapter summarizes the cross-modal information retrieval techniques that Xerox Research Centre implemented during three years of participation in the ImageCLEF Photo tasks. The main challenge remained constant: how to optimally couple visual and textual similarities when they capture information at different semantic levels and when one of the media (the textual one) gives, most of the time, much better retrieval performance. Some core components proved very effective throughout the years: the visual similarity metrics based on the Fisher Vector representation of images, and the cross-media similarity principle based on relevance models. Other components were introduced to address additional issues. We tried different query- and document-enrichment methods that exploit auxiliary resources such as Flickr or open-source thesauri, or that apply statistical ‘semantic smoothing’. We also implemented clustering mechanisms to promote diversity in the top results and to provide faster access to relevant information. This chapter describes, analyses and assesses each of these components, namely the monomodal similarity measures, the different cross-media similarities, the query and document enrichment, and finally the mechanisms that ensure diversity in the results proposed to the user. To conclude, we discuss the lessons we have learnt over the years in trying to solve this very challenging task.

Domains

Information retrieval [cs.IR]; Artificial intelligence [cs.AI]

Main file

book4onechapter.pdf (791.04 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01504565

Submitted on: Monday, 10 April 2017, 12:22:52

Last modified on: Tuesday, 12 February 2019, 10:30:06

Long-term archiving on: Tuesday, 11 July 2017, 12:39:50

Dates and versions

hal-01504565, version 1 (10-04-2017)

Identifiers

HAL Id: hal-01504565, version 1

Cite

Julien Ah-Pine, Stephane Clinchant, Gabriela Csurka, Florent Perronnin, Jean-Michel Renders. Leveraging image, text and cross-media similarities for diversity-focused multimedia retrieval. ImageCLEF - Experimental Evaluation in Visual Information Retrieval, Springer, pp.315-342, 2010, 978-3-642-15181-1. ⟨hal-01504565⟩


28 views

331 downloads
