Fusion of Multimodal Embeddings for Ad-Hoc Video Search

Danny Francis; Phuong Anh Nguyen; Benoit Huet; Chong-Wah Ngo

doi:10.1109/ICCVW.2019.00233

Conference paper, year: 2019

Danny Francis

Role: Author
PersonId : 1125183

Eurecom [Sophia Antipolis]

Phuong Anh Nguyen

Role: Author

City University of Hong Kong [Hong Kong]

Benoit Huet

Role: Author
PersonId : 1084757

Eurecom [Sophia Antipolis]

Chong-Wah Ngo

Role: Author

City University of Hong Kong [Hong Kong]

Abstract

The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query descriptions. Bridging the semantic gap between AVS queries and videos is highly difficult, as evidenced by the low retrieval accuracy reported in TRECVID AVS benchmarking. In this paper, we study a new method to fuse multimodal embeddings that have been derived from completely disjoint datasets. The method is tested on two datasets for two distinct tasks: MSR-VTT for unique video retrieval and V3C1 for multiple-video retrieval.

Domains

Computer-aided engineering

Main file

publi-6052.pdf (391.89 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-03555283

Submitted on: Thursday, February 3, 2022, 15:10:49

Last modified on: Thursday, February 17, 2022, 03:33:48

Long-term archiving on: Wednesday, May 4, 2022, 20:21:12

Dates and versions

hal-03555283 , version 1 (03-02-2022)

Identifiers

HAL Id: hal-03555283, version 1
DOI: 10.1109/ICCVW.2019.00233

Cite

Danny Francis, Phuong Anh Nguyen, Benoit Huet, Chong-Wah Ngo. Fusion of Multimodal Embeddings for Ad-Hoc Video Search. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Oct 2019, Seoul, South Korea. pp.1868-1872, ⟨10.1109/ICCVW.2019.00233⟩. ⟨hal-03555283⟩

Collections

EURECOM ANR
