IRIM at TRECVID 2014: Semantic Indexing and Instance Search

Abstract: The IRIM group is a consortium of French teams supported by the GDR ISIS and working on multimedia indexing and retrieval. This paper describes its participation in the TRECVID 2014 semantic indexing (SIN) and instance search (INS) tasks.

For the semantic indexing task, our approach uses a six-stage processing pipeline that computes scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2796, which ranked us 5th out of 15 participants.

For the INS 2014 task, IRIM followed the classical bag-of-words (BoW) approach, trained only on the EastEnders dataset. Shot signatures were computed either on one key frame or on several key frames (at 1 fps) combined by average pooling. We tested a dissimilarity that computes a distance only for the words present in the query. We also tried a saliency map, built from the object ROI, to incorporate background context. Late fusion of two individual BoW results, obtained with different detectors/descriptors (Hessian-Affine/SIFT and Harris-Laplace/Opponent SIFT), was used. The four submitted runs were the following:
- Run F_D_IRIM_1 was the late fusion of BoW with SIFT, dissimilarity L2p, on several key frames per shot, with context for queries, and BoW with Opponent SIFT, dissimilarity L1p, on one key frame per shot.
- Run F_D_IRIM_2 was similar to F_D_IRIM_1, but context for queries was also used for the second BoW.
- Run F_D_IRIM_3 was similar to F_D_IRIM_1, but no context for queries was used.
- Run F_D_IRIM_4 was similar to F_D_IRIM_2, but used the delta1 dissimilarity [46] (from the best INS 2013 run).

We found that extracting several key frames per shot, coupled with average pooling, improved results. We confirmed that including context in queries was also beneficial. Surprisingly, our dissimilarity performed better than delta1.
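
The three INS ingredients named in the abstract (average pooling of key-frame signatures, a distance restricted to the words present in the query, and late fusion of two BoW results) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not define L1p or L2p, so `query_word_distance` is only one plausible reading of "a distance only for words present in query", and the min-max normalisation and the `weight` parameter in `late_fusion` are hypothetical choices, not the authors' method.

```python
from typing import List, Sequence


def average_pool(frame_histograms: Sequence[Sequence[float]]) -> List[float]:
    """Average per-key-frame BoW histograms into a single shot signature."""
    n = len(frame_histograms)
    if n == 0:
        raise ValueError("need at least one key frame")
    size = len(frame_histograms[0])
    return [sum(h[i] for h in frame_histograms) / n for i in range(size)]


def query_word_distance(query: Sequence[float], shot: Sequence[float], p: int = 1) -> float:
    """Minkowski-style distance summed only over visual words present in the query.

    Assumption: "present" means a non-zero query bin; p=1 and p=2 stand in
    for the L1p / L2p variants, whose exact definitions are not given here.
    """
    total = sum(abs(q - s) ** p for q, s in zip(query, shot) if q > 0)
    return total ** (1.0 / p)


def late_fusion(dist_a: Sequence[float], dist_b: Sequence[float], weight: float = 0.5) -> List[float]:
    """Weighted sum of two min-max normalised distance lists (one value per shot)."""
    def normalise(values: Sequence[float]) -> List[float]:
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    a, b = normalise(dist_a), normalise(dist_b)
    return [weight * x + (1 - weight) * y for x, y in zip(a, b)]


# Toy example: one shot with two key frames, a 4-word vocabulary.
shot_signature = average_pool([[2, 0, 1, 0], [0, 0, 3, 2]])
query_signature = [1, 0, 1, 0]
distance = query_word_distance(query_signature, shot_signature, p=1)
```

In this toy example the last visual word is absent from the query, so the shot's count for it does not contribute to the distance; lower fused values would then mean better-ranked shots.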
Document type:
Conference paper
Proceedings of TRECVID, Nov 2014, Orlando, United States. 2014

Cited literature [57 references]

https://hal.archives-ouvertes.fr/hal-01132491
Contributor: Georges Quénot
Submitted on: Tuesday, 17 March 2015 - 13:13:26
Last modified on: Friday, 10 November 2017 - 01:19:34
Document(s) archived on: Monday, 17 April 2017 - 16:28:14

File

irim.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01132491, version 1

Citation

Nicolas Ballas, Benjamin Labbé, Hervé Le Borgne, Philippe Gosselin, David Picard, et al. IRIM at TRECVID 2014: Semantic Indexing and Instance Search. Proceedings of TRECVID, Nov 2014, Orlando, United States. 2014. 〈hal-01132491〉
