Combining Geometric, Textual and Visual Features for Predicting Prepositions in Image Descriptions

Abstract: We investigate the role that geometric, textual and visual features play in the task of predicting a preposition that links two visual entities depicted in an image. The task is an important part of the subsequent process of generating image descriptions. We explore the prediction of prepositions for a pair of entities, both when the labels of the entities are known and when they are unknown. In all settings we found clear evidence that all three types of features contribute to the prediction task.
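As a rough illustration of the idea summarised in the abstract, the sketch below (not the authors' code; all feature choices, dimensions and names are illustrative assumptions) concatenates geometric features computed from two bounding boxes with stand-ins for textual and visual features, and trains a multiclass preposition classifier with scikit-learn.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def geometric_features(box_a, box_b, img_w, img_h):
    # Simple relative geometry between two (x, y, w, h) boxes, normalised by image size.
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    return np.array([
        (xa - xb) / img_w,              # horizontal offset
        (ya - yb) / img_h,              # vertical offset
        (wa * ha) / (wb * hb + 1e-8),   # area ratio
        wa / img_w, ha / img_h,         # relative size of entity A
        wb / img_w, hb / img_h,         # relative size of entity B
    ])

def combine(geom, emb_a, emb_b, visual):
    # Concatenate geometric, textual (entity-label embeddings) and visual (region) features.
    return np.concatenate([geom, emb_a, emb_b, visual])

# Toy data standing in for a real corpus of (entity pair, preposition) examples.
prepositions = ["on", "under", "next to"]
X, y = [], []
for _ in range(300):
    box_a = rng.uniform(0, 300, 4)
    box_b = rng.uniform(0, 300, 4)
    emb_a = rng.normal(size=50)      # word embedding of entity A's label (assumed)
    emb_b = rng.normal(size=50)      # word embedding of entity B's label (assumed)
    visual = rng.normal(size=128)    # CNN descriptor of the union region (assumed)
    X.append(combine(geometric_features(box_a, box_b, 640, 480), emb_a, emb_b, visual))
    y.append(rng.choice(prepositions))

clf = LogisticRegression(max_iter=1000).fit(np.stack(X), y)
print(clf.predict(np.stack(X[:3])))

With labelled entity pairs in place of the toy data, the same pipeline lets each feature group be ablated in turn to measure its contribution, which is the kind of comparison the abstract reports.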
Document type: Conference paper
Conference on Empirical Methods in Natural Language Processing, Sep 2015, Lisbon, Portugal, pp. 214-220, 2015

https://hal.archives-ouvertes.fr/hal-01375638
Contributor: Emmanuel Dellandrea
Submitted on: Monday, 3 October 2016 - 13:12:52
Last modified on: Tuesday, 4 October 2016 - 01:04:49

Identifiers

  • HAL Id: hal-01375638, version 1


Citation

Arnau Ramisa, Josiah Wang, Ying Lu, Emmanuel Dellandréa, Francesc Moreno-Noguer, et al. Combining Geometric, Textual and Visual Features for Predicting Prepositions in Image Descriptions. Conference on Empirical Methods in Natural Language Processing, Sep 2015, Lisbon, Portugal, pp. 214-220, 2015. <hal-01375638>
