RECIPE RECOGNITION WITH LARGE MULTIMODAL FOOD DATASET

Xin Wang; Devinder Kumar; Nicolas Thome; Matthieu Cord; Frederic Precioso

doi:10.1109/ICMEW.2015.7169757

Conference paper, Year: 2015

Xin Wang

Role: Author
PersonId: 970031

Machine Learning and Information Access

Devinder Kumar

Role: Author

Machine Learning and Information Access

Nicolas Thome

Role: Author
PersonId: 181803
IdHAL: nicolas-thome
ORCID: 0000-0003-4871-3045
IdRef: 12878332X

Machine Learning and Information Access

Matthieu Cord

Role: Author
PersonId: 13617
IdHAL: matthieucord
ORCID: 0000-0002-0627-5844
IdRef: 132968126

Machine Learning and Information Access

Frederic Precioso

Role: Author
PersonId: 9244
IdHAL: frederic-precioso
ORCID: 0000-0001-8712-1443
IdRef: 087273934

Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S) / Equipe KEIA

Abstract

This paper addresses automatic image-based recipe recognition. To this end, we compare and evaluate leading vision-based and text-based techniques on a new, very large multimodal dataset (UPMC Food-101) containing about 100,000 recipes across 101 food categories. Each item in the dataset consists of one image plus textual information. We present in-depth recipe recognition experiments on our dataset using visual information, textual information, and their fusion. Additionally, we present experiments with text-based embedding techniques that represent any food word in a continuous semantic space. We also compare the features of our dataset with those of a twin dataset provided by ETHZ: we revisit their data collection protocols and carry out transfer learning experiments to highlight similarities and differences between the two datasets. Finally, we propose a real-world application that lets everyday users identify recipes: a web search engine that allows any mobile device to send a query image and retrieve the most relevant recipes in our dataset.

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

CEA_ICME2015.pdf (28.77 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01196959

Submitted on: Monday, 14 September 2015, 13:11:40

Last modified on: Monday, 26 February 2024, 11:22:11

Long-term archiving on: Tuesday, 29 December 2015, 00:10:30

Dates and versions

hal-01196959, version 1 (14-09-2015)

Identifiers

HAL Id: hal-01196959, version 1
DOI: 10.1109/ICMEW.2015.7169757

Cite

Xin Wang, Devinder Kumar, Nicolas Thome, Matthieu Cord, Frederic Precioso. RECIPE RECOGNITION WITH LARGE MULTIMODAL FOOD DATASET. IEEE International Conference on Multimedia & Expo (ICME), workshop CEA, Jun 2015, Turin, Italy. ⟨10.1109/ICMEW.2015.7169757⟩. ⟨hal-01196959⟩

Collections

UPMC CNRS I3S LIP6 UNIV-COTEDAZUR SORBONNE-UNIVERSITE SU-SCIENCES ANR

404 Views

947 Downloads
