Training compact deep learning models for video classification using circulant matrices

In real-world scenarios, model accuracy is hardly the only factor to consider. Large models consume more memory and are more computationally intensive, which makes them difficult to train and to deploy, especially on mobile devices. In this paper, we build on recent results at the crossroads of Linear Algebra and Deep Learning which demonstrate how imposing a structure on large weight matrices can be used to reduce the size of the model. We propose very compact models for video classification based on state-of-the-art network architectures such as Deep Bag-of-Frames, NetVLAD and NetFisherVectors. We then conduct thorough experiments using the large YouTube-8M video classification dataset. As we will show, the circulant DBoF embedding achieves an excellent trade-off between size and accuracy.
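As an illustration of the structure the abstract refers to, here is a minimal Python sketch (not the authors' implementation) of why a circulant weight matrix is compact: the full n x n matrix is determined by a single vector of n values, so a layer can store n parameters instead of n². The function names and example values below are hypothetical and chosen only for this sketch.

```python
def circulant_dense(c):
    """Expand the defining vector c into the full n x n circulant matrix."""
    n = len(c)
    # Entry (i, j) is c[(i - j) mod n]: each column is the previous one shifted down.
    return [[c[(i - j) % n] for j in range(n)] for i in range(n)]


def circulant_matvec(c, x):
    """Multiply the circulant matrix defined by c with x, storing only c."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def dense_matvec(m, x):
    """Ordinary dense matrix-vector product, used here only as a reference."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]


c = [1.0, 2.0, 3.0, 4.0]   # hypothetical learned parameters (n values)
x = [1.0, 0.0, -1.0, 2.0]  # hypothetical input activations

# The compact product agrees with the product by the expanded matrix.
assert circulant_matvec(c, x) == dense_matvec(circulant_dense(c), x)

n = len(c)
print("dense parameters:", n * n)
print("circulant parameters:", n)
```

The quadratic loop is only meant to make the indexing explicit. In practice a circulant product is usually computed with the fast Fourier transform in O(n log n) time.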

Keywords

Structured matrices; Deep Learning; Computer Vision; Video Classification

Domains

Computer Science [cs]; Machine Learning [cs.LG]; Neural Networks [cs.NE]

Main file

c_11.pdf (303.75 KB)

Origin: Files produced by the author(s)

Contributor: Benjamin Negrevergne

https://hal.science/hal-02010093

Submitted on: Wednesday, 6 February 2019, 18:30:54

Last modified on: Friday, 19 April 2024, 16:18:54

Long-term archiving on: Tuesday, 7 May 2019, 15:45:32

Dates and versions

hal-02010093, version 1 (6 February 2019)

Identifiers

HAL Id: hal-02010093, version 1
arXiv: 1810.01140
DOI: 10.1007/978-3-030-11018-5_25

Cite

Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre, Jamal Atif. Training compact deep learning models for video classification using circulant matrices. European Conference on Computer Vision, 2018, pp.271-286. ⟨10.1007/978-3-030-11018-5_25⟩. ⟨hal-02010093⟩

Collections

CNRS UNIV-DAUPHINE LAMSADE-DAUPHINE GENCI PSL
