Fast Text/non-Text Image Classification with Knowledge Distillation - HAL open archive
Conference paper, Year: 2019

Fast Text/non-Text Image Classification with Knowledge Distillation

Miao Zhao
  • Role: Author
Rui-Qi Wang
  • Role: Author
Fei Yin
  • Role: Author
Xu-Yao Zhang
  • Role: Author
Lin-Lin Huang
  • Role: Author

Abstract

Efficiently judging whether a natural image contains text is an important problem, since text detection and recognition algorithms are usually time-consuming and it is unnecessary to run them on images that contain no text. In this paper, we investigate this problem from two perspectives: speed and accuracy. First, to achieve high speed for efficiently filtering large numbers of images, especially on CPU, we propose using a small and shallow convolutional neural network, in which the features from different layers are adaptively pooled into certain sizes to overcome difficulties caused by multiple scales and various locations. Although this achieves high speed, its accuracy is unsatisfactory because of the limited capacity of the small network. Therefore, our second contribution is to use knowledge distillation to improve the accuracy of the small network, constructing a larger and deeper neural network as a teacher network to guide the learning process of the small network. With these two strategies, we achieve both high speed and high accuracy for filtering scene text images. Experimental results on a benchmark dataset show the effectiveness of our method: the teacher network yields state-of-the-art performance, and the distilled small network achieves high performance while maintaining high speed, running 176 times faster on CPU and 3.8 times faster on GPU than a compared benchmark method.
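The two ideas in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say which pooling operator or distillation loss the paper uses, so the sketch assumes average pooling into a fixed output grid and the commonly used temperature-scaled soft-target loss. The function names and the `temperature` and `alpha` parameters are hypothetical.

```python
import math


def adaptive_avg_pool_2d(feature_map, out_h, out_w):
    """Average-pool an H x W grid (list of lists) into an out_h x out_w grid.

    Assumption: average pooling; the abstract only says features are
    "adaptively pooled into certain sizes".
    """
    in_h, in_w = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(out_h):
        r0 = (i * in_h) // out_h               # floor of bin start
        r1 = -((-(i + 1) * in_h) // out_h)     # ceiling of bin end
        row = []
        for j in range(out_w):
            c0 = (j * in_w) // out_w
            c1 = -((-(j + 1) * in_w) // out_w)
            cells = [feature_map[r][c]
                     for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence.

    Assumption: temperature-scaled soft targets with a T^2 factor on the
    soft term; the paper may weight or define the terms differently.
    """
    # Hard term: cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL(teacher || student) on temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


# Feature maps of different sizes are pooled to the same 2 x 2 grid.
small_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
large_map = [[1.0] * 7 for _ in range(5)]
pooled_small = adaptive_avg_pool_2d(small_map, 2, 2)
pooled_large = adaptive_avg_pool_2d(large_map, 2, 2)

# Two-class example (text / non-text) with a confident teacher.
loss = distillation_loss([0.0, 0.0], [2.0, -2.0], label=0)
```

Pooling every layer's features to a fixed grid lets a classifier consume inputs of varying size, and the soft term pushes the small student network towards the teacher's output distribution in addition to fitting the hard labels.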
Main file
Zhao2019.pdf (210.57 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03030201, version 1 (06-05-2022)

License

Attribution - NonCommercial

Identifiers

Cite

Miao Zhao, Rui-Qi Wang, Fei Yin, Xu-Yao Zhang, Lin-Lin Huang, et al. Fast Text/non-Text Image Classification with Knowledge Distillation. International Conference on Document Analysis and Recognition (ICDAR) 2019, Sep 2019, Sydney, Australia. pp.1458-1463, ⟨10.1109/ICDAR.2019.00234⟩. ⟨hal-03030201⟩

Collections

L3I UNIV-ROCHELLE
36 Views
74 Downloads