Lip Reading with Hahn Convolutional Neural Networks moments - HAL open archive
Journal article, Image and Vision Computing, Year: 2019

Lip Reading with Hahn Convolutional Neural Networks moments

Abstract

Lipreading, or visual speech recognition, is the process of decoding speech from a speaker's mouth movements. It is used to assist people with hearing impairment, to understand patients affected by laryngeal cancer or vocal cord paralysis, and in noisy environments. In this paper, we aim to develop a speech recognition system based only on video. Our main target application is in the medical field, for the assistance of laryngectomized persons. To that end, we propose the Hahn Convolutional Neural Network (HCNN), a novel architecture that uses Hahn moments as the first layer of a convolutional neural network (CNN). We show that HCNN helps reduce the dimensionality of video images and shorten training time. The HCNN model is trained to classify letters, digits, or words given as video images. We evaluated the proposed method on three datasets, AVLetters, OuluVS2, and BBC LRW, and we show that it achieves significant results in comparison with other works in the literature.
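To make the idea of a moment-based first layer concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard Hahn weight w(x) = C(α + x, x) · C(β + N − x, N − x) on x = 0..N and builds the weighted orthonormal basis numerically with a QR factorization rather than with the recurrence relations usually used in the literature. The function names (`hahn_basis`, `hahn_moments`, `reconstruct`), the parameters `alpha` and `beta`, and the frame size and moment order are all hypothetical choices for illustration.

```python
import math
import numpy as np

def hahn_basis(size, order, alpha=0.0, beta=0.0):
    """Return a (size, order) matrix whose column n is sqrt(w(x)) * p_n(x),
    where p_n is a degree-n polynomial and the columns are orthonormal.
    Assumed weight: w(x) = C(alpha + x, x) * C(beta + N - x, N - x), N = size - 1.
    """
    N = size - 1
    # Log of the weight via log-gamma, to avoid overflow for larger sizes.
    log_w = np.array([
        math.lgamma(alpha + k + 1) - math.lgamma(k + 1) - math.lgamma(alpha + 1)
        + math.lgamma(beta + N - k + 1) - math.lgamma(N - k + 1) - math.lgamma(beta + 1)
        for k in range(size)
    ])
    sqrt_w = np.exp(0.5 * (log_w - log_w.max()))
    # Monomials on a grid rescaled to [-1, 1] for better conditioning.
    t = 2.0 * np.arange(size, dtype=np.float64) / max(N, 1) - 1.0
    vander = np.vander(t, order, increasing=True)
    # QR orthonormalizes the weighted monomials (Gram-Schmidt equivalent).
    q, _ = np.linalg.qr(sqrt_w[:, None] * vander)
    return q

def hahn_moments(image, order, alpha=0.0, beta=0.0):
    """Project a 2-D image onto the first `order` basis functions per axis."""
    qy = hahn_basis(image.shape[0], order, alpha, beta)
    qx = hahn_basis(image.shape[1], order, alpha, beta)
    return qy.T @ image @ qx

def reconstruct(moments, shape, alpha=0.0, beta=0.0):
    """Approximate the image back from its moment matrix."""
    order = moments.shape[0]
    qy = hahn_basis(shape[0], order, alpha, beta)
    qx = hahn_basis(shape[1], order, alpha, beta)
    return qy @ moments @ qx.T

# Usage: a 32x32 frame is summarized by an 8x8 moment matrix.
frame = np.random.default_rng(0).random((32, 32))
features = hahn_moments(frame, order=8)  # features.shape == (8, 8)
```

In this sketch, the compact moment matrix, rather than the raw pixel grid, is what a downstream CNN would receive. This illustrates how such a layer can reduce input dimensionality. The QR construction is chosen for brevity and is only suitable for modest orders.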
Main file
Lip_Reading_with_Hahn_Convolutional_Neural_Networks.pdf (3.38 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02109397, version 1 (24-04-2019)

Identifiers

  • HAL Id: hal-02109397, version 1

Cite

Abderrahim Mesbah, Hicham Hammouchi, Aissam Berrahou, Hassan Berbia, Hassan Qjidaa, et al. Lip Reading with Hahn Convolutional Neural Networks moments. Image and Vision Computing, In press. ⟨hal-02109397⟩
224 views
528 downloads
