Lip Reading with Hahn Convolutional Neural Networks moments

Abderrahim Mesbah; Hicham Hammouchi; Aissam Berrahou; Hassan Berbia; Hassan Qjidaa; Mohamed Daoudi

Journal article, Image and Vision Computing, Year: 2019

Abderrahim Mesbah

Role: Author

Université Sidi Mohamed Ben Abdellah

Hicham Hammouchi

Role: Author

Université Sidi Mohamed Ben Abdellah

Université Mohammed V de Rabat [Agdal]

Aissam Berrahou

Role: Author

Université Mohammed V de Rabat [Agdal]

Hassan Berbia

Role: Author

Université Mohammed V de Rabat [Agdal]

Hassan Qjidaa

Role: Author

Université Sidi Mohamed Ben Abdellah

Mohamed Daoudi

Role: Author
PersonId : 170389
IdHAL : mohamed-daoudi
ORCID : 0000-0003-4219-7860
IdRef : 095749934

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Ecole nationale supérieure Mines-Télécom Lille Douai

Abstract

Lipreading, or visual speech recognition, is the process of decoding speech from a speaker's mouth movements. It is useful for people with hearing impairment, for understanding patients with laryngeal cancer or vocal cord paralysis, and in noisy environments. In this paper we aim to develop a speech recognition system based only on video. Our main targeted application is in the medical field, for the assistance of laryngectomized persons. To that end, we propose the Hahn Convolutional Neural Network (HCNN), a novel architecture that uses Hahn moments as the first layer of a convolutional neural network (CNN). We show that HCNN helps reduce the dimensionality of video images and shortens training time. The HCNN model is trained to classify letters, digits, or words given as video images. We evaluated the proposed method on three datasets, AVLetters, OuluVS2 and BBC LRW, and show that it achieves significant results in comparison with other works in the literature.
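
The dimensionality reduction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Hahn weight C(alpha+x, x) * C(beta+N-x, N-x), builds the weighted orthonormal polynomials by Gram-Schmidt rather than by the recurrence the paper may use, and all function names and parameter values (hahn_weight, hahn_basis, moments_2d, alpha, beta, the 16 x 16 frame size) are hypothetical.

```python
import math


def hahn_weight(x, n_points, alpha, beta):
    """Hahn weight C(alpha+x, x) * C(beta+N-x, N-x), with N = n_points - 1."""
    big_n = n_points - 1
    log_w = (math.lgamma(alpha + x + 1) - math.lgamma(x + 1)
             - math.lgamma(alpha + 1)
             + math.lgamma(beta + big_n - x + 1) - math.lgamma(big_n - x + 1)
             - math.lgamma(beta + 1))
    return math.exp(log_w)


def hahn_basis(n_points, order, alpha=0.0, beta=0.0):
    """Return `order` weighted orthonormal polynomials sampled on x = 0..n_points-1.

    Row k is sqrt(w(x)) * p_k(x), where p_k has degree k and the rows are
    orthonormal under the plain dot product.
    """
    sqrt_w = [math.sqrt(hahn_weight(x, n_points, alpha, beta))
              for x in range(n_points)]
    basis = []
    for degree in range(order):
        # Weighted monomial, with x rescaled to [0, 1] for numerical stability.
        vec = [sqrt_w[x] * (x / (n_points - 1)) ** degree
               for x in range(n_points)]
        for _ in range(2):  # second pass reduces rounding error
            for b in basis:
                dot = sum(v * u for v, u in zip(vec, b))
                vec = [v - dot * u for v, u in zip(vec, b)]
        norm = math.sqrt(sum(v * v for v in vec))
        basis.append([v / norm for v in vec])
    return basis


def moments_2d(image, row_basis, col_basis):
    """Project an image onto the separable basis: M = B_row * image * B_col^T."""
    n_rows, n_cols = len(image), len(image[0])
    tmp = [[sum(b[i] * image[i][j] for i in range(n_rows))
            for j in range(n_cols)] for b in row_basis]
    return [[sum(row[j] * c[j] for j in range(n_cols)) for c in col_basis]
            for row in tmp]


# Usage: a synthetic 16 x 16 frame is summarized by a 4 x 4 block of moments.
frame = [[math.sin(0.3 * i) + math.cos(0.2 * j) for j in range(16)]
         for i in range(16)]
basis = hahn_basis(16, 4, alpha=2.0, beta=3.0)
moments = moments_2d(frame, basis, basis)
print(len(frame) * len(frame[0]), "pixels ->",
      len(moments) * len(moments[0]), "moments")
```

In this sketch the low-order moments act as a compact description of each frame, which is the kind of input a later convolutional stage could consume; how HCNN actually selects the moment order and parameters is described in the paper itself.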

Keywords

Visual speech recognition; Lipreading; Laryngectomy; Deep learning

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

Lip_Reading_with_Hahn_Convolutional_Neural_Networks.pdf (3.38 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-02109397

Submitted on: Wednesday, April 24, 2019, 20:50:02

Last modified on: Wednesday, January 24, 2024, 09:54:23

Dates and versions

hal-02109397 , version 1 (24-04-2019)

Identifiers

HAL Id : hal-02109397 , version 1

Cite

Abderrahim Mesbah, Hicham Hammouchi, Aissam Berrahou, Hassan Berbia, Hassan Qjidaa, et al. Lip Reading with Hahn Convolutional Neural Networks moments. Image and Vision Computing, In press. ⟨hal-02109397⟩

Collections

INSTITUT-TELECOM CNRS CRISTAL CRISTAL-3D-SAM UNIV-LILLE IMT-NORD-EUROPE CERI-SN
