Inner Lips Features Extraction based on CLNF with Hybrid Dynamic Template for Cued Speech

Li Liu; Gang Feng; Denis Beautemps

doi:10.1186/s13640-017-0233-y

Journal article, EURASIP Journal on Image and Video Processing, Year: 2017

Li Liu

Role: Author
PersonId : 17735
IdHAL : li-liu
ORCID : 0000-0002-4497-0135

GIPSA - Cognitive Robotics, Interactive Systems, & Speech Processing

Gang Feng

Role: Author
PersonId : 882696

GIPSA - Voix Systèmes Linguistiques et Dialectologie

Denis Beautemps

Role: Author
PersonId : 18206
IdHAL : denis-beautemps
ORCID : 0000-0001-9625-3018
IdRef : 099427524

GIPSA - Cognitive Robotics, Interactive Systems, & Speech Processing

Abstract

Earlier studies of French Cued Speech (CS) commonly painted the speaker's lips blue to make lip feature extraction easier. To remove this artifice, this paper presents a novel automatic method for extracting the inner lips contour of CS speakers. The method builds on a recent facial contour extraction model from computer vision, the Constrained Local Neural Field (CLNF), which provides eight characteristic landmarks describing the inner lips contour. Applied directly to our CS data, however, CLNF fails in about 41.4% of cases. We therefore propose two methods, one to correct the B parameter (aperture of the inner lips) and one to correct the A parameter (width of the inner lips). To correct the B parameter, we propose a hybrid dynamic correlation template method (HD-CTM) that uses the first derivative of the smoothed luminance variation. HD-CTM first detects the position of the outer lower lip; the position of the inner lower lip is then obtained by subtracting the validated lower lip thickness (VLLT). To correct the A parameter, we explore a periodical spline interpolation with a geometrical deformation of six CLNF inner lips landmarks. Combined with an automatic round lips detector, this method efficiently corrects the A parameter for round lips (the third vowel viseme, made up of French vowels with a small opening). HD-CTM is evaluated on 4800 images of three French speakers: it corrects about 95% of the CLNF errors on the B parameter and achieves a total RMSE of one pixel (i.e., 0.05 cm on average). The periodical spline interpolation method is tested on 927 round lips images. It significantly reduces the total error of CLNF, to a level comparable to the state of the art. Moreover, after this correction the third viseme is properly distributed in the (A, B) parameter plane.
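The B-parameter correction relies on the first derivative of a smoothed luminance profile followed by a thickness subtraction. The following is a minimal sketch of that idea only, not the HD-CTM itself: the correlation templates are not described in this record and are omitted. The moving-average filter, the window width, the function names, and the synthetic profile are all assumptions made for illustration.

```python
import numpy as np


def smooth(profile, width=5):
    """Moving-average smoothing of a 1-D luminance profile.

    Assumption: the abstract does not name the smoothing filter;
    a moving average with an odd window is used here for simplicity.
    """
    kernel = np.ones(width) / width
    padded = np.pad(np.asarray(profile, dtype=float), width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def luminance_derivative(profile, width=5):
    """First derivative of the smoothed luminance along the profile."""
    return np.gradient(smooth(profile, width))


def lower_lip_rows(profile, lip_thickness_px, width=5):
    """Estimate outer and inner lower-lip rows on a vertical profile.

    Rows are indexed from top to bottom, so the inner boundary lies
    above the outer one (smaller row index). The outer boundary is taken
    as the strongest dark-to-bright transition (hypothetical criterion).
    """
    derivative = luminance_derivative(profile, width)
    outer_row = int(np.argmax(derivative))
    inner_row = outer_row - lip_thickness_px
    return outer_row, inner_row


# Synthetic vertical profile (illustrative values, not data from the paper):
# dark mouth interior, then lower lip, then brighter skin.
profile = np.concatenate([
    np.full(20, 40.0),   # rows 0-19: mouth interior
    np.full(10, 90.0),   # rows 20-29: lower lip
    np.full(30, 160.0),  # rows 30-59: skin below the lip
])

outer, inner = lower_lip_rows(profile, lip_thickness_px=10)
print("outer lower lip row:", outer)
print("inner lower lip row:", inner)
```

On a real image the profile would be sampled along a vertical line through the mouth, and the lip thickness would come from a validated measurement (the VLLT of the abstract) rather than a fixed constant.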

Keywords

CLNF; Luminance variation; HD-CTM; Periodical spline interpolation; Inner lips contour parameters; Cued Speech; Visemes

Domains

Information and communication sciences

https://hal.science/hal-01663450

Submitted on: Wednesday, 13 December 2017, 20:50:42

Last modified on: Thursday, 4 April 2024, 21:14:19

Dates and versions

hal-01663450 , version 1 (13-12-2017)

Identifiers

HAL Id : hal-01663450 , version 1
DOI : 10.1186/s13640-017-0233-y

Cite

Li Liu, Gang Feng, Denis Beautemps. Inner Lips Features Extraction based on CLNF with Hybrid Dynamic Template for Cued Speech. EURASIP Journal on Image and Video Processing, 2017, 88, ⟨10.1186/s13640-017-0233-y⟩. ⟨hal-01663450⟩

Collections

UGA CNRS GIPSA GIPSA-DPC GIPSA-VSLD
