Inner Lips Features Extraction based on CLNF with Hybrid Dynamic Template for Cued Speech

Li Liu 1, Gang Feng 2, Denis Beautemps 1
1 GIPSA-CRISSP - CRISSP
GIPSA-DPC - Département Parole et Cognition
2 GIPSA-VSLD - VSLD
GIPSA-DPC - Département Parole et Cognition
Abstract : Previous French Cued Speech (CS) studies have commonly painted the speaker’s lips blue to make lip feature extraction easier. To remove this artifice, this paper presents a novel automatic method for extracting the inner lips contour of CS speakers. The method builds on a recent facial contour extraction model from computer vision, the Constrained Local Neural Field (CLNF), which provides eight characteristic landmarks describing the inner lips contour. Applied directly to our CS data, however, CLNF fails in about 41.4% of cases. We therefore propose two methods that correct the B parameter (aperture of the inner lips) and the A parameter (width of the inner lips), respectively. To correct the B parameter, we propose a hybrid dynamic correlation template method (HD-CTM) that uses the first derivative of the smoothed luminance variation. HD-CTM first detects the outer lower lip position; the inner lower lip position is then obtained by subtracting the validated lower lip thickness (VLLT). To correct the A parameter, we explore a periodical spline interpolation with a geometrical deformation of six CLNF inner lips landmarks. Combined with an automatic round lips detector, this method corrects the A parameter efficiently for round lips (the third vowel viseme, made up of French vowels with a small opening). HD-CTM is evaluated on 4800 images of three French speakers: it corrects about 95% of the CLNF errors on the B parameter and achieves a total RMSE of one pixel (i.e., 0.05 cm on average). The periodical spline interpolation method is tested on 927 round lips images. It significantly reduces the total error of CLNF, to a level comparable to the state of the art. Moreover, after applying this method the third viseme is properly distributed in the (A, B) parameter plane.
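The B-parameter correction described in the abstract can be illustrated with a deliberately simplified sketch. It is not the HD-CTM itself: the correlation template is replaced here by a plain search for the strongest peak in the first derivative of a smoothed luminance profile, and it assumes that the lower lip is darker than the skin beneath it. All function names, the window size, and the sample values are hypothetical and chosen only for illustration.

```python
import math


def smooth(profile, window=3):
    """Centered moving average; near the edges, average the samples available."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def first_derivative(values):
    """Forward difference: d[i] = values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]


def outer_lower_lip_row(profile, window=3):
    """Row just below the sharpest dark-to-bright transition, scanning downward.

    Assumption: the lip is darker than the skin below it, so the outer
    lower lip edge shows up as the largest positive luminance derivative.
    """
    d = first_derivative(smooth(profile, window))
    return max(range(len(d)), key=lambda i: d[i]) + 1


def inner_lower_lip_row(outer_row, lip_thickness_px):
    """Inner lower lip position = outer position minus the lip thickness."""
    return outer_row - lip_thickness_px


def rmse(estimates, references):
    """Root mean square error between two equally long sequences."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical vertical luminance profile: dark lip rows, then brighter skin.
profile = [40] * 8 + [60, 180] + [200] * 8
outer = outer_lower_lip_row(profile)
inner = inner_lower_lip_row(outer, lip_thickness_px=4)
```

With the abstract's average scale of 0.05 cm per pixel, an RMSE returned by `rmse` in pixels can be converted to centimetres by multiplying by 0.05.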
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-01663450
Contributor : Denis Beautemps
Submitted on : Wednesday, December 13, 2017 - 8:50:42 PM
Last modification on : Saturday, April 13, 2019 - 1:21:52 AM

Citation

Li Liu, Gang Feng, Denis Beautemps. Inner Lips Features Extraction based on CLNF with Hybrid Dynamic Template for Cued Speech. EURASIP Journal on Image and Video Processing, Springer, 2017, ⟨10.1186/s13640-017-0233-y⟩. ⟨hal-01663450⟩
