Journal article. EURASIP Journal on Image and Video Processing. Year: 2017

Inner Lips Features Extraction based on CLNF with Hybrid Dynamic Template for Cued Speech

Abstract

Previous studies of French Cued Speech (CS) commonly painted the speaker's lips blue to make lip feature extraction easier. To remove the need for this artifice, this paper presents a novel automatic method for extracting the inner lips contour of CS speakers. The method builds on a recent facial contour extraction model from computer vision, the Constrained Local Neural Field (CLNF), which provides eight characteristic landmarks describing the inner lips contour. Applied directly to our CS data, however, CLNF fails in about 41.4% of cases. We therefore propose two methods that correct the B parameter (aperture of the inner lips) and the A parameter (width of the inner lips), respectively. To correct the B parameter, we propose a hybrid dynamic correlation template method (HD-CTM) that uses the first derivative of the smoothed luminance variation. HD-CTM first detects the position of the outer lower lip; the position of the inner lower lip is then obtained by subtracting the validated lower lip thickness (VLLT). To correct the A parameter, we explore a periodical spline interpolation with a geometrical deformation of six CLNF inner lips landmarks. Combined with an automatic round lips detector, this method is effective at correcting the A parameter for round lips (the third vowel viseme, made up of French vowels with a small opening). HD-CTM is evaluated on 4800 images of three French speakers: it corrects about 95% of the CLNF errors on the B parameter and achieves a total RMSE of one pixel (i.e., 0.05 cm on average). The periodical spline interpolation method is tested on 927 round lips images. It significantly reduces the total CLNF error, to a level comparable to the state of the art. Moreover, after applying this method, the third viseme is properly distributed in the (A, B) parameter plane.
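To make the B-parameter correction more concrete, the sketch below illustrates the two ideas the abstract names: locating the outer lower lip from the first derivative of a smoothed luminance profile, then subtracting a lip thickness to reach the inner lower lip. It is not the authors' HD-CTM: the correlation-template step is omitted, and the function names, the moving-average smoother, and the assumption that the skin below the lip is brighter than the lip itself are all hypothetical choices made for illustration. Only the 0.05 cm per pixel figure comes from the abstract.

```python
from math import sqrt

# From the abstract: one pixel is about 0.05 cm on average.
CM_PER_PIXEL = 0.05


def smooth(profile, half_window=2):
    """Moving-average smoothing of a 1-D luminance profile (hypothetical smoother)."""
    n = len(profile)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(profile):
    """Central finite difference, one-sided at both ends."""
    n = len(profile)
    derivative = []
    for i in range(n):
        lo = max(0, i - 1)
        hi = min(n - 1, i + 1)
        derivative.append((profile[hi] - profile[lo]) / (hi - lo))
    return derivative


def outer_lower_lip_row(profile, half_window=2):
    """Row with the steepest luminance increase going down the profile.

    Assumption: rows are ordered top to bottom and the skin below the lower
    lip is brighter than the lip, so the lip/skin edge is the largest
    positive derivative.
    """
    d = first_derivative(smooth(profile, half_window))
    return max(range(len(d)), key=lambda i: d[i])


def inner_lower_lip_row(outer_row, lip_thickness):
    """Inner lower lip = outer lower lip minus the lip thickness (in pixels).

    Image rows grow downward, so subtracting moves up toward the mouth.
    """
    return outer_row - lip_thickness


def rmse(estimates, references):
    """Root-mean-square error between two equally long sequences."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Synthetic vertical profile: dark mouth, mid-grey lip, brighter skin.
    profile = [40] * 10 + [90] * 10 + [160] * 10
    outer = outer_lower_lip_row(profile)
    inner = inner_lower_lip_row(outer, lip_thickness=9)
    print("outer lower lip row:", outer)
    print("inner lower lip row:", inner)
    print("RMSE in cm for a 1-pixel RMSE:", 1.0 * CM_PER_PIXEL)
```

In this toy profile the lip thickness is passed in as a known value; in the paper it is a validated quantity (VLLT), and how it is validated is not described in the abstract.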

Dates and versions

hal-01663450, version 1 (13-12-2017)

Cite

Li Liu, Gang Feng, Denis Beautemps. Inner Lips Features Extraction based on CLNF with Hybrid Dynamic Template for Cued Speech. EURASIP Journal on Image and Video Processing, 2017, 88, ⟨10.1186/s13640-017-0233-y⟩. ⟨hal-01663450⟩