On the use of voice descriptors for glottal source shape parameter estimation

Stefan Huber; Axel Röbel

doi:10.1016/j.csl.2013.09.006

Journal article, Computer Speech and Language, Year: 2014

Stefan Huber

Role: Author
PersonId : 935257

Institut de Recherche et Coordination Acoustique/Musique

Analyse et synthèse sonores [Paris]

Axel Röbel

Role: Corresponding author
PersonId : 4527
IdHAL : axel-roebel
ORCID : 0000-0001-6136-4391
IdRef : 227186079

Sound Analysis/Synthesis Team

Analyse et synthèse sonores [Paris]

Abstract

This paper summarizes the results of our investigations into estimating the shape of the glottal excitation source from speech signals. We employ the Liljencrants-Fant (LF) model, which describes the glottal flow and its derivative. The one-dimensional glottal source shape parameter Rd describes the transition in voice quality from a tense to a breathy voice. It has been derived from a statistical regression of the R waveshape parameters that parameterize the LF model. First, we introduce a variant of our recently proposed adaptation and range extension of the Rd parameter regression. Second, we discuss in detail the aspects of estimating Rd using the phase minimization paradigm. Based on the analysis of a large number of speech signals, we describe the major conditions that are likely to result in erroneous Rd estimates. Based on these findings, we investigate means to increase the robustness of the Rd parameter estimation. We use Viterbi smoothing to suppress unnatural jumps of the estimated Rd parameter contours within short time segments. Additionally, we propose to steer the Viterbi algorithm by exploiting the covariation of other voice descriptors. The novel Viterbi steering is based on a Gaussian Mixture Model (GMM) that represents the joint density of the voice descriptors and the Open Quotient (OQ) estimated from corresponding electroglottographic (EGG) signals. A conversion function derived from the mixture model predicts OQ from the voice descriptors. Converted to Rd, this prediction defines an additional prior probability that adapts the partial probabilities of the Viterbi algorithm accordingly. Finally, we evaluate the performance of the phase minimization based methods, using both variants to adapt and extend the Rd regression, on one synthetic test set, as well as in combination with Viterbi smoothing and each variant of the novel Viterbi steering on one test set of natural speech. The experimental findings exhibit improvements for both Viterbi approaches.
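
The Viterbi smoothing idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works with additive costs rather than probabilities, and the quadratic jump penalty, the candidate grid `rd_grid`, the toy error values in `frame_cost`, and the optional `prior_cost` argument (a stand-in for a steering prior such as one derived from a predicted Rd) are all assumptions made for illustration.

```python
import numpy as np

def viterbi_smooth(frame_cost, grid, jump_weight=1.0, prior_cost=None):
    """Pick one grid value per frame, trading per-frame cost against jumps.

    frame_cost : (T, K) array, cost of candidate k at frame t
                 (hypothetical stand-in for a per-frame estimation error).
    grid       : (K,) array of candidate Rd values.
    prior_cost : optional (T, K) array added to frame_cost
                 (hypothetical stand-in for a steering prior).
    """
    local = np.asarray(frame_cost, dtype=float)
    if prior_cost is not None:
        local = local + np.asarray(prior_cost, dtype=float)
    grid = np.asarray(grid, dtype=float)
    T, K = local.shape

    # Assumed transition cost: squared distance between consecutive values.
    trans = jump_weight * (grid[:, None] - grid[None, :]) ** 2

    acc = np.empty((T, K))             # best accumulated cost per state
    back = np.zeros((T, K), dtype=int)  # best predecessor per state
    acc[0] = local[0]
    for t in range(1, T):
        total = acc[t - 1][:, None] + trans  # rows: previous, cols: current
        back[t] = np.argmin(total, axis=0)
        acc[t] = total[back[t], np.arange(K)] + local[t]

    # Backtrack from the cheapest final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmin(acc[-1])
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return grid[path]

# Toy data: five frames whose cost favors Rd = 1.0, except frame 2,
# where a spurious minimum appears at Rd = 2.5.
rd_grid = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
frame_cost = np.tile((rd_grid - 1.0) ** 2, (5, 1))
frame_cost[2] = [0.3, 0.1, 0.3, 0.3, 0.0]

framewise = rd_grid[np.argmin(frame_cost, axis=1)]  # independent per-frame pick
smoothed = viterbi_smooth(frame_cost, rd_grid)      # path with jump penalty
print(framewise)
print(smoothed)
```

In this toy case the frame-by-frame minimum jumps to 2.5 at the third frame, while the smoothed path stays at 1.0 because the jump penalty outweighs the small gain in per-frame cost. Passing a `prior_cost` array would shift that balance in the same additive way.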

Keywords

Glottal source; LF model; Viterbi smoothing; Rd shape parameter; Voice quality

Domains

Signal and Image Processing [eess.SP]

Main file

Huber 2013 - On the use of voice descriptors for glottal source shape parameter estimation - CSL in press.pdf (2.32 MB)

Origin: Explicit agreement for this deposit

Contributor: Stefan Huber

https://hal.science/hal-00865343

Submitted on: Tuesday, 21 May 2019, 08:41:35

Last modified on: Friday, 24 March 2023, 14:53:02

Long-term archiving on: Monday, 30 September 2019, 16:45:37

Dates and versions

hal-00865343 , version 1 (21-05-2019)

Identifiers

HAL Id: hal-00865343, version 1
DOI: 10.1016/j.csl.2013.09.006

Cite

Stefan Huber, Axel Röbel. On the use of voice descriptors for glottal source shape parameter estimation. Computer Speech and Language, In press, 0885-2308, www.sciencedirect.com/science/article/pii/S0885230813000776. ⟨10.1016/j.csl.2013.09.006⟩. ⟨hal-00865343⟩

Collections

UPMC CNRS IRCAM STMS SORBONNE-UNIVERSITE SU-SCIENCES

174 views

104 downloads
