Long-term modelling of parameters trajectories for the harmonic plus noise model of speech signals

Faten Ben Ali; Laurent Girin; Sonia Djaziri-Larbi

Conference paper, Year: 2010

Faten Ben Ali

Role: Author
PersonId: 881860

Unité Signaux et Systèmes

Grenoble Images Parole Signal Automatique

Laurent Girin

Role: Author
PersonId: 3682
IdHAL: laurent-girin
ORCID: 0000-0002-9214-8760
IdRef: 088998037

GIPSA - Machines parlantes, Gestes oro-faciaux, Interaction Face-à-face, Communication augmentée

Sonia Djaziri-Larbi

Role: Author
PersonId: 183460
IdHAL: sonia-djaziri-larbi
ORCID: 0000-0003-3889-5981

Unité Signaux et Systèmes

Abstract

The harmonic plus noise model (HNM) is widely used for spectral modelling of sounds that combine harmonic and noise components, such as speech signals and the signals produced by a range of musical instruments. A simplified and efficient version of the HNM, developed by Stylianou et al., splits the signal spectrum into two bands: a harmonic band at low frequencies and a noise-like band at high frequencies, separated by a time-varying cut-off frequency. In this study, we propose to model the time trajectories of the parameters of this HNM for non-stationary signals, with a particular focus on speech. The trajectories are modelled over time intervals of up to several hundred milliseconds, significantly longer than the short-term frames usually used in analysis/synthesis models and in speech coders. The goal is to capture and exploit the long-term correlation of spectral components, which appears across the spectral parameters extracted from consecutive short-term frames. Previous work by Firouzmand et al. addressed long-term parametric modelling in the more general framework of the sinusoidal model (i.e. long-term modelling of amplitude and phase parameters). We propose to extend that work to the HNM framework in order to obtain a complete long-term HNM. In this case, the parameters to be modelled on a long-term basis are the spectral envelope (which spans both the harmonic and noise regions), the fundamental frequency (which characterizes the harmonic region) and the cut-off frequency (which separates the harmonic and noise bands). To do this, the speech signal is first segmented into voiced (actually mixed voiced/unvoiced) sections and unvoiced sections, and a discrete cosine model is then used to represent the time trajectory of each HNM parameter over an entire section. The proposed long-term HNM can be used for music and speech analysis/synthesis. It provides a compact joint representation of signals (and thus promising potential for low bit-rate coding) and allows easy signal manipulation directly from the long-term parameters (e.g. time stretching by direct interpolation). We present several experiments to demonstrate the efficiency of this model. For instance, the proposed long-term HNM is compared with the short-term version in terms of listening quality and data rate.
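As a rough illustration of the discrete cosine modelling and time-stretching ideas mentioned in the abstract, the sketch below fits a low-order cosine series to one parameter trajectory and then resamples it on a longer time grid. It is not the authors' implementation: the basis definition (cosines on normalized time), the model order, the least-squares fit, the function names `dcm_basis`, `dcm_fit` and `dcm_eval`, and the synthetic F0 values are all assumptions made for this example.

```python
import numpy as np


def dcm_basis(num_frames, order):
    """Cosine basis of shape (num_frames, order + 1) on normalized time [0, 1].

    Assumed form for this sketch: column p holds cos(pi * p * t).
    """
    t = np.linspace(0.0, 1.0, num_frames)
    p = np.arange(order + 1)
    return np.cos(np.pi * np.outer(t, p))


def dcm_fit(trajectory, order):
    """Least-squares cosine coefficients for one parameter trajectory."""
    values = np.asarray(trajectory, dtype=float)
    basis = dcm_basis(len(values), order)
    coeffs, *_ = np.linalg.lstsq(basis, values, rcond=None)
    return coeffs


def dcm_eval(coeffs, num_frames):
    """Evaluate the fitted model on a grid of num_frames points."""
    return dcm_basis(num_frames, len(coeffs) - 1) @ coeffs


# Hypothetical F0 trajectory (Hz) over one voiced section of 60 short-term frames.
frames = 60
t = np.linspace(0.0, 1.0, frames)
f0 = 120.0 + 25.0 * np.cos(np.pi * t) - 10.0 * np.cos(2.0 * np.pi * t)

# Long-term model: 5 coefficients stand in for 60 frame-wise values.
coeffs = dcm_fit(f0, order=4)
f0_model = dcm_eval(coeffs, frames)
max_error = float(np.max(np.abs(f0 - f0_model)))

# Time stretching by a factor of 2: evaluate the same model on twice as many frames.
f0_stretched = dcm_eval(coeffs, 2 * frames)
```

In this sketch the same coefficient vector describes the trajectory at any frame count, which is one way to read "time stretching by direct interpolation"; a real system would also need to handle the spectral envelope and cut-off frequency trajectories and choose the model order per section.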

Keywords

Speech analysis; speech modelling; parametric models of speech signals

Domains

Signal and image processing [eess.SP]

Contributor: Laurent Girin

https://hal.science/hal-00534497

Submitted on: Tuesday, 9 November 2010, 18:44:25

Last modified on: Thursday, 4 April 2024, 18:26:30

Dates and versions

hal-00534497, version 1 (09-11-2010)

Identifiers

HAL Id: hal-00534497, version 1

Cite

Faten Ben Ali, Laurent Girin, Sonia Djaziri-Larbi. Long-term modelling of parameters trajectories for the harmonic plus noise model of speech signals. ICA 2010 - 20th International Congress on Acoustics, Aug 2010, Sydney, Australia. pp.ICA2010. ⟨hal-00534497⟩

Collections

UGA CNRS GIPSA GIPSA-DPC GIPSA-MAGIC
