| The harmonic plus noise model (HNM) is widely used for spectral modelling of sounds that combine harmonic and noise components, such as speech and the signals produced by a range of musical instruments. A simplified and efficient version of the HNM, developed by Stylianou et al., splits the signal spectrum into two bands: a harmonic part at low frequencies and a noise-like part at high frequencies, separated by a time-varying cut-off frequency. In this study, we propose to model the time trajectories of the parameters of this HNM for non-stationary signals, focusing on speech. The trajectories are modelled over time intervals of up to several hundred milliseconds, significantly longer than the short-term frames usually used in analysis/synthesis models and speech coders. The goal is to capture and exploit the long-term correlation between spectral parameters extracted from consecutive short-term frames. Previous work by Firouzmand et al. dealt with long-term parametric modelling in the more general framework of the sinusoidal model (i.e. long-term modelling of amplitude and phase parameters). We propose to extend this work to the HNM framework in order to obtain a complete long-term HNM. In this case, the parameters to be modelled on a long-term basis are the spectral envelope (which covers both the harmonic and noise regions), the fundamental frequency (which characterizes the harmonic region) and the cut-off frequency (which separates the harmonic and noise bands). To do this, the speech signal is first segmented into voiced (actually mixed voiced/unvoiced) sections and unvoiced sections, and a discrete cosine model is used to represent the time trajectory of each HNM parameter over each entire section. The proposed long-term HNM can be used for music and speech analysis/synthesis. 
It enables both a compact representation of signals (and thus has promising potential for low bit-rate coding) and easy signal manipulation directly from the long-term parameters (e.g. time stretching by direct interpolation). We present several experiments to demonstrate the efficiency of this model. For instance, the proposed long-term HNM is compared to the short-term version in terms of listening quality and data rate. |
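
As an illustration of the discrete cosine modelling of a parameter trajectory and of time stretching by direct interpolation, the following is a minimal sketch in Python. It assumes a cosine basis of the form cos(pπt) on normalized time t in [0, 1] and a plain least-squares fit; the exact basis, the model order, and the function names (`dcm_basis`, `fit_dcm`, `eval_dcm`) are hypothetical choices made for this example and are not taken from the paper. The fundamental-frequency trajectory is synthetic.

```python
import numpy as np

def dcm_basis(num_frames, order):
    """Cosine basis sampled at frame centres on normalized time [0, 1].

    Assumed form: column p holds cos(p * pi * t_n), t_n = (n + 0.5) / num_frames.
    """
    t = (np.arange(num_frames) + 0.5) / num_frames
    return np.cos(np.pi * np.outer(t, np.arange(order + 1)))

def fit_dcm(trajectory, order):
    """Least-squares fit of order + 1 cosine coefficients to one trajectory."""
    trajectory = np.asarray(trajectory, dtype=float)
    basis = dcm_basis(len(trajectory), order)
    coeffs, *_ = np.linalg.lstsq(basis, trajectory, rcond=None)
    return coeffs

def eval_dcm(coeffs, num_frames):
    """Resynthesize the trajectory on a grid of num_frames frames."""
    return dcm_basis(num_frames, len(coeffs) - 1) @ coeffs

# Synthetic F0 trajectory (Hz) over one voiced section of 60 short-term frames.
frames = 60
t = (np.arange(frames) + 0.5) / frames
f0 = 120.0 + 20.0 * np.cos(np.pi * t) - 5.0 * np.cos(3.0 * np.pi * t)

# 60 frame values are summarized by 6 long-term coefficients.
coeffs = fit_dcm(f0, order=5)
f0_hat = eval_dcm(coeffs, frames)

# Time stretching by a factor of 2: evaluate the same model on a denser grid.
f0_stretched = eval_dcm(coeffs, 2 * frames)
```

In this sketch the whole section is described by `order + 1` coefficients regardless of its length, which is the source of the compactness, and stretching requires no re-analysis because the continuous model is simply resampled.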