Speech/music discrimination based on wavelets for broadcast programs

Emmanuel Didiot 1 Irina Illina 1 Odile Mella 1 Dominique Fohr 1 Jean-Paul Haton 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Speech/music discrimination is a challenging research problem that significantly impacts Automatic Speech Recognition (ASR) performance. This paper proposes new features for the speech/music discrimination task. We propose a wavelet-based decomposition of the audio signal, which allows a good analysis of non-stationary signals such as speech or music. We compute different energy types in each frequency band obtained from the wavelet decomposition. Two class/non-class classifiers are used: one for speech/non-speech and one for music/non-music. On the broadcast test corpus, the proposed wavelet approach gives better results than the MFCC one. For instance, we obtain a significant relative improvement of 39% in the error rate for the speech/music discrimination task.
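As an illustration of the idea of per-band wavelet energies, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not name the wavelet family, the number of decomposition levels, or the energy types used, so the Haar wavelet, three levels, and the log mean energy below are assumptions chosen for brevity, and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail). A trailing odd sample is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def band_energies(frame, levels=3):
    """Log mean energy per wavelet band for one audio frame.

    Output order: detail bands from finest (highest frequencies) to
    coarsest, followed by the final approximation band.
    The Haar wavelet and the log mean energy are assumptions made for
    this sketch, not choices taken from the paper.
    """
    if len(frame) < 2 ** levels:
        raise ValueError("frame too short for the requested number of levels")
    bands = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    eps = 1e-12  # avoids log(0) on silent bands
    return [math.log(sum(c * c for c in b) / len(b) + eps) for b in bands]


# Hypothetical usage: a rapidly alternating frame puts its energy in the
# finest detail band, while a constant frame puts it in the approximation.
fast = band_energies([1.0, -1.0] * 4)
flat = band_energies([1.0] * 8)
```

A feature vector of this kind, computed frame by frame, could then be fed to each of the two class/non-class classifiers mentioned in the abstract.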
Document type :
Conference papers

Contributor : Emmanuel Didiot
Submitted on : Wednesday, October 4, 2006 - 4:30:25 PM
Last modification on : Thursday, January 11, 2018 - 6:19:55 AM


  • HAL Id : hal-00103554, version 1



Emmanuel Didiot, Irina Illina, Odile Mella, Dominique Fohr, Jean-Paul Haton. Speech/music discrimination based on wavelets for broadcast programs. 2006, pp.151. ⟨hal-00103554⟩


