Speech/music discrimination based on wavelets for broadcast programs

Emmanuel Didiot (1), Irina Illina (1), Odile Mella (1), Dominique Fohr (1), Jean-Paul Haton (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Speech/music discrimination is a challenging research problem with a significant impact on Automatic Speech Recognition (ASR) performance. This paper proposes new features for the speech/music discrimination task. We decompose the audio signal using wavelets, which are well suited to analysing non-stationary signals such as speech and music. We compute different types of energy in each frequency band obtained from the wavelet decomposition. Two class/non-class classifiers are used: one for speech/non-speech and one for music/non-music. On the broadcast test corpus, the proposed wavelet approach gives better results than the MFCC-based one. For instance, we obtain a significant relative improvement of 39% in the error rate for the speech/music discrimination task.
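The idea of per-band wavelet energies can be illustrated with a minimal sketch. It is not the authors' feature extractor: the abstract does not name the wavelet family, the number of decomposition levels, or the exact energy definitions, so the Haar wavelet, the three levels, and the mean-energy measure below are all assumptions chosen for brevity, and the function names `haar_step` and `band_energies` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def band_energies(frame, levels=3):
    """Mean energy of each wavelet band for one frame of samples.

    Returns [detail_1, ..., detail_levels, final_approximation], ordered
    from the highest-frequency band down to the lowest.
    Assumption: Haar wavelet and mean energy; the paper may use others.
    """
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a multiple of 2**levels")
    energies = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(c * c for c in detail) / len(detail))
    energies.append(sum(c * c for c in approx) / len(approx))
    return energies


if __name__ == "__main__":
    # Toy frames, not real audio: a fast alternation and a constant signal.
    print(band_energies([1.0, -1.0] * 16))
    print(band_energies([1.0] * 32))
```

In this toy example the rapidly alternating frame puts all of its energy in the first (highest-frequency) detail band, while the constant frame puts all of its energy in the final approximation band. A vector of such band energies per frame is the kind of feature that could then be passed to a speech/non-speech or music/non-music classifier.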
Document type:
Conference paper
2006, INSTICC PRESS, pp.151, 2006

https://hal.archives-ouvertes.fr/hal-00103554
Contributor: Emmanuel Didiot
Submitted on: Wednesday, 4 October 2006 - 16:30:25
Last modified on: Thursday, 11 January 2018 - 06:19:55

Identifiers

  • HAL Id : hal-00103554, version 1

Citation

Emmanuel Didiot, Irina Illina, Odile Mella, Dominique Fohr, Jean-Paul Haton. Speech/music discrimination based on wavelets for broadcast programs. 2006, INSTICC PRESS, pp.151, 2006. 〈hal-00103554〉
